From A Search Engine To A Personalized News Aggregator: The Evolution Of IIIT-H’s First Startup

A business incubator typically helps fledgling startups grow and develop their business. But here’s the story of how the first startup that germinated on the IIIT-H campus laid the foundation for CIE, the startup incubator. On the occasion of IIIT-H’s 20th anniversary and CIE’s 10th, Prof. Vasudeva Varma, Dean (R&D) and CEO of IIIT-H Foundation, talks about this faculty-backed startup, the very first of many that quietly hummed into existence.

It all started at the R&D Showcase event of 2007. As the industry liaison officer, Prof. Vasudeva Varma (or Vasu, as he’s affectionately called) was the coordinator of the event, at which research from the IIIT-H stable is typically curated and showcased to industry participants. At the time, along with his first PhD student, Prasad, Professor Varma was involved in the development of a search engine for Indian languages. “Christened Web Khoj, it was a language-independent system that could support many languages.” Rediff, an invited participant, was suitably impressed by it and evinced interest in licensing the technology. Since guidelines forbade a direct contract with an academic institution, the first ever company was created on campus, bringing together Prasad, the professor and IIIT-H. Prof. Varma remarks that with zero investment, they created the base tech for Indian language search. This made the need for an incubation centre apparent, and CIE was formally founded.

With funding in place, Web Khoj quickly scaled up to 200 languages. Prof. Varma speaks of how their tech had an edge over Google’s search engine. “Most Indian languages as well as other global languages such as Arabic, Turkish, Finnish, Dutch and others are ‘agglutinating languages’. This means that words are made up of a string of other words or a combination of words.” To illustrate, the professor says that “1000-pillared-temple” is considered a single word in Telugu. And even the most popular search engine in the world, Google, fails to handle such languages effectively.
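To see why agglutination trips up plain keyword matching, consider a minimal sketch in Python. It is not Web Khoj’s actual method, which the professor does not describe; the toy lexicon, the English stand-in compound and the helper names `segment` and `index_terms` are all assumptions made for illustration. The idea is simply that a search for “temple” misses a document in which the word appears only inside a longer compound, unless the compound is first broken into its parts.

```python
def segment(word, lexicon):
    """Split `word` into lexicon entries; return [word] if no full split exists."""
    n = len(word)
    # best[i] holds one segmentation of word[:i], or None if none was found
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and word[j:i] in lexicon:
                best[i] = best[j] + [word[j:i]]
                break
    return best[n] if best[n] is not None else [word]


def index_terms(text, lexicon, split_compounds):
    """Collect searchable terms from `text`, optionally adding compound parts."""
    terms = set()
    for token in text.lower().split():
        terms.add(token)
        if split_compounds:
            terms.update(segment(token, lexicon))
    return terms


# Hypothetical toy data: an English stand-in for a single agglutinated word
LEXICON = {"thousand", "pillared", "temple"}
DOC = "Visit the thousandpillaredtemple"

plain = index_terms(DOC, LEXICON, split_compounds=False)
split = index_terms(DOC, LEXICON, split_compounds=True)

print("temple" in plain)  # whole-word index only
print("temple" in split)  # index that also stores compound parts
```

A real system for Telugu or Turkish would need proper morphological analysis rather than a fixed word list, but the sketch shows where whole-word indexing falls short.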

Semantic Search

A further refinement of their technology led to the creation of semantic search, in which the search engine would understand not only the keywords in a query but also their meanings and, by extension, the meaning of the entire sentence. “For example, Google would fail if you type in a search such as ‘all companies listed in NYSE with annual revenues of > $500mn’,” remarks Prof. Varma. This evolution in technology led to the creation of SETU, an acronym for Search and Extraction Technologies Unlimited. Professor Varma is quick to add that it was essentially a single company with brilliant ideas in different avatars. Unfortunately, they failed to monetize the technology. Among the difficulties, Professor Varma mentions being based in India without access to the international market. “The truth is that no other search engine other than Google has made money,” he says emphatically.

Veooz

With SETU, the expansion plan consisted of creating demos of semantic search technology in two domains, namely education and sentiment analysis. For the education domain, a tablet called MyDrona was created. It could create relevant and interesting digital content for any syllabus, class or curriculum from just a table of contents. It was meant to be simple in terms of access as well as usage. “Any person could buy the device, load content into its memory card and use it”, says the professor. It was an innovative technology, but the initial commitment to a low price, along with mounting hardware costs, meant it did not translate into monetary benefits. A refined version of the second demo, focussing on sentiment analysis, was in the domain of news. At first labelled Veooz (pronounced as ‘Views’), it has since been renamed NewsPlus. Available as a downloadable app, this sophisticated semantic search technology searches the entire Web for news sources in 11 languages. It monitors all social media platforms to develop insights on trending news stories, ranking them by credibility. The professor dubs it a personalized news aggregator created algorithmically. Addressing the current menace of fake news, he says that thanks to its algorithmic strength, it can avoid the fake stories that have become commonplace.

Expansion

The technology that the team has been involved with has always been sophisticated and cutting edge. It is what drew Srini Koppolu, the first MD of Microsoft’s India Development Center, on board, initially as an investor and currently in the role of CEO and Chairman. With its expansion, NewsPlus has since moved away from the IIIT-H campus into an office space in Gachibowli. Prof. Varma says that the demographics reveal that it is most popular among 18- to 32-year-olds. Most users set up their news preferences in two languages, typically their mother tongue and either English or Hindi. While terming news consumption patterns as ever-changing and futuristic, Prof. Varma lists the team’s immediate vision: expansion in languages across the world, creating a citizen-journalist platform “for anyone who has an itch to contribute”, and allowing high-quality content creators such as bloggers to plug their content into the NewsPlus platform.