Overcoming Language Barriers
By DINESH C. SHARMA


American IT companies focus research on technologies and products specific to the Indian market

"Namaskar! Aaj budhwar hai. Yeh ek suhani subah hai" ("Greetings! Today is Wednesday. It is a pleasant morning"), Ashish Verma says into a microphone attached to his computer, and the words pop up in Devanagari on the wide computer screen. Verma works at the technology center of the IBM India Research Laboratory (IRL), located at the Indian Institute of Technology (IIT) in New Delhi. The technology he is demonstrating is known as speech recognition. It helps those unfamiliar with the English language and keyboards to use the computer easily. Right now, the system can recognize 65,000 Hindi words, and this vocabulary will ultimately be increased to 125,000 words.

Speech-to-text is one of the key areas of human-computer interaction research that IRL is tackling, along with electronic commerce, e-governance, bioinformatics, unstructured information management and e-business on demand. Being part of IBM's global research effort, IRL is working on technology areas that are critical for IBM's global corporate and commercial goals. At the same time, it has research projects designed specifically for Indian markets. IBM is not alone in doing this. Hewlett Packard (HP) is engaged in similar activity at its laboratory in Bangalore, where software giant Microsoft too is setting up its research center to develop products tailored for the multilingual Indian environment.

Multiplicity of languages, cultural diversity, low literacy rates, price sensitivity and low usage of personal computers are all challenges to the information technology industry. Because of these factors, the digital divide today is more than a cliché; it is a reality. Although India has carved out a niche for itself in the global software and services markets, domestic consumption of IT products and services remains abysmally low.

"HP realizes that a very significant part of its growth is going to come from rapidly developing economies such as China, India, Russia and many economies in central Europe and the ASEAN (Association of Southeast Asian Nations) region. In order to best take advantage of this growth opportunity, we need to innovate for the customers in these markets, because the needs of these markets are unique," says Ajay Gupta, director of HP Labs India in Bangalore. "It is for this reason HP Labs India has been set up. For real innovation to take place, researchers need to be thoroughly immersed in the context. HP Labs India is executing on this corporate strategy."

At IRL, researchers have extended IBM's ViaVoice speech recognition technology to develop a system for Hindi and "Indian English." This system understands and transcribes human speech with minimal use of keyboards, thereby helping people unfamiliar with computers or the English language. Since there are no standard keyboards available in Indian languages, speech recognition eliminates the need to learn non-standard keyboard mappings. The system has been trained and tested on a large number of speakers from different regions of the country to account for variations in speech.

"'Indian English' is an interesting phenomenon. There are a lot of sounds that are additional in 'Indian English' and word usage is also different. Pronunciation of Indian names also produces different types of sounds. Then there are regional accents. So, all this requires a certain special level of attention," points out Ponani S. Gopalakrishnan, director of IRL.

HP, IBM and Microsoft are making products and technology more friendly for the average Indian.

In view of the growth of phone banking and inquiry assistance services, IRL is also working on a system that can recognize a speaker's voice in Hindi and other Indian languages. This system will use speech as input, rather than the digits punched into a telephone keypad that interactive voice response systems currently rely on. IRL researchers have developed a prototype acoustic model for Hindi to decode speech given in response to a prompt.

Although most Indian languages pose challenges similar to Hindi, IRL does not plan to extend its work to all such languages. "We want to prove feasibility and effectiveness of the system in certain core languages. We will also look at partners from the academic community to come together for this program. A lot of our research agenda is driven by our commercial agenda. It depends on what the marketplace is asking for," says Gopalakrishnan. However, core technologies developed at IRL may be applicable to other emerging markets as well, though the language part remains specific to India.

Like IRL, HP Labs India too is trying to break the English language barrier by exploring the use of handwriting as an input. Millions of forms, like the ones for railway reservations, are filled out every day and in different Indian languages. Therefore, the lab is developing a technology that can recognize handwriting as well as capture image data for further processing. A prototype of such a script-independent device called Script Mail has been developed for sending and receiving handwritten e-mails.

Voice or speech is one of the most prevalent forms of communication. HP Labs India has come up with a telephone-based railway inquiry system providing information on ticket availability and the status of wait-listed tickets for trains running between four Indian metros. In the next stage, online booking may be developed. HP Labs took a general-purpose engine for speech recognition and synthesis, developed by a team of researchers worldwide and freely available to the software professional community, and combined it with its own recognition models specific to Hindi and "Indian English." The resulting system works for both and is accent-independent: it can accept and process a variety of Indian accents and speaking styles.

Yet another way to overcome problems of low literacy is a text-to-speech system that can let users listen to any information rather than reading it off the screen. Such a system can provide voice output in contexts where visual interface may not be appropriate. HP Labs India is developing text-to-speech systems for Hindi and "Indian English."

Microsoft has also trained its research attention on the multilingual Indian puzzle. Its research lab, scheduled to become operational early this year in Bangalore, will develop software technologies that could help content creation, storage, search, access and interaction in multiple languages by deploying natural language processing and speech recognition technologies. Another key research area will be to understand the role of technology in emerging markets including countries in Southeast Asia and Latin America and come up with innovations to meet their special requirements.

All of these corporate research labs have strong ties with local academic and engineering communities. IRL is physically located on the IIT Delhi campus and interacts with the institute. HP Labs India has set up a separate laboratory at IIT Madras in Chennai, while there is a Microsoft lab working at IIT Kharagpur. Microsoft hopes that its new research lab in Bangalore will boost its ongoing university relations program.

"One of the unique aspects of Indian researchers is that they are tremendously concerned about the practical and social implications of their work. They're trying to solve immediate problems facing Indian society today," says Mythreyee Ganapathy, who manages Microsoft's university relations program started in 2001. "What is interesting is that some of these problems are universal, and solutions developed by these researchers will be applicable in many other developing countries."

In fact, a strong technical and academic environment is one of the key reasons why American companies set up research labs in India. This allows them to tap into high-caliber local talent as well as provide a channel for IT professionals of Indian origin who want to come back to work in India. Many people working in such labs have come from parent companies in the United States or from other American companies or academic institutions. IRL's Gopalakrishnan, HP's Ajay Gupta and P. Anandan, who will head the Microsoft lab, are examples.

Cost is another major driver of outsourcing R&D to India, says Ashish Gupta, country head of the U.S.-based consultancy Evalueserve. This firm helps American technology companies evaluate India as an R&D destination and tackle issues like location, size and hiring of local talent.

In the past five years, more than a hundred U.S. technology firms have set up development and engineering centers in Bangalore, Hyderabad, Pune and Gurgaon. They have invested more than one billion dollars so far. And this investment seems to be paying off, judging by the contribution these units make to their parent companies' global product lines and their creation of intellectual property. Over 1,000 patent applications were filed with the U.S. Patent and Trademark Office by the top dozen entities of foreign companies in India from their inception until September 2002, according to a survey by Evalueserve. This includes about 120 filed by IBM from India. HP Labs says more than 20 of its innovations are in the process of being patented.

With Bell Labs' recent announcement of starting an R&D center in Bangalore, the development of products focused on local market demands seems quite vibrant. In the next couple of years, more American IT companies are likely to enter the Indian market.

About the Author: Dinesh C. Sharma is a New Delhi-based science and technology journalist who writes for CNET News.com (USA) and The Lancet (U.K.).