'Computer, on screen!': A look at Google's voice recognition engine

By Tim Conneally | Published June 16, 2009, 3:08 PM

Image: Capt. Picard reads an LCARS printout from Star Trek: First Contact

Google's voice recognition technology took to the mobile sector with voice-powered search applications for iPhone, Android and BlackBerry. Naturally, Google's own mobile operating system, Android, has begun to reap special benefits from the powerful technology with some new voice-enabled features.

Yesterday, an unforced update to Android's native Google Maps application endowed the software with speech recognition capabilities. Addresses, business names, and attractions can all be searched by spoken word. The app is now one of several that tap into Google's speech recognition engine, along with a voice-to-text app that recently turned up in the Android Market, simply named Voice Text for Android. That app lets users dictate text messages.

While Android is inching toward full voice command, it has certainly not reached the point where natural speech will be easily transcribed. The engine requires a slower and more deliberate vocal cadence, which is easy to get accustomed to, but accuracy suffers even when the speaker's diction is conscientious. Rare and odd words are easily detected, but the most common ones consistently suffer, especially if they contain fricative phonemes.

Of the approximately 42 phonemes in American English, fricatives constitute a small but prevalent handful. These include the /f/ as in "fish," the /v/ as in "very," the voiceless /th/ as in "thing," and the voiced /th/ as in "this." In our tests of the software with three different microphones, these sounds almost always registered incorrectly.

In Maps, a user's potential input is much more rigid than it would be in text message dictation. In Betanews tests yesterday, the tricky statement "think this through," which we chose for its potentially common appearance, its three interdental fricatives, and its potentially misspelled homophone (through/threw), stumped the text transcriber a grand total of twenty times, returning results such as "Sync vs. Subaru" and "St. Francis School." Unfortunately, a seemingly simple sentence can become frustrating when words like "then" and "them" can almost never be understood.

Fortunately, in Maps, such terms are less likely to turn up.

Another problem in Maps comes up when looking for streets and towns with American Indian or adopted foreign language names, which are common throughout the United States. The town of Hauppauge, New York, which many techies know for the computer company of the same name, is practically impossible to find through voice search. The engine turned up such things as "Hot Dog New York" and "Paul Blog, Long Island" in our attempts. Attempting such names as "Quaqanantuck Lane" and "Napeague Meadow" sometimes resulted in hilarious misinterpretations.

And this problem is not strictly an American one either. Google said yesterday that the software recognizes North American, British, and Australian English pronunciation, but when our nearby locations can have names in any number of indigenous languages (Whip-ma-Whop-ma-Gate, anyone?), the engine is presented with recurring difficulties.

Comments

I used Apple's voice recognition with my Mac OS 8.x machine back in 1997 and 1998 and it almost always worked fine with my mixed accent. I'm surprised that Google isn't doing better, but then, is Apple doing as well on iPhone?

Google might try something much simpler, such as endowing their maps with correct locations. Seeing the marker 5 blocks from where it should be makes life interesting in finding a building.

...in finding a building?

Try using it to find the bathroom.

...boy was my neighbor surprised. :p
