'Computer, on screen!': A look at Google's voice recognition engine

By Tim Conneally | Published June 16, 2009, 3:08 PM


Capt. Picard reads an LCARS printout from Star Trek: First Contact

Google's voice recognition technology took to the mobile sector with voice-powered search applications for iPhone, Android and BlackBerry. Naturally, Google's own mobile operating system Android has begun to reap special benefits of the powerful technology with some new voice-enabled features.

Yesterday, an unforced update to Android's native Google Maps application endowed the software with speech recognition capabilities. Addresses, business names, and attractions can all be searched by spoken word. The app is now one of several that tap into Google's speech recognition engine, such as the voice-to-text app that recently turned up in the Android Market, simply named Voice Text for Android. That app allows users to dictate text messages.

While Android is inching toward full voice command, it has certainly not reached the point where natural speech will be easily transcribed. The engine requires a slower and more deliberate vocal cadence, which is easy to get accustomed to, but it suffers even when the speaker's diction is conscientious. Rare and odd words are easily detected, but the most common ones consistently suffer, especially if they contain fricative phonemes.

Of the approximately 42 phonemes in American English, fricatives constitute a small but prevalent handful. These include /f/ as in "fish," /v/ as in "very," the voiceless /th/ as in "thing," and the voiced /th/ as in "this." In our tests of the software with three different microphones, these sounds almost always registered incorrectly.

In Maps, a user's potential input is much more rigid than it would be in text message dictation. In Betanews tests yesterday, for example, the tricky statement "think this through," which we chose for its potentially common appearance, its three interdental fricatives, and its potentially misspelled homophone (through/threw), stumped the text transcriber a grand total of twenty times, returning results such as "Sync vs. Subaru" and "St. Francis School." Unfortunately, a seemingly simple sentence can become frustrating when words like "then" and "them" can almost never be understood.

Fortunately in Maps, such terms are less likely to turn up.

Another problem in Maps comes up when looking for streets and towns with American Indian or adopted foreign-language names, which are so common throughout the United States. The town of Hauppauge, New York, which many techies know for the computer company of the same name, is practically impossible to find through voice search. The engine turned up such things as "Hot Dog New York" and "Paul Blog, Long Island" in our attempts. Attempting such names as "Quaqanantuck Lane" and "Napeague Meadow" sometimes resulted in hilarious misinterpretations.

And this problem is not strictly an American one either. Google said yesterday that the software recognizes North American, British, and Australian English pronunciation, but when our nearby locations can have names in any number of indigenous languages (Whip-ma-Whop-ma-Gate, anyone?), the engine is presented with recurring difficulties.

Comments


I used Apple's voice recognition with my Mac OS 8.x machine back in 1997 and 1998 and it almost always worked fine with my mixed accent. I'm surprised that Google isn't doing better, but then, is Apple doing as well on iPhone?

Google might try something much simpler, such as endowing their maps with correct locations. Seeing the marker 5 blocks from where it should be makes life interesting in finding a building.


...in finding a building?

Try using it to find the bathroom.

...boy was my neighbor surprised. :p

