'Computer, on screen!': A look at Google's voice recognition engine

By Tim Conneally | Published June 16, 2009, 3:08 PM

Google's voice recognition technology entered the mobile sector with voice-powered search applications for iPhone, Android and BlackBerry. Naturally, Google's own mobile operating system, Android, has begun to reap special benefits from the technology through some new voice-enabled features.

Yesterday, an optional update to Android's native Google Maps application gave the software speech recognition capabilities. Addresses, business names, and attractions can all be searched by spoken word. The app is now one of several that tap into Google's speech recognition engine. Another is the voice-to-text app that recently turned up in the Android Market, simply named Voice Text for Android, which lets users dictate text messages.

While Android is inching toward full voice command, it has certainly not reached the point where natural speech is easily transcribed. The engine requires a slower, more deliberate vocal cadence, which is easy to get accustomed to, but accuracy suffers even when the speaker's diction is careful. Rare and odd words are easily detected, but the most common ones are consistently misrecognized, especially if they contain fricative phonemes.

Of the approximately 42 phonemes in American English, fricatives constitute a small but prevalent handful. These include /f/ as in "fish," /v/ as in "very," the voiceless /th/ as in "thing," and the voiced /th/ as in "this." In our tests of the software with three different microphones, these sounds almost always registered incorrectly.

In Maps, a user's potential input is much more constrained than it would be in text message dictation. In Betanews dictation tests yesterday, for example, the tricky statement "think this through," which we chose for its potentially common appearance, its three interdental fricatives, and its easily confused homophone (through/threw), stumped the text transcriber a grand total of twenty times, returning results such as "Sync vs. Subaru" and "St. Francis School." Unfortunately, a seemingly simple sentence can become frustrating to dictate when words like "then" and "them" are almost never understood.

Fortunately, such terms are less likely to turn up in Maps.

Another problem in Maps comes up when looking for streets and towns with American Indian or adopted foreign-language names, which are common throughout the United States. The town of Hauppauge, New York, which many techies know for the computer company of the same name, is practically impossible to find through voice search. The engine turned up such things as "Hot Dog New York" and "Paul Blog, Long Island" in our attempts. Attempting such names as "Quaqanantuck Lane" and "Napeague Meadow" sometimes resulted in hilarious misinterpretations.

And this problem is not strictly an American one. Google said yesterday that the software recognizes North American, British, and Australian English pronunciation, but when nearby locations can have names drawn from any number of indigenous languages (Whip-ma-Whop-ma-Gate, anyone?), the engine faces recurring difficulties.

Comments

I used Apple's voice recognition with my Mac OS 8.x machine back in 1997 and 1998 and it almost always worked fine with my mixed accent. I'm surprised that Google isn't doing better, but then, is Apple doing as well on iPhone?

Google might try something much simpler, such as endowing their maps with correct locations. Seeing the marker 5 blocks from where it should be makes life interesting in finding a building.

...in finding a building?

Try using it to find the bathroom.

...boy was my neighbor surprised. :p
