Apple hiring more Siri engineers, working on evolving API, features, languages
Collecting voices to improve results
Google initiated a telephone-based GOOG-411 service in 2007 to provide free, automated, voice-based directory assistance over the phone. After gathering enough data through the service, Google shut the service down last year. Nuance similarly offers a free voice dictation app that provides the company with a way to sample voices. Microsoft, however, is at a disadvantage with Windows Phone because its user base is extremely small and appears not to be gaining any traction in the market, limiting the volume and range of real-world samples it can use to improve its service.
While Google has added Voice Actions to Android as a curiosity, Apple's Siri not only signals an intent to deliver an entirely new natural voice interface complementing the multitouch screen of iOS devices, but also threatens Google's middleman status as a search engine. With users performing mobile tasks via apps and with Siri via voice, there's little opportunity left for Google to sell search placement via a conventional page of web results, the area where the company makes most of its money.
Further, Google can't copy Siri on Android without similarly giving up its paid search business model there, and it can't deeply integrate a Siri of its own into iOS, as the only public API for apps on iOS is sandboxed and unable to tightly integrate with other services. This gives Apple a competitive edge with Siri as a hardware maker, rather than solely a licensed platform vendor like Google or Microsoft.
The first one isn't free
Several months prior to Apple's release of Siri in iOS 5, reader Jonathan Truelsen in Denmark reported to AppleInsider that a small team of Americans was recruiting adult native speakers to record common verbal commands in their language, with participants earning a stipend for the 2.5-hour recording session.
The group, promoted via a Facebook event as "the voice project," was not advertised as being sponsored by Apple, but one person who was involved with the project did admit that the company was conducting the sessions, and that the recordings might be incorporated into iOS 5. The recordings were reportedly performed with two iPhone 3G units using Bluetooth headsets.
The commands recorded included checking flight statuses; going to the Internet; checking signal strength; turning Bluetooth on and off; finding local companies, restaurants, pubs and cafés; asking scientific and history questions; calling phone numbers and contacts; and sending an SMS to a contact or number. "Sometimes you had [to] make up your own text," Truelsen stated.
After collecting enough paid samples for launch languages, Apple brought Siri to market in German, French and English, with variants targeting American, British and Australian accents. Apple plans to expand the languages Siri can understand (including languages like Danish that the company has already apparently paid to sample), but is also working to expand its core functionality, including the range of external services it can query for information.
Apple has already launched at least one external app (Find My Friends) that ties into Siri, even though the app isn't bundled in iOS 5 or shipping on the iPhone 4S. This suggests that the company has future plans to open up Siri to other third parties. Doing so will require user interface integration, however, just as Apple has added Siri-integrated displays and animations that put Notes, Calendars, Contacts, Maps and other connected services right into Siri's response page.
At the same time, Apple is also using iPhone 4S users' responses to hone the accuracy of Siri itself. And as it builds server capacity for Siri, it seems likely the company will take its service to other iOS devices and the desktop Mac platform as well, potentially even working Siri into the living room as a voice-based assistant for Apple TV, as some analysts have speculated.
"If the rumors of a speech-enabled Apple TV are true," Maison added, "then Siri will soon have other challenges. For example, far-field speech recognition is notoriously more difficult than with close-talking microphones. She had better take a head start with the iPhone 4S."