Investigation into Siri reveals how the iPhone 4S service talks to Apple
A mobile developer investigating Apple's Siri voice assistant has dissected the service's protocol to develop tools for playing with the service outside of the iPhone 4S.
Testing by Applidium notes that iPhone 4S uses standard HTTPS network requests to communicate with Apple's servers, but sends data using an "ACE" command rather than regular web GET requests.
Each Siri request an iPhone 4S makes also involves a unique host identifier that appears to be based on the hardware UUID, preventing unauthorized devices from sending requests to Apple's servers.
Applidium reports some success in copying an iPhone 4S host identifier into requests sent from other devices, including a test Mac environment. By examining how iPhone 4S packages speech recognition requests, the developer was able to send a similarly packaged request and obtain a correct response.
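As a rough illustration of the request shape described above, the following Python sketch builds the raw bytes of an HTTP-style request that uses an "ACE" method in place of GET and carries a host identifier. It is a minimal sketch, not Applidium's tool: the path, the header name X-Host-Identifier, the placeholder hostname and the HTTP version are all assumptions made for illustration, and the identifier here is a random UUID rather than one taken from a real device. An actual request would also travel over an HTTPS connection, which this sketch does not open.

```python
import uuid


def build_ace_request(host_id: str, body: bytes, path: str = "/ace") -> bytes:
    """Build raw request bytes that use a custom 'ACE' method instead of GET.

    The path, hostname and header name are hypothetical placeholders;
    the article does not specify them.
    """
    header_lines = [
        f"ACE {path} HTTP/1.0",              # custom method in the request line
        "Host: siri.example.invalid",        # placeholder hostname (assumption)
        f"X-Host-Identifier: {host_id}",     # hypothetical header name
        f"Content-Length: {len(body)}",
    ]
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# A random UUID stands in for the per-device host identifier.
host_id = str(uuid.uuid4()).upper()
request = build_ace_request(host_id, b"payload")
```

Because the identifier is just a field in the request, copying it from one device into a request sent from another, as Applidium describes, amounts to passing a different host_id value.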
The testing shows that Siri sends raw audio captures of the user's voice, compressed with Speex, an audio codec optimized for VoIP. Previously, it had been speculated that iPhone 4S was preprocessing the audio and sending only the results to Apple's servers.
While Siri may perform other preprocessing tasks that use the additional horsepower of the iPhone 4S, Applidium's discovery indicates that any iPhone should be able to support at least Siri's basic voice recognition features, although Apple has indicated that it has no plans to release such capabilities for earlier iOS 5 models, including iPhone 4 and iPhone 3GS.
Both models can support third-party speech-to-text services, but Apple offers no way in iOS to integrate such third-party services into any app system-wide, meaning that users have to dictate into one app and then copy and paste the results elsewhere. Both Android and Windows Phone 7 offer system-wide, integrated voice recognition features.
Siri, however, goes far beyond simple voice recognition. Rather than just converting audio to text, Siri evaluates the meaning of requests and maintains an understanding of the user's relationships with specific contacts, along with a contextual session covering the location and other details of a request.
So far, Applidium's investigation has revealed that Siri packages requests in compressed property lists, but further exploration of the protocol is hampered by a number of issues, including the complexity of requests, the fact that they are tied to a hardware key, and that they are subject to change.
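To make the idea of a compressed property list concrete, here is a minimal Python sketch that serializes a dictionary as a binary property list and compresses it, then reverses the process. The use of zlib and the binary plist format are assumptions, since the report says only that the property lists are compressed, and the keys in the example payload are invented for illustration rather than taken from Siri's actual protocol.

```python
import plistlib
import zlib


def pack_request(payload: dict) -> bytes:
    """Serialize a dict as a binary property list, then compress it.

    zlib is an assumption; the article does not name the compression scheme.
    """
    raw = plistlib.dumps(payload, fmt=plistlib.FMT_BINARY)
    return zlib.compress(raw)


def unpack_request(blob: bytes) -> dict:
    """Reverse pack_request: decompress, then parse the property list."""
    return plistlib.loads(zlib.decompress(blob))


# Hypothetical payload; these keys are illustrative only.
example = {"class": "ExampleRequest", "properties": {"text": "hello"}}
blob = pack_request(example)
restored = unpack_request(blob)
```

Anyone inspecting captured traffic would run the unpacking step first, which is why the complexity of the decoded requests, rather than the packaging itself, is the main obstacle to further exploration.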
Apple could at any time stop responding to a given hardware key were it to suspect that it was being used to exploit its servers; additionally, because the Siri service is proprietary to Apple, the company can change how it transmits data rather quickly by simply sending out a client update.
Applidium says "anyone could now write an Android app that uses the real Siri! Or use Siri on an iPad!" However, in order to access Siri at all, a user would have to sniff out the unique user key of an actual iPhone 4S, and then reuse that key until it expired or was blocked by Apple.
Apple has been perfecting its Siri service as a "beta" feature exclusive to the iPhone 4S, but has experienced some downtime in ramping up services to accommodate the demands of millions of users who have rushed to buy the latest iPhone model. The new service also makes use of hardware unique to the iPhone 4S.
It is expected that Siri will eventually find its way to new models of the iPod touch, iPad and perhaps even Macs, with some speculating that it could eventually serve as a living room interface employed by Apple TV, doing away with the need for a button or touch-based remote control for navigating television programming.