Almost two years after Apple introduced FaceTime, Google and Microsoft are battling to introduce their own video chat acquisitions as new Internet web standards, while FaceTime remains proprietary to Apple.
FaceTime, two years later
Apple's Steve Jobs first introduced FaceTime video conferencing in June 2010 at the company's Worldwide Developer Conference as a key new feature of iPhone 4, noting at the time that Apple intended to release the technology as an open specification that other mobile vendors could license to create compatible video conferencing clients.
The company subsequently added FaceTime support to its new camera-bearing iPod touch released that September, announced that FaceTime was coming to the Mac in October (delivering it in February) and brought FaceTime to the new iPad 2 the next spring. FaceTime became so central to Apple's marketing that the company even began calling its iOS and Mac webcams "FaceTime cameras."
Now two years old, Apple's FaceTime feature has never become an openly published standard as Jobs promised it would be. While the technology is based upon a series of open standards, interoperability with third party implementations is not currently possible, for reasons described below.
Google and Microsoft are now wrangling to position two different video conferencing standards (each based on acquired video chat technologies) as open specifications with the potential to rival FaceTime. Google's is named WebRTC, while Microsoft just submitted a proposal named CU-RTC-Web.
Open standards in Apple's proprietary FaceTime
Apple has promoted the fact that it used a series of open protocols to create FaceTime: it incorporates the International Organization for Standardization's MPEG AAC and H.264 audio and video codecs; support for the IETF's (Internet Engineering Task Force) SIP (Session Initiation Protocol) for call setup; its RTP (Real-time Transport Protocol) and SRTP (Secure RTP) for encrypted video delivery; and its ICE (Interactive Connectivity Establishment), TURN (Traversal Using Relays around NAT) and STUN (Session Traversal Utilities for NAT) for handling firewalls and NAT (Network Address Translation, which is commonly used in home routers to create private IP addresses but which creates a hurdle for video chat clients).
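To make the NAT hurdle a little more concrete: a client behind a home router typically starts by asking a STUN server what its public address looks like from the outside. The following is a minimal sketch of the 20-byte Binding Request header laid out in the IETF's STUN specification (RFC 5389). The function name `build_binding_request` is a hypothetical chosen for illustration, and nothing here is meant to describe how FaceTime itself builds or sends these packets.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001   # message type for a Binding Request
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request that carries no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # 96-bit random transaction ID
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # Header: 16-bit type, 16-bit body length (0 here), 32-bit magic cookie,
    # all in network byte order.
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id
```

A client would send these bytes over UDP to a STUN server and read its public address and port from the reply; ICE then uses that answer, with TURN relays as a fallback, to find a path between two callers.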
Despite using Internet standards to develop FaceTime, however, the detailed inner workings of the feature have never been published, neither as a privately licensed technology nor as an open standard (with or without associated licensing fees).
Individuals examining the communications FaceTime uses have learned that, among other things, the system uses Apple's own unique security system for user authentication rather than relying upon SIP's built in features.
Before ever starting a videoconferencing session, Apple checks to see if the client device can prove itself as legitimate. If it can't, it gets disconnected. Apple's FaceTime client apps are hardwired to Apple's FaceTime accounts, similar to how Google's Gmail app is optimized to only work with one type of email account: Google's.
Apple's FaceTime authentication relies upon a client-side security key cryptographically signed by Apple, making it as impossible to create an unauthorized third party FaceTime client as it would be to create a third party App Store accessible to other iPhone users (or to force Gmail to access email directly from Microsoft's Exchange Server).
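The gatekeeping idea described here can be sketched in a few lines. This is only an illustration under loud assumptions: Apple has never published its scheme, so the sketch substitutes a shared-secret HMAC for whatever signature system Apple actually uses, and `OPERATOR_KEY`, `sign_client` and `admit` are hypothetical names invented for the example.

```python
import hashlib
import hmac

# Hypothetical secret held only by the service operator. It stands in for a
# real signing key; an actual deployment would likely use public-key signatures.
OPERATOR_KEY = b"example-operator-key"


def sign_client(client_id: bytes, key: bytes = OPERATOR_KEY) -> bytes:
    """Issue a credential for a client the operator has approved."""
    return hmac.new(key, client_id, hashlib.sha256).digest()


def admit(client_id: bytes, credential: bytes, key: bytes = OPERATOR_KEY) -> bool:
    """Return True only if the credential was issued with the operator's key."""
    expected = hmac.new(key, client_id, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, credential)
```

A client holding a credential issued by the operator is admitted, while a third party client that cannot produce one is refused before any call setup begins, which is the behavior the article describes.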
Apple's "walled garden" has walls, but it's also a garden.
In addition to having secret technological elements, FaceTime is also tied into Apple's Push Notification Server infrastructure, meaning that calls between FaceTime clients have to interact with the company's servers in order to contact other FaceTime users. In part, this is required to bridge the gap between telephone networks and Internet devices, something that existing video-telephony or PC-based video chat systems never really attempted.
As a result, FaceTime's proprietary, centralized infrastructure is conceptually similar to the instant messaging services offered by AOL, Microsoft or Yahoo, as opposed to an entirely open network like Internet email, where any vendor can set up servers that can deliver messages to other servers using established, documented protocol specifications, with no intermediary governing the entire system.
The downside to this design is that FaceTime users can't video chat with Android or Windows PC users, because third parties can't reverse engineer their own FaceTime-compatible clients, just as Android developers can't create an unauthorized client to pose questions to Siri. In both cases, Apple's servers demand security credentials and refuse connections if they aren't provided.
The upside is that FaceTime users can't be inundated with spam video-call requests or robo-calls, spoofed incoming call requests pretending to be another party, or have their calls intercepted by spies, the same way that email spam, spoofs and snooping are commonplace and difficult to guard against.
Open to whom?
Apple could certainly license FaceTime to other vendors, just as it currently licenses its proprietary, securely authenticated software protocols such as AirPlay (for a fee) or as it provides secure, encrypted access to app signatures for third party developers (as cheaply as free). So far, Apple has demonstrated no public interest in doing this (apart from Jobs' original promise that it eventually would).
In order to make FaceTime freely open as an interoperable technology standard, the same way Messages on iOS or OS X (née iChat) interoperates with any XMPP instant messaging system (such as Google Talk, Facebook, or any other Jabber IM server), Apple would have to relax its authentication system to allow users to sign into and authenticate with other videoconferencing providers. This would open Pandora's Box to the spam market, just like email.
Currently, there are no real alternative video conferencing services that work similarly enough to FaceTime for Apple to let its users connect with them (certainly in part because portions of FaceTime are still a secret). But there are also far fewer potential vendors for Apple to work with than there were in 2010 when Jobs announced his plans to open FaceTime.
Since Apple's late 2010 release of FaceTime, the collapse of RIM, Palm, Nokia and Windows Mobile has dramatically shifted the mobile playing field, leaving Apple with no other significant mobile vendor to license its Internet-standards based FaceTime protocol to apart from Google's Android or (charitably) Microsoft's Windows Phone, both of which are now pursuing their own competitive video chat systems.
Microsoft's Skype
In May 2011, Microsoft acquired Skype for $8.5 billion to aggressively enter the video conferencing market. However, Skype is fundamentally, technologically different from FaceTime, using a wholly proprietary system of call setup, authentication and peer-to-peer transmission that is not even remotely compatible with FaceTime on any level. Trying to integrate the two would be as difficult as playing a Nintendo DS game cartridge in your car's CD player.
Microsoft already offers a third party iOS client for Skype (which existed before its acquisition), so both iOS and Mac users can use Skype; they just need to use a separate app to do so. More importantly, Skype users can't call FaceTime clients, and vice-versa.
Apple hasn't shown any interest in connecting to other non-standard video conferencing systems within FaceTime. It originally supported AIM's proprietary video chats within iChat AV on the desktop, but never extended support for AOL's proprietary video IM system to iOS. It has since lost interest in promoting interoperability with AIM users.
Apple also supported Google Talk within iChat on the Mac (via the open nature of XMPP/Jabber), but not for video conferencing from iOS. Google has also changed its strategic direction with Google Talk video conferencing, as noted below.
Google offers WebRTC as a FaceTime alternative
As with Microsoft, rather than expressing interest in FaceTime, Google has focused upon its own investment of $68.2 million in Global IP Solutions, which it announced the intention to acquire in May 2010, just weeks before Apple unveiled FaceTime. Google completed the acquisition in January 2011.
After acquiring GIPS, Google began shifting its Google Talk video conferencing strategy to the web-based, JavaScript implementation of SIP that GIPS had developed. Google calls the technology WebRTC, and introduced it to the W3C as a proposed open specification for web-based video conferencing that does not require a plugin (as Google's previous web video chat did) in June 2011.
In contrast, Apple has never relied on the web to deliver video conferencing. It has always relied on native apps, first iChat on the Mac, then FaceTime on iOS, and then FaceTime on Mac. Apple has since renamed iChat on the Mac to Messages, which it maintains separately from FaceTime.
Google took GIPS's voice-optimized audio codecs (iSAC and iLBC) and added its own WebM video codec (née VP8) to deliver WebRTC as part of its 2010 strategy to replace H.264 web videos with its own WebM codec.
While WebRTC can technically be made to work with any video codec, including H.264, Google's own clients are naturally designed to use WebM, so any client that wants to work with Google's would have to support Google's WebM. That's of course something Apple has no interest in doing, because its iOS devices have no support for WebM hardware acceleration.
Same same, but different
Technologically, WebRTC isn't nearly as different from FaceTime as Microsoft's Skype is. Both FaceTime and WebRTC are based on SIP for call setup, use RTP for video delivery, and rely upon ICE, TURN and STUN for handling firewalls and NAT.
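The overlap can be tallied directly. The sketch below uses only the protocols and codecs as this article characterizes them, so it is not an authoritative inventory of either system. SRTP is left out because the article mentions it only for FaceTime, and the variable names are illustrative.

```python
# Building blocks as listed in this article; not an exhaustive inventory.
FACETIME = {"SIP", "RTP", "ICE", "TURN", "STUN", "H.264", "AAC"}
WEBRTC = {"SIP", "RTP", "ICE", "TURN", "STUN", "VP8", "iSAC", "iLBC"}

shared = FACETIME & WEBRTC          # pieces both systems are described as using
only_facetime = FACETIME - WEBRTC   # codecs Apple chose
only_webrtc = WEBRTC - FACETIME     # codecs Google is promoting

print("shared:", sorted(shared))
print("FaceTime only:", sorted(only_facetime))
print("WebRTC only:", sorted(only_webrtc))
```

On this accounting, the call setup, transport and NAT traversal pieces line up, and the two systems part ways only at the codec layer.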
However, Google's WebRTC is essentially an experimental open source project being offered to web developers, not a finished product like FaceTime. This makes it more akin to Google Wave compared to Apple's Mail app: one is a complex technology erector set, the other is an easy to use, finished end-user app.
Of course, the difference is that Mail can be configured to use Google's Gmail accounts; there are no equivalent "video conferencing accounts" usable with FaceTime, which is currently hardwired exclusively to Apple's authentication and push notification servers.
Given Google's interest in establishing WebRTC as a browser standard and its ongoing (if stalled) efforts to substitute H.264 with WebM, it does not seem at all likely that Google would be interested in working with Apple to develop licensed FaceTime clients for the web, Chrome OS or Android.
Conversely, Apple does not appear to be interested in working with Google to provide interoperability with iOS and Macs using FaceTime, given that it has taken actions to remove Google from iOS 6 Maps, drop its iOS YouTube app, and won't even natively support Google's social networks in OS X Mountain Lion's Share Sheets.
FaceTime, WebRTC not directly comparable
At the same time, experimental WebRTC apps can already be loaded in the web browser of Macs and iOS devices. In fact, WebRTC (just like Google's WebM) is primarily aimed at web users, so it isn't really accurate to think of it as "Android's equivalent to FaceTime."
One primary obstacle to real world adoption of WebRTC is that Google is working hard to push the use of its acquired codecs, which all lack hardware accelerated support on iOS devices. That means iOS devices aren't optimized to deliver WebM video, and certainly not the high quality video that FaceTime supports via its use of the advanced H.264 codec and its highly efficient on-chip encryption built into every one of the more than 300 million devices running iOS 5.
For Android, Google offers Google Talk video, voice and text chat, which is peer-to-peer technology based on Jabber (like Apple's original iChat AV). The company also provides group video chat for Android within its Google+ app (also available for iOS). This summer, the company announced it would be moving Google Talk to "Google+ Hangouts technology," in an effort to consolidate its offerings.
"Unlike the old [Google Talk] video chat," Google stated in its announcement, "which was based on peer-to-peer technology, Hangouts utilize the power of Google's network to deliver higher reliability and enhanced quality."
It's not clear if Google will shut down Google Talk and shift its video conferencing product for Android exclusively to Google+, but the move indicates Google is interested in a centralized, authenticated system more like FaceTime, albeit one integrated with "screen sharing and integrated Google Docs collaboration," similar to iChat Theater (something that's not currently supported in FaceTime).
Apple hasn't yet revealed its hand, but it is likely to eventually add iChat Theater features (including support for screen sharing and document collaboration) to FaceTime on iOS, and potentially also bring support for proprietary instant messenger "iChat" features, including Jabber, Yahoo and AIM chat, melding Messages with FaceTime.
Microsoft proposes CU-RTC-Web as alternative to WebRTC
While Microsoft initially offered some support for the concept of WebRTC, it has recently stated that Google's submission shows "no signs of offering real world interoperability with existing VoIP phones, and mobile phones, from behind firewalls and across routers and instead focuses on video communication between web browsers under ideal conditions."
It has instead proposed its own CU-RTC-Web (Customizable, Ubiquitous Real Time Communication over the Web) specification to the W3C standards body. Microsoft's specification is based on the HTML5 getUserMedia API rather than SIP (which Microsoft claims is less suitable for use in web apps because it doesn't support stateless connections), and is intended to work with more codecs than those Google is pushing.
Unsurprisingly, Microsoft's rival submission is authored by its Skype team, including its principal architect Matthew Kaufman, whom the company describes as the "inventor of RTMFP, the most widely used browser-to-browser RTC protocol on the web."
Kaufman's Real-Time Media Flow Protocol was developed at Adobe as a peer-to-peer media distribution protocol for Flash Media Server before he joined Microsoft to work on Skype.
Microsoft's rival protocol was just introduced this week, but it may have some impact in the formation of a unified specification for video conferencing on the web if Google and other browser vendors see value in it. Microsoft had formerly voiced interest in Google's WebRTC, presumably seeing it as a potential way to deliver Skype within the browser.
Now that Microsoft has splintered off with its own technology proposal, Google's WebRTC remains backed by the company's own Chrome, Mozilla's Firefox and the Opera browser, the three parties who also support Google's WebM video codec.
Apple remains notably absent from the discussion surrounding WebRTC, focusing instead on delivering FaceTime via native apps on both iOS and OS X rather than attempting to deliver browser-based video conferencing.
Given that the only other significantly profitable mobile device maker is now Apple's arch-rival Samsung, it does not appear likely that Apple's FaceTime will extend widely beyond the Mac and iOS platforms any time soon, despite Jobs' initial assurance that Apple would aggressively license the technology to promote it as the world's standard for video conferencing.