Amazon, Google, Microsoft and Apple all have voice assistants vying for attention, and all are moving the concept of computer-human voice interaction forward. AppleInsider offers an abbreviated history of how we got here and the capabilities of these assistants and their devices in anticipation of HomePod. What do these things do? What do people use them for? And what is Apple doing differently?
Amazon's Alexa assistant began life on the Amazon Fire Phone. The idea was to have a voice assistant that competed with Siri on the iPhone and iPad. The Fire Phone didn't literally catch fire (unlike some phones), but it may as well have been named for the money Amazon set ablaze in developing and shipping it. The one good outcome for Amazon has been the phoenix that is Alexa, rising from the ashes of the Fire Phone.
Alexa and Echo get conflated a lot. Echo is the name for Amazon's line of hardware that runs a voice assistant, and Alexa is the anthropomorphized female voice that talks through it. One of the things that made Alexa such a hit at CES 2018 was the approach Amazon has taken with third-party developers.
Amazon has two separate approaches to developers. The first prong is enabling developers to make skills for Alexa. Skills are programs that enhance the vocabulary and capabilities of Alexa. People use them to create smart home skills, flash briefings, and custom skills that do all kinds of things and can integrate with Alexa's Reminders and shopping lists. The developer kit for building skills is freely available from developer.amazon.com.
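To give a sense of how simple the skill model is, here is a minimal sketch of the JSON request/response exchange a custom skill handles, written in plain Python without Amazon's official SDK. The overall request/response shape follows Amazon's documented custom-skill interface, but the intent name `GetFactIntent` and the reply text are hypothetical examples.

```python
# Minimal sketch of an Alexa custom-skill handler. A real skill would run
# this behind AWS Lambda or an HTTPS endpoint; here it's just a function
# that maps a request dict to a response dict.
def handle_request(event: dict) -> dict:
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # User said "Alexa, open [skill name]" with no specific intent.
        speech = "Welcome to the demo skill."
    elif request["type"] == "IntentRequest" and request["intent"]["name"] == "GetFactIntent":
        # User invoked the (hypothetical) GetFactIntent.
        speech = "Here is a fact."
    else:
        speech = "Sorry, I didn't understand that."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }

# Example: a trimmed-down LaunchRequest, keeping only the fields used above.
launch = {"request": {"type": "LaunchRequest"}}
print(handle_request(launch)["response"]["outputSpeech"]["text"])
```

The assistant does all of the speech recognition and natural-language matching in the cloud; the skill itself only ever sees structured JSON like this, which is a big part of why the barrier to entry is so low.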
The second prong of the approach is making Alexa code available on GitHub for people to run on a Raspberry Pi. Amazon does this not because it wants a lot of people running Alexa on a Pi, although that isn't a bad thing. What it really wants is for hardware makers to incorporate Alexa into their products, like the Ecobee4 smart thermostat, which has Alexa inside. This seems to have worked, both for raising third-party hardware interest and for making it easy for home-brew developers to experiment with writing skills for Alexa.
Tim Cook questioned this strategy, saying, "Competition makes all of us better and I welcome it. (But) if you are both trying to license something and compete with your licensees, this is a difficult model and it remains to be seen if it can be successful or not."
So far, it's worked for Amazon. The company claims, without revealing actual numbers, that the Echo Dot was No. 1 in sales over the holidays, selling "millions." The Alexa app was No. 1 on the App Store over the holidays, hit second place on the U.K. App Store, and reached fifth in Germany and Austria, suggesting strong sales in those markets as well and supporting Amazon's contention.
How Alexa Stacks Up
Alexa has been growing in capability at an astounding pace. There are currently 25,000 skills available, although, as in any app store, only a subset are really interesting or useful. The number highlights Amazon's success at making skills easy to create and at winning a good reception from developers and from people who want to learn to develop for Alexa.
Most users may never enable a skill, but Amazon has made it easier than ever to use them on Alexa when that time comes. It used to require intentionally saying, "Alexa, enable [skill name]," but with recent updates, this is no longer necessary. It's now possible to just issue a command that uses the skill, and Alexa will enable the skill first. This works almost perfectly -- except when skill capabilities overlap.
For example, if I want to use a skill named "Israel Radio" to listen to a radio station named Galgalatz, but I already have the TuneIn skill enabled, issuing that command will play the station through TuneIn rather than enabling the other skill. The good news is that, for smarthome skills, it's just a matter of enabling the skill, saying "Alexa, discover devices," and then using them. You can group devices in the Alexa app on a smartphone or tablet, and it works very well. This is why I've said on the AppleInsider Podcast that Amazon is aiming at a voice-first world: the smartphone is there as a bridge, but all of the main interactions happen directly with the device.
One of the questions when you go voice-first is whether the assistant can understand a range of speakers. This was one of the early criticisms of Siri. Siri has gotten better over time, but people who tried it early on and were burned by its inability to grasp a non-American accent may never revisit it. Amazon has done a solid job here, understanding accents in the U.S., the U.K., Germany, India, Belgium, Croatia, Ecuador, Poland, Ukraine, Canada, Austria, and more -- 89 countries in all. Alexa supports languages including U.S. English, U.K. English, Indian English, German, and Japanese. That is, if you get Alexa in Denmark, you speak English to it, not Danish.
What do people use Alexa for primarily, anyway?
The top three uses are playing music, asking general questions, and getting the weather. Timers and reminders came in fourth. Amazon lets you use a wide range of music and audio sources for Alexa, including Spotify, Pandora, TuneIn, iHeartRadio, Prime Music, Amazon Music Unlimited, and more via skills.
Amazon provides lists and reminders, and allows them to be managed by voice, by app, and by a web page. Timers and alarms can only be added and managed on the Echo speaker itself. Obviously, it's Amazon's intention that you use its own services. But Amazon doesn't provide a calendar, so if you want to ask Alexa what's coming up, or issue commands that add new appointments to your schedule, it's important to connect a calendar from another service.
Amazon Alexa supported calendars:
- Apple - Link your iCloud calendar.
- Google - Link your Gmail or G Suite calendar accounts.
- Microsoft Office 365 Calendars - Link your supported company/work calendar. For more information, go to Office 365 Calendars on Alexa.
- Microsoft Outlook.com Calendar - Link personal calendar accounts, including those ending in @outlook.com, @hotmail.com, and @live.com
If you use iCloud calendars, anything added from an Amazon device is then synced with iCloud and added to the Calendar app on iOS and macOS devices.
It's been possible since the spring to control Amazon Fire TV devices by voice from a separate smart speaker, an early miss that Google actually got right. Like Apple TV, Fire TV has a voice-enabled remote with a microphone button, making it possible to search by voice using the Fire TV remote or Fire TV app, just as you would on Apple TV.
That's all well and good, and there's something very convenient about being able to sit down and say, "Alexa, play 'Stranger Things' from Netflix on Fire TV" and have it do just that. Reality is a little different: saying that command will show Stranger Things information with a large play button, but to actually play it, you have to follow up with "Alexa, play on Fire TV," and then it will play the episode. By comparison, Google Home with a Google Chromecast will take the command "Hey Google, play 'Stranger Things' from Netflix on my TV" and immediately begin playing. (In that example, the Chromecast is named TV.)
Multimedia control: trailing Google's efforts, Amazon added control of Fire TV and Fire TV Edition televisions to Alexa's skill set. It's possible to:
- Control Fire TV: "Alexa, [pause, play, resume, stop, fast-forward, rewind] on Fire TV."
- Search movies or TV: "Alexa, search for [movie or TV show title] on Fire TV" or "Alexa, find [movie or TV show title] on Fire TV."
- Find work by a certain actor: "Alexa, show me titles with [actor] on Fire TV."
- Open apps: "Alexa, open [app name] on Fire TV" or "Alexa, launch [app name] on Fire TV."
- Return home: "Alexa, return home."
For Music control:
- Play music: "Alexa, play some music."
- Song information: "Alexa, what's playing?"
- Music controls: "Alexa, play" or "Alexa, next."
- Control music playback on another Alexa speaker: "Alexa, stop in the kitchen" or "Alexa, next in the office."
- Restart song: "Alexa, restart."
- Add a song to your Prime Music library: "Alexa, add this song."
- Like or dislike a song on Pandora and iHeartRadio: "Alexa, I like this song" or "Alexa, thumbs down."
- Start Amazon Music Unlimited trial: "Alexa, start my free trial of Amazon Music Unlimited."
- Play music on other (or multiple) Alexa devices: "Alexa, play [artist] in the living room" or "Alexa, play [artist] everywhere."
- Queue specific song or artist: "Alexa, play music by [artist]."
- Play a song based on context: "Alexa, play the latest Avett Brothers album" or "Alexa, play that song that goes 'Gotta gotta be down, because I want it all.'"
- Play music based on a theme: "Alexa, play baby-making music" or "Alexa, play rock music for working."
- Play the song of the day: "Alexa, play the song of the day."
- Play Spotify music: "Alexa, play [playlist] on Spotify."
- Play Pandora station: "Alexa, play [artist] station on Pandora."
- Play a radio station: "Alexa, play [radio station] on TuneIn."
- Play an audiobook: "Alexa, play [title] on Audible," "Alexa, read [title]" or "Alexa, play the book, [title]."
- Resume the last played audiobook: "Alexa, resume my book."
- Skip audiobook chapters: "Alexa, next chapter" or "Alexa, previous chapter."
- Listen to Alexa read you a Kindle book: "Alexa, read me my Kindle book."
- Set a sleep timer: "Alexa, set a sleep timer for 45 minutes" or "Alexa, stop playing in 45 minutes."
Amazon Music Unlimited and Echo are available in 89 countries.
It's also possible to use multiple Alexa devices as communication portals. Calling and messaging between people tends to both ring the speaker and send a notification to the Alexa app on a smartphone. It's akin to Amazon creating its own version of iMessage, but without any real advantages. Where Alexa differs is "drop-ins," the ability to let people on your contact list wake up your speaker and start conversing with you. The concept seems to come from an idealized, simpler time when you could just drop by a neighbor's house for a cup of sugar and stay for hours to talk. It feels like something Amazon thought up to goose user interaction and position voice as the platform that replaces the smartphone. It's not clear they nailed it.
How Google Home Stacks Up
Google is the last to join the Internet of Things platform race, with Apple's HomeKit and Alexa's smarthome skills preceding it. Google has wrangled a number of companies on board, and CES 2018 was where it made its commitment to the space truly visible.
Google Home does not support calendars outside of Google's own. Google has recently given it the ability to work with multiple calendars in addition to a primary one, but it is all-in on its own, homegrown Google Calendar. When you set a default calendar for Google Home, that's the calendar it adds events to, especially if you have multiple users defined within the device and apps.
If you need iCloud calendars, your best bet is to make the iCloud calendar public and subscribe to its shared calendar (ICS) URL in Google Calendar.
Multimedia control centers on Google Chromecast, launching shows or music by voice; a YouTube Music/YouTube Red subscription is required for some content.
Google's support article states, "Currently, you can only use Netflix to play shows and movies on TVs using Google Home. However, you can play YouTube videos (paid content not supported) on TVs using Google Home." That article is out of date: you can now ask Google Home to play content from new partners such as HBO Now, Hulu, CBS All Access, CW, and many more, including Google's own services such as Google Play Movies & TV and the new cloud DVR platform YouTube TV.
Playback commands include:
- "Next episode"
- "Previous episode"
- "Pause," "Resume," or "Stop"
Note: episode skipping is not supported for CW [CWTV], and Crackle supports "Next episode" but not "Previous episode." This is something Google has to fix if it intends people to use the feature; it needs to work consistently across all sources.
Google Home supports:
- Google Play Music
- YouTube Music (U.S. and Australia only)
- Pandora (U.S. only)
To set the default music service, you have to use the app to make the change. For Amazon, by comparison, you do it by speaking directly to the speaker. One problem Google faces right now is that Amazon Echo is available in 89 countries, while Google Home is available in just 7.
Google, for its part, has an advantage over Amazon in owning Google Voice, which interfaces easily with the plain old telephone system. Google Home can call phone numbers in the address book out of the box. With a little setup, it can send your correct caller ID, but by default it sends Unknown when placing a call. It does not send text messages, responding, "Sorry, I can't send texts yet."
To set up smarthome integrations with Google Home, you use the Google Assistant application. It isn't obvious, but it's four taps away once you launch the app: tap the icon in the upper right that looks like an old inbox/outbox tray, tap the ellipsis menu icon, tap Settings, then Home Control, then tap "+" to add a home control integration. Then, select from a list of supported devices and you're off and running.
How Siri / HomePod Stacks Up
Apple claims Siri has 500 million users, up from 375 million in June 2017, though the company did not specify what constitutes an active user or which devices see the most activity. Siri works in 21 languages, localized for 36 countries. This is important, especially as Apple Music is available in 113 countries, including 59 where Spotify is completely unavailable.
Apple knows how people use Siri, which is one of the things Phil Schiller told Sound and Vision recently, saying, "Voice technologies like Siri are also gaining in popularity with Siri responding to over 2 billion requests each week. This helps us understand how people actually interact with their devices, what they ask, and helps us create a product for the home that makes sense."
When it comes to accessing calendars, notes and lists, Schiller said, "In addition to Siri's deep knowledge of music, Siri understands over a dozen categories, Home as we've discussed, News, Alarms & Timers, Weather, Sports, Messages, and more. We also opened up SiriKit for HomePod, which allows you to use Siri to access your favorite messaging apps or add reminders, notes, and lists to the apps you use on your iPhone. And what's important for all of this is the reason we call these 'domains.' Siri understands these topics deeply and understands what you're looking for even though we all might ask for things in different ways. Siri understands meaning and intention, so it enables a more natural interaction."
That is, Siri on HomePod could potentially access any app that gets on board with SiriKit. Obviously, Apple's own Notes, Reminders, and Calendar support it. SiriKit understands a number of different "domains."
- VoIP Calling - Initiate calls and search the user's call history.
- Messaging - Send messages and search the user's received messages.
- Payments - Send payments between users or pay bills.
- Lists and Notes - Create and manage lists and to-do items.
- Visual Codes - Convey contact and payment information using Quick Response (QR) codes.
- Photos - Search for and display photos.
- Workouts - Start, end, and manage fitness routines.
- Ride bookings - Book rides and report their status.
- Car telematics - Manage vehicle door locks and get the vehicle's status.
- CarPlay - Interact with a vehicle's CarPlay system.
- Restaurant reservations - Create and manage restaurant reservations with help from the Maps app.
Missing is the obvious Music domain. All of these domains are available to third-party applications, so you can tell Siri to place a call to a contact by name using a particular application, for example, "Hey Siri, call Brian Humphris via WhatsApp." Siri requests permission to access WhatsApp data the first time this is used, and then places the call.
Siri lets you know when a service isn't using SiriKit by returning the message, "I wish I could, but [application name] hasn't set that up with me yet." The exception is music, where "Hey Siri, play the Ramones from iHeartRadio" returns "I can't play from iHeartRadio" and prompts to search Apple Music.
We know from Schiller's interview that Apple has been working to make sure Siri's command of the music domain for Apple Music is stronger than Siri's command of music in the past.
"That's why we've worked hard to improve Siri's understanding of music to deliver a more personalized experience. This tight integration of Siri and Apple Music allows HomePod to understand your music tastes and preferences, and lets you tune them by simply saying, 'I like this song' or 'play more like this.' Using the latest advancements in machine learning and AI (artificial intelligence), we're also able to play music based on a particular genre, mood or activity, or a combination of those, so HomePod knows what 'dinner music' sounds like -- for you -- or what you mean when you want to relax," Schiller said.
Apple's recent purchase of Shazam can only help. Shazam is the music recognition company and app that identifies music playing around you. It also may have an impact on Apple's augmented reality plans -- when it recognizes audio is playing, it can supplement that audio with visual cues in AR -- but for HomePod, it's possible that Siri will be able to recognize music not being played on HomePod and take a command to "play more of what I just heard."
Apple Music streams at 256kbps AAC. Some people contend the human ear can't discern differences above this bitrate anyway, and that the Mastered for iTunes program gets as much as you could ever need out of it, but when thinking of improving audio, you do it by fixing each weak link in the audio chain: speaker placement, speakers, amplifiers, interconnects, source audio. HomePod takes care of speaker placement, speakers, amplifiers and interconnects, but source audio has stayed the same. It's not as if Apple doesn't have the lossless source in place. They lack either the desire or the licensing to do something about it.
There are two ways to think about HomePod and the music domain. Presumably, it wouldn't hurt Apple to interoperate with other music services; Apple has a history of making products that interoperated with competitors. Mail is an obvious one, and iChat AV integrated with AOL and Google chat. It was Steve Jobs who said, "We have to let go of this idea that for Apple to win, Microsoft has to lose..."
The result is that HomePod is a product that relies on the iPhone, whereas the original iPod was reliant on the Mac only briefly, and could soon be used by any customer with any computer. That allowed it to become a halo product, one that anyone could buy and use, and that might attract consumers to buy other Apple products.
The other way to think about HomePod is to start from the music service rather than the product. The customer already owns an iPhone. That same consumer already has an Apple Music subscription, because it's easy, because it comes with a trial with the phone, and inertia is a powerful thing. Because that consumer already has Apple Music, there's really only one choice: HomePod. Apple spends more on sound quality, which makes the purchase easier to stomach.
For users that have become accustomed to telling Alexa to play radio from TuneIn or a radio station from iHeartRadio, this poses a conflict. If HomePod is the best sounding speaker, does it matter if our music (or radio) isn't in Apple Music? If you're a user focused on the best sound quality and your lossless audio is stored in iTunes, is it treated like a second class citizen because you have to push it via AirPlay to the speaker rather than pull it via a Siri command?
Calling and Messaging
Siri understands calling and messaging on the iPhone, the Apple Watch, and CarPlay, but not on Apple TV. It will support them on HomePod, albeit through a connected iPhone. It does not currently distinguish multiple users by voice as Google Home does.
Siri supports HomeKit and shared homes within HomeKit. Setup has traditionally meant opening the Home app, adding an accessory by tapping the plus symbol in the upper right, and scanning a QR code or 8-digit numerical code; the accessory then identifies itself and asks to be added to a room within your Home app's home. In the near future, these devices will also be added via NFC, so you'll just hold the device up to your phone and it will be added much more easily. HomeKit isn't controlled by Siri on the Mac, Apple TV, or CarPlay; it does work from the iPhone, iPad, and Apple Watch. Siri on HomePod also understands HomeKit instructions.
Privacy is a real concern for users with devices equipped with microphones. It's not a new thing to say, "If you're not paying for the product, you are the product."
Apple's customers buy products; Google's customers buy AdWords. This is an oversimplification given Google's adventures in phones and smart speakers, but it's true enough that the largest part of Google's income still comes from the ad business. Google does do some things on behalf of user privacy and security, including building an ad blocker into the latest version of Chrome and encouraging the implementation of HTTPS everywhere. At the same time, it's very clear that users of its products are used to build large data sets from which Google can learn.
Google warns that if you're going to have the device in a room with guests, you should inform them they may be recorded. Google early on had an issue where the Google Home Mini recorded all the time instead of only after the wake word. That was limited to preview hardware sent to reviewers, but it highlights why people might reasonably be nervous about these devices.
Amazon's terms state: "Amazon processes and retains your Alexa Interactions and related information in the cloud in order to respond to your requests (e.g., 'Send a message to Mom'), to provide additional functionality (e.g., speech to text transcription and vice versa), and to improve our services. We also store your messages in the cloud so that they're available on your Alexa App and select Alexa Enabled Products. You or other call participants may be able to ask Alexa to help with certain functions during a call, such as 'Alexa, volume up' and 'Alexa, hang up.' Certain Alexa Calling and Messaging services are provided by our third party service providers, and we may provide them with information, such as telephone numbers, to provide those services."
Amazon was asked for logs of what an Echo device heard in the room during the so-called Hot Tub Murder case, in which an Arkansas man was accused of killing his friend, a former police officer. Amazon stonewalled for a while, objecting that the request was too broad, until the accused voluntarily agreed to hand over the data, which Amazon provided the same day. Echo works by listening for the wake word ("Alexa") and then sending the recording of your voice command to Amazon's servers for processing. Recordings are saved remotely, and that storage is visible to the user: you can see and review your voice requests made to an Echo device in the Alexa app.
Siri, on the other hand, does the wake-word processing locally. This means HomePod is not recording or sending information to Apple until the "Hey Siri" wake phrase is positively recognized. Once it is, the request is sent to Apple's servers using a random ID rather than an identifier tied to a user account, like your iCloud account. If you turn Siri off, Apple deletes those requests and any user data associated with them, and Siri has to start learning again when you re-enable it.
Apple also uses "differential privacy." The idea is that differential privacy governs how much data a company can access in the first place, so that even if the company were a bad actor, the user could still trust that their information is private. Differential privacy prevents the correlation of data that could identify a user. An article on Apple's Machine Learning Journal says, "It is rooted in the idea that carefully calibrated noise can mask a user's data. When many people submit data, the noise that has been added averages out and meaningful information emerges."
Apple employs differential privacy locally, on the device, rather than applying it on a server. The benefit to the user is that the data is randomized on the device before it ever reaches a server. Google employs differential privacy in Chrome, but it isn't implemented system-wide: not in Gboard, and not in Google Home or Assistant.
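As a toy illustration of the "calibrated noise" idea, here is a randomized-response sketch in Python: each simulated user lies at random about a yes/no answer, so no single report reveals anything reliable, yet the population-level rate can still be recovered from the noisy aggregate. Randomized response is a classic local differential privacy mechanism; this illustrates the principle, not Apple's actual implementation.

```python
import random

def randomize(truth: bool, rng: random.Random) -> bool:
    """Randomized response: report the truth half the time,
    otherwise report a fair coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(reports) -> float:
    """Invert the noise: under the scheme above,
    P(report is True) = 0.5 * p + 0.25, where p is the true rate."""
    observed = sum(reports) / len(reports)
    return (observed - 0.25) / 0.5

rng = random.Random(42)
true_rate = 0.3  # the population-level fact we want to learn
reports = [randomize(rng.random() < true_rate, rng) for _ in range(100_000)]
print(round(estimate_true_rate(reports), 2))  # statistically close to 0.3
```

Any individual report is deniable, because it may just be a coin flip, yet the aggregate estimate converges on the real answer as more people submit data, which is exactly the tradeoff the Machine Learning Journal quote describes.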
The HomePod is not a multi-user device, yet it will let anyone in the room read and send messages by voice. If you're not in the room and your family or roommate is, they can prank you by hearing your messages and replying with rude responses. There are definitely practical privacy considerations the user needs to make that Apple hasn't addressed yet. The question remains: whom do you trust -- Apple, Google, or Amazon -- and to what degree?
The HomePod is not a halo product. The iPod was a halo product. The iPhone is a halo product. If the HomePod were a halo product, it would be one you could purchase stand-alone and use out of the box without anything else. And that's sort of true, but with a lot of caveats.
You need an iPhone to set it up. You need Apple Music to start music by voice command. You need an iTunes library, or another app with music in it on an iOS device, to stream to it over AirPlay, but AirPlay is a second-class citizen next to voice if you're interested in a voice-first interaction paradigm. If you use it as an AirPlay speaker for Apple TV audio, that works until the moment you send AirPlay audio from another device or play Apple Music through it, and then you'll need to re-establish the connection from the Apple TV.
There's a lot of discussion framing the HomePod as a speaker that happens to be smart (with assistants), versus Google Home and Echo products that are smart assistants that happen to be in the form of speakers. There's some truth to this, although I firmly believe the future is one where these assistants do become a primary form of computing input.
The history of computing is one where adoption has gone up as the interface has gotten easier, from the command line to the graphical interface, to touch, and now to voice. While thinking historically, we should address the iPod Hi-Fi. Apple's retail stores do the most dollar volume in sales per square foot of any retailer, and Apple has always devoted a section to speakers, selling Bose, Bowers & Wilkins, Harman Kardon/JBL, and even Bang & Olufsen, Libratone, and Devialet.
Apple launched the iPod Hi-Fi because it wanted to make a product with a design it could love, and to capture some of those sales. It didn't work. But among the speakers in that space and price segment, only Libratone and Sonos have added smart assistants, so there's still plenty of room for Apple to take a share of the market.
What the HomePod does have going for it is the fact that it's possible to play music from any iOS device with ease, that it will sound, by all accounts, amazing when it does so, and that it's going to work for many languages in many countries, where Alexa and Google Assistant aren't available or aren't available in the primary language. That alone makes it well-positioned for success.