Future HomePod may only answer Siri queries if you look at it
Apple device users may not necessarily have to state "Hey Siri" to invoke the digital assistant, with Apple looking into ways to use gaze detection to automatically trigger verbal control of a device, without needing a vocal prompt.
Owners of multiple devices in the Apple ecosystem will be familiar with one of the lesser-known problems of using Siri, namely getting it to work on one device and not another. When you are in a room that contains an iPhone, an iPad, and a HomePod mini, it can be hard to work out which device will actually respond to a query, and it may not necessarily be the desired device at that.
Furthermore, not everyone feels comfortable with the "Hey Siri" prompt being used at all. For example, errant uses of "Hey Siri" and trigger phrases of other digital assistants on television and radio can cause queries to be made that users may not want to take place.
There is also the possibility of users needing to interact with devices without using their voice at all. In situations where a command may need to be issued to hardware from a distance, it may not be possible to use the auditory control scheme, or another control mechanism.
In a patent granted by the US Patent and Trademark Office on Tuesday titled "Device control using gaze information," Apple suggests it may be possible to use a user's gaze to determine if they want assistance from Siri or another system, without requiring the initial verbal prompt.
The filing suggests a system which uses cameras and other sensors capable of determining the location of a user and the path of their gaze, to work out what they are looking at. This information could be used to automatically set the looked-at device to go into an instruction-accepting mode where it actively listens, in the expectation that instructions will be told to it.
If the digital assistant interprets what could be a command in this state, it can then carry it out as if the verbal trigger was said beforehand, saving users from a step. This would still allow for phrases like "Hey Siri" to function, especially when the user isn't looking at the device.
Using the gaze as a barometer for whether the user wants to tell the digital assistant a command is also useful in other ways. For example, gaze detected looking at the device could signal an intent by the user that the device should follow instructions.
In practical terms, this could mean the difference between the device interpreting a sentence fragment such as "play Elvis" as a command or as part of a conversation that it should otherwise ignore.
For owners of multiple devices, gaze detection could allow for an instruction to be made to one device and not others, singled out by looking at it.
The filing mentions that simply looking at the device won't necessarily register as an intention for it to listen for instruction, as a set of "activation criteria" needs to be met. This could merely consist of a continuous gaze for a period of time, like a second, to eliminate minor glances or false positives from a person turning their head.
The angle of the head position is also important. For example, if the device is located on a bedside cabinet and the user is laid in bed asleep, the device could potentially count the user facing the device as a gaze depending on how they lie, but could discount it as such for realizing the user's head is on its side instead of vertical.
In response, a device could provide a number of indicators to the user that the assistant has been activated by a glance, such as a noise or a light pattern from built-in LEDs or a display.
Given the ability to register a user's gaze, it would also be feasible for the system to detect if the user is looking at an object they want to interact with, rather than the device holding the virtual assistant. For example, a user could be looking at one of multiple lamps in the room, and the device could use the context of the user's gaze to work out which lamp the user wants turned on from a command.
Originally filed on August 28, 2019, the patent lists its inventors as Sean B. Kelly, Felipe Bacim De Araujo E Silva, and Karlin Y. Bark.
Apple files numerous patent applications on a weekly basis, but while the existence of a patent indicates areas of interest for its research and development efforts, they do not guarantee the patent will be used in a future product or service.
Being able to remotely operate a device has cropped up in earlier patent filings a few times. For example, a 2015 patent for a "Learning-based estimation of hand and finger pose" suggested the use of an optical 3D mapping system for hand gestures, one that could feasibly be similar in concept to Microsoft's Kinect hardware.
Another filing from 2018 for a "Multi media computing or entertainment system for responding to user presence and activity" proposes the introduction of three-dimensional sensor data to produce a depth map of the room, one which could be used for whole-room gesture recognition.
Gaze detection has also been explored. In March, Apple applied for a patent titled "Gaze-dependent display encryption," one which measured where the user was looking on the screen. In areas of a user's active vision, the display would act normally, but for other areas, it would put false data, preventing observers from being able to quickly peak at a user's documents.