Even before Apple Vision Pro was announced, researchers had been studying the effectiveness of what turned out to be Apple's choice of controls for the headset.
As first revealed at WWDC, Vision Pro automatically selects whatever a user is looking at in its display. Its outward-facing cameras then detect when that user makes a pinching gesture, effectively tapping or clicking on the object.
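For developers, this surfaces as ordinary input: visionOS resolves the user's gaze privately and delivers the pinch as a standard tap, so apps never receive raw eye-tracking data. A minimal, illustrative SwiftUI sketch (the view and its labels are hypothetical, not Apple sample code):

```swift
import SwiftUI

// On visionOS, a view with a hover effect highlights while the user's
// gaze rests on it, and an indirect pinch arrives as a normal tap.
struct SelectableCard: View {
    @State private var isSelected = false

    var body: some View {
        Text(isSelected ? "Selected" : "Look here, then pinch")
            .padding()
            .background(isSelected ? Color.green.opacity(0.4) : Color.gray.opacity(0.4))
            .clipShape(RoundedRectangle(cornerRadius: 12))
            .hoverEffect()        // gaze-driven highlight, rendered by the system
            .onTapGesture {       // fired by the pinch gesture
                isSelected.toggle()
            }
    }
}
```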
Saying that "we had no clue" Apple would use this technique, called Gaze and Pinch, researchers conducted experiments on the idea in 2022. Specifically, they compared it to a system where a user first points to an object, using what they called a handray, before pinching to activate it.
In a Twitter thread about the findings, Professor Ken Pfeuffer of Aarhus University summarized the results.
What's science saying on the UI of the @Apple's new HMD? In 2021-22 we studied the techniques of Gaze+Pinch (#VisionPro) vs Handray (#HoloLens #MetaQuest). Result: Gaze+Pinch is faster & less physical effort if targets fit the eyetracker accuracy. More details & papers below. pic.twitter.com/UdAdiZg5jy
— Ken Pfeuffer (@KenPfeuffer) June 28, 2023
According to Professor Pfeuffer, "gaze+pinch is faster and less physical effort if targets fit the eye tracker accuracy." In other words, if a headset tracks where a user is looking precisely enough, gaze and pinch is the better way to control it.
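As a rough back-of-the-envelope sketch of what "fitting the eye tracker accuracy" means: a tracker with, say, one degree of angular error can only reliably resolve targets at least as large as that angle subtends at the target's distance. The figures here are illustrative assumptions, not Vision Pro specifications:

```swift
import Foundation

// Smallest target a gaze tracker can reliably hit, from simple
// trigonometry: size = 2 * d * tan(error / 2).
func minimumTargetSize(atDistanceMeters d: Double,
                       trackerErrorDegrees error: Double = 1.0) -> Double {
    let radians = error * .pi / 180
    return 2 * d * tan(radians / 2)   // meters
}

// At 1 m, a 1-degree tracker needs targets of roughly 1.7 cm or more.
print(minimumTargetSize(atDistanceMeters: 1.0))   // ≈ 0.0175
```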
Perhaps we won't really know until the Vision Pro is on sale, but initial reviewers claim the gaze tracking is good. And Apple has a whole series of patents concerning it, including a 2019 one.
Gaze and Pinch is not better across the board, though, as there is at least one measure on which the Handray option beats it. Specifically, Professor Pfeuffer's summary of his and his colleagues' work says:
- Speed: Gaze+Pinch (2.5s) > Handray (4.6s)
- Error: Handray (1%) > Gaze+Pinch (3%)
- Preference: Gaze+Pinch (6/16) > Handray (0/16)
So in a test of the time taken to choose an object and select it, the gaze and pinch approach that Apple is using is significantly faster. But on accuracy, on selecting exactly what is intended, the simpler Handray option wins.
Even so, in what is presumably a survey of the test participants, no one preferred the Handray approach, while six of the 16 actively preferred gaze and pinch.
The work directly referenced in the Twitter thread is a paper called "A Fitts' Law Study of Gaze-Hand Alignment for Selection in 3D User Interfaces." An abstract can be read here.
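For reference, Fitts' law predicts how long it takes to acquire a target from its distance and its size, which is why it is a standard yardstick for comparing selection techniques. A sketch of the common Shannon formulation, with purely illustrative constants rather than values from the paper:

```swift
import Foundation

// Fitts' law (Shannon formulation): MT = a + b * log2(D / W + 1),
// where D is the distance to the target and W is its width.
func predictedMovementTime(distance: Double, width: Double,
                           a: Double = 0.2, b: Double = 0.3) -> Double {
    let indexOfDifficulty = log2(distance / width + 1)   // in bits
    return a + b * indexOfDifficulty                     // in seconds
}

// Small, distant targets take longer than large, near ones.
print(predictedMovementTime(distance: 0.8, width: 0.05))   // ≈ 1.43
print(predictedMovementTime(distance: 0.2, width: 0.10))   // ≈ 0.68
```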
Elsewhere in the discussion, Professor Pfeuffer has also linked to a paper called "PalmGazer: Unimanual Eye-hand Menus in Augmented Reality." That full paper is available to read online.
Apple Vision Pro's shipping date is still some months away, and the company is continuously developing it. But it's unlikely to change something as fundamental as how users control the device, especially as it's reasonable to assume Apple has conducted similar research in the years it has been working on the headset.
3 Comments
I would assume the issue of eye tracking being more error prone has more to do with the implementation of the eye tracking system and just how accurate it is. From what I've heard, Apple's implementation is extremely accurate. So maybe Apple has been able to fine tune it beyond what the researchers were using?
Apple's supposedly using a lot of predictive AI and pre-selection feedback, plus dwell time, and is also encouraging developers to design hit targets sized appropriately for selection.
As such, I'd expect accuracy to be even higher than that reported here.
Public Service Announcement: a study conducted with a mere 16 participants is unlikely to generate information that can be broadly applied.
Not to mention that it's no surprise that the method requiring less physical and mental effort was the preferred choice. As Steve Jobs once said, convenience always trumps quality.