Affiliate Disclosure
If you buy through our links, we may get a commission. Read our ethics policy.

Apple's custom Neural Engine in iPhone XS about 'letting nothing get in your way'

Apple's insistence on designing its own chips, like the Neural Engine in the iPhone XS, XS Max, and XR, is about unchaining the company's other designers, according to the executive who leads its chip architecture team.

"It's about owning the pieces that are critical and letting nothing get in your way," VP Tim Millet told Wired in an interview published on Tuesday. "The experiences we deliver through the phone are critically dependent on the chip."

Work on the first-generation Neural Engine, which appeared in the iPhone 8, 8 Plus, and X, reportedly began a few years ago with photography in mind. Engineers at the company thought iPhone cameras could be enhanced by machine learning, and some of the initial results included 2017's Portrait Lighting and Face ID technologies.

"We couldn't have done that [Face ID] properly without the Neural Engine," Millet said.

The second-generation Neural Engine in the 2018 iPhones can run 5 trillion operations per second, and helps deliver more photo-related features, such as the ability to adjust depth of field after a photo is taken, as well as improved augmented reality. Apple is additionally opening up the chip to outside developers.
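That developer access comes by way of Apple's Core ML framework, which can route model inference to the Neural Engine when the hardware supports it. As a rough sketch of what that looks like in practice (the "Classifier" model name here is hypothetical, standing in for any compiled Core ML model bundled with an app):

```swift
import Foundation
import CoreML

// Rough sketch: load a compiled Core ML model ("Classifier" is a
// hypothetical name) and let Core ML choose among CPU, GPU, and
// Neural Engine when running predictions.
let config = MLModelConfiguration()
config.computeUnits = .all  // allow dispatch to the Neural Engine

do {
    guard let url = Bundle.main.url(forResource: "Classifier",
                                    withExtension: "mlmodelc") else {
        fatalError("Model not found in app bundle")
    }
    let model = try MLModel(contentsOf: url, configuration: config)

    // A real call would wrap the model's inputs in an MLFeatureProvider,
    // e.g. a CVPixelBuffer for an image model:
    // let output = try model.prediction(from: inputs)
    print("Loaded:", model.modelDescription)
} catch {
    print("Failed to load model:", error)
}
```

Setting computeUnits to .all leaves the placement decision to Core ML, which falls back to the GPU or CPU on devices without a Neural Engine.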

Most non-Apple smartphones use off-the-shelf chip designs from companies like Qualcomm. While those can be powerful and are steadily advancing, Apple's in-house design work has allowed it to build tight hardware/software integration and achieve features that would otherwise have to wait.

Apple has been designing custom chips since the A4 processor used in 2010's iPhone 4, following its 2008 acquisition of P.A. Semi. Manufacturing was for some time handled by Samsung, but is now thought to be the exclusive domain of TSMC.

The use of custom designs has spread beyond central processors to parts like the T2 chip, which drives the Touch Bar and acts as the SSD controller in Macs. Some third-party chips remain, such as those for cellular and Wi-Fi.



27 Comments

radarthekat 3904 comments · 12 Years

I have a feeling the ML engine can and will be applied to good effect in creating efficiencies in the dispatching of processes, such that iPhones will be able to perform better even as they age. Wouldn't that be a home run? Like self-driving cars that all learn from the edge cases encountered by each individual car, perhaps the Neural Engine could be put to use to evolve faster means of scheduling processes and allocating resources under a myriad of load/usage scenarios, with the most efficient means being preserved into a new generation of experimentation. It could all take place as we simply use our iPhones, reporting back (with each owner's permission) successful evolutionary branches.
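Purely as a toy illustration of the evolutionary loop this comment describes (nothing here reflects real iOS scheduling internals; the Policy fields and fitness metric are invented for the example):

```swift
import Foundation

// Toy evolutionary loop: candidate "scheduling policies" are scored,
// the fittest survive, and mutated copies form the next generation.
struct Policy {
    var timeSliceMs: Double        // hypothetical scheduler knob
    var backgroundPriority: Double // hypothetical scheduler knob, 0...1
}

// Stand-in for real telemetry (battery drain, UI latency, etc.):
// here, fitness arbitrarily peaks at an 8ms slice and 0.3 priority.
func fitness(_ p: Policy) -> Double {
    -abs(p.timeSliceMs - 8.0) - abs(p.backgroundPriority - 0.3)
}

func mutate(_ p: Policy) -> Policy {
    Policy(timeSliceMs: p.timeSliceMs + Double.random(in: -1...1),
           backgroundPriority: min(max(p.backgroundPriority
                                       + Double.random(in: -0.05...0.05), 0), 1))
}

var population = (0..<20).map { _ in
    Policy(timeSliceMs: Double.random(in: 1...20),
           backgroundPriority: Double.random(in: 0...1))
}

for _ in 0..<50 {  // 50 generations
    let fittest = population.sorted { fitness($0) > fitness($1) }.prefix(5)
    population = fittest.flatMap { p in (0..<4).map { _ in mutate(p) } }
}
print("Best surviving policy:", population.max { fitness($0) < fitness($1) }!)
```

In the commenter's scenario, the hard-coded fitness function would be replaced by real efficiency telemetry gathered, with permission, from devices in the field.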

ericthehalfbee 4489 comments · 13 Years

Here are some interesting numbers for you:

The Kirin 970 claimed 1.92 trillion operations per second (TOPS).
The A11 claimed 600 billion operations per second, or 0.6 TOPS.
The Kirin 970 performs 3.2x as many TOPS as the A11.

According to Huawei's own benchmark tests using ResNet50 (image inferences per second), we have the following results:

Kirin 970 - 2,030 images per second.
A11 - 1,458 images per second.
The Kirin 970 performs 1.4x as many images as the A11.

The question I have is this: How can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)? A task that Huawei themselves picked to showcase their processor, so nobody can claim bias toward the A11. A quick check of the arithmetic appears below.

Speaking of custom design, the Pixel 2 has its own neural engine as well. But since Google doesn't design its own SoC, it had to "tack it on" to the Snapdragon, which means it won't be integrated nearly as tightly as in the A11/A12. Basically, it's limited by the bandwidth between the SoC and the external neural engine. So while it has higher peak performance (3 TOPS), it's doubtful that performance can be sustained.

I'm really looking forward to seeing the performance of the A12 Neural Engine.
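For what it's worth, the ratios in that comment do check out, and dividing measured throughput by claimed peak gives a rough utilization proxy. A minimal sketch using only the figures cited above (the images-per-second-per-GOPS metric is an illustrative derived number, not a standard benchmark):

```swift
// Sanity-checking the ratios cited above (all figures as quoted).
let kirinTOPS = 1.92, a11TOPS = 0.6                 // claimed peak TOPS
let kirinImgPerSec = 2030.0, a11ImgPerSec = 1458.0  // ResNet50 throughput

print(kirinTOPS / a11TOPS)            // 3.2   -> claimed peak advantage
print(kirinImgPerSec / a11ImgPerSec)  // ~1.39 -> measured advantage

// Derived proxy: images per second per claimed GOPS. Higher means the
// chip delivers more real work per unit of claimed peak throughput.
print(kirinImgPerSec / (kirinTOPS * 1000))  // ~1.06
print(a11ImgPerSec / (a11TOPS * 1000))      // ~2.43
```

By that rough measure, the A11 extracts more than twice as much ResNet50 work per claimed operation as the Kirin 970, which is exactly the gap the question is pointing at.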

bigmac2 639 comments · 13 Years


The question I have is this: How can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)? A task that Huawei themselves picked to showcase their processor, so nobody can claim bias toward the A11.

I don't know for sure, but Huawei has a long history of falsifying benchmarks and of throttling/boosting performance well beyond the device's normal TDP to win the numbers war.

avon b7 8046 comments · 20 Years

Here are some interesting numbers for you: [...] The question I have is this: How can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)?

There is more to it than operations per second.

You picked one 'benchmark' from Huawei but about eight were presented officially. Of those, some saw the Kirin 970 firing way past the A11 but others were closer.

There are no real benchmarks for NPUs yet.

The real point of NPUs is what you can do with them as well as how fast and at what efficiency cost.

The Kirin 970 was used to improve voice recognition, image stabilization, motion blur and recognition, system (hardware) efficiency, noise reduction, and so on. The models were mostly trained offline and made available to the NPU via upgrades.

As for the article, the 'home grown' versus 'off the shelf' argument doesn't really cut it nowadays. The only real difference is if you have a home grown chip that has no off the shelf equivalent.

The moment Apple opened up use of the NPU to outside developers, it became a de facto off-the-shelf solution too, just like Qualcomm's to a certain degree. And let's not forget Huawei, which co-designed its NPU and opened it up to developers from the get-go (via standard Android APIs and its own in-house API).

All of them are using state of the art technology to great effect.

Soli 9981 comments · 9 Years

If you can reduce security so that a Mac can boot from USB by changing a setting in macOS Recovery, doesn't that mean the security is easily bypassed? Or do you have to unlock those settings by first using your system password to unlock the drive?