By any account, Apple had a rough year with iOS 11 and macOS High Sierra from a software quality perspective -- but there are lessons to be learned and corrective measures to execute that can only be fully undertaken with the Apple leadership staying in place.
Editor's note: With WWDC upon us, rumors again suggest that Apple will slow the addition of new features and instead focus on refining the operating system. The lessons learned at the end of 2017 and the beginning of 2018 are worth repeating.
In the span of a month, Apple was rocked by not just one software bug, but several. One of them was one of Apple's most severe security-related flaws yet -- the ability for a user to generate a Root account with the highest level of permissions possible, bypassing most of Apple's protections and security measures.
The fallout of the Root bug resulted in a series of crucial updates delivered both through the App Store and automatically -- with their own foibles. Another issue developed on Dec. 2 with the iOS Notifications center, culminating in what appears to be the pre-emptive release of iOS 11.2 early Saturday morning.
If Apple actually planned the release for that time (which seems unlikely), the Saturday morning release was certainly unparalleled in Apple's software release history.
The editorial and social media hue and cry for the virtual heads of Apple CEO Tim Cook or Senior Vice President Craig Federighi to be served up on a platter has begun. Should that happen, it will only make the situation worse.
Modern hardware is more complex than it has ever been. So many systems interrelate, and are so closely bound, that an under-educated user, a malfunctioning system, or a poorly behaving software routine can have massive repercussions.
The nuclear-powered USS Thresher was the namesake of its class when it first put out to sea. It was the fastest and quietest submarine in the world when it was built, and the most advanced weapons system of its time.
Following initial sea trials and a nine-month shipyard availability to hammer out the bugs, the Thresher put out to sea. After a trip to test depth, the vessel was lost with all hands.
After recovery and reconstruction of the disaster, the Navy determined that the failure of a seawater piping system joint caused a cascading failure leading to the loss of the vessel. Simply, the joints in the piping were insufficient to the task, and quality assurance testing didn't spot the problem for a myriad of reasons.
The head of the U.S. Navy Nuclear Power program at the time was still the founder -- Admiral Hyman Rickover.
"I believe the loss of the Thresher should not be viewed solely as the result of failure of a specific braze, weld, system or component, but rather should be considered a consequence of the philosophy of design, construction and inspection that has been permitted in our naval shipbuilding programs," said Rickover. "I think it is important that we re-evaluate our present practices where, in the desire to make advancements, we may have forsaken the fundamentals of good engineering."
This saga of testing, and failure, may seem familiar to Apple fans.
The iPhone and a submarine?
The iPhone isn't a weapons platform, nor were any of the software bugs the cause of any loss of life. However, given modern life's reliance on the device, it can be used as a weapons platform against us.
An insecure Mac or iPhone could be used to surrender authentication methods or reset cloud access passwords. Properly attacked, in theory a bug like the no-password Root access could wipe out a user's entire stored data across iCloud or assorted Google data stores, using Apple's assorted lock and reset methods.
This isn't even including the potential damage from banking information going astray, or other financial information stolen from an attacked user.
The wake of a disaster
Admiral Rickover didn't lose his job because of the Thresher disaster, and it doesn't look like there were any mass firings at the shipyard that did the maintenance at the time. Firing Admiral Rickover then would have set back Navy nuclear power, possibly never to recover. Instead, as a direct result of the maritime disaster, the U.S. Navy implemented the SUBSAFE quality assurance program. The program was a top-to-bottom renovation of the submarine supply chain, with parts accountability all the way from the assembly or manufacture of the part to installation.
"The devil is in the details, but so is salvation." - Vice Admiral Hyman G. Rickover.
Since then, the United States hasn't lost a vessel to a material failure. In the same time frame, several other nations have, with the Russians having lost six -- but their parts and personnel vetting isn't as strict as the U.S. Navy's.
Back to the original point -- Apple needs its own SUBSAFE system to protect its operating systems and, with them, its users, and it needs to start now.
Nothing worth doing is instant
Knee-jerk responses to large problems aren't good long-term solutions. SUBSAFE's basic premise was executed immediately, but didn't really get going for a few years. It took a long time to weed out bad parts from the supply chain and make other changes to the entire pipeline.
Tim Cook is a master of the supply chain, so that's not the problem. Cook was hand-selected by Steve Jobs, and was groomed by the Apple co-founder for many years to take the position.
In fact, the calls for Cook to step down for the crisis du jour are ridiculous, and any new selectee will not do as good a job with the supply chain. Additionally, any Apple head will take time to get up to speed in other matters given that there would not be any orderly turnover, compounding the problem.
Likewise, ditching Federighi solves nothing except a possible need for a scapegoat. The sudden void at the top will cause confusion, and a lack of focus in a company that needs to get its house in order regarding software quality assurance.
The human element
Regarding procedures and the operators, that process is constantly ongoing in the submarine fleet. With any luck, Apple will be able to take the time to do the same, and re-assess the situation internally with its in-house developers, and externally with users.
I'm not asking for a nine-month intensive classroom training phase followed by closely supervised device operation before users get set loose, like the Navy demands of its engineers. But Apple's security promises can only take uneducated users so far.
There will always be users whose device is considered an appliance. There will also always be AppleInsider readers who like to know why something works the way it does, and how to use the device to the maximum extent possible.
Ideally, the two will get together. The latter will talk to the former about security best practices, like physical security in conjunction with software security being the key components to ultimate user safety.
The path forward
Apple has promised changes. The company very quickly issued a statement about the Root bug after it was made public.
"We greatly regret this error and we apologize to all Mac users, both for releasing with this vulnerability and for the concern it has caused," wrote Apple. "Our customers deserve better. We are auditing our development processes to help prevent this from happening again."
This isn't about the march of Apple's version numbers causing problems, or any other related internet-founded silliness. Sierra's initial 10.12.0 release could have easily been called El Capitan 10.11.7, and High Sierra's first version could have been called 10.12.6 -- but for marketing reasons, Apple incremented the version numbers and gave them fancy names. The same goes for iOS. The X=X+1 version number increment is more of a marketing tool than anything else.
An audit alone won't be enough to fix what ails the testing program, it appears. But, it is the first step on the road to recovery.
Extending the "life" of an operating system won't do anything, nor will lopping off the head of the company because of the misguided view that "this wouldn't have happened if Steve was alive."