Last Friday, Tom's Guide published an article calling Samsung's flagship Galaxy S6 the "world's fastest smartphone," writing that it came in "first in 6 out of 9 real-world tests and synthetic benchmarks." Unfortunately, that claim relied on cherry-picked figures and ignored the real world entirely.
Damage control for Samsung
Samsung's latest Galaxy S6 flagship phone was introduced in April, seven months after iPhone 6 launched last September. Because each generation of smartphones delivers big computational leaps over the previous year (with graphics performance increasing even faster), Samsung's latest high-end phone should be significantly faster than iPhone 6 simply because it is newer. However, it isn't.
Three months ago, when the first Galaxy S6 benchmarks appeared, AppleInsider reported that its graphics performance scores in GFX Bench were improved over the Galaxy S5, but were still well behind Apple's iPhone 6.
In fact, at its native resolution, Galaxy S6 even performed worse than 2013's iPhone 5s because it had so many extra pixels to move. Samsung's poor engineering choice of pairing an ultra-high-resolution screen with an underpowered CPU and GPU resulted in great specs on paper but poor real-world performance.
Three months of troubled Samsung sales later, Sam Rutherford and Alex H. Cranz collaborated on a report for Tom's Guide that ostensibly refuted that data by simply ignoring it. The story makes no reference to the troubling GFX Bench scores at all.
Creating "real world" tests to achieve a first place score
Instead, the site created four "real world" tests in which tasks (opening a PDF, launching the camera app, playing a video game and transcoding a video) were performed and filmed with a slow motion camera to find millisecond differences.
The PDF "load time" test involved a 1.6 GB file, but didn't clarify what app was used (simply a viewer, or a more powerful editor?). In any event, the difference between the Galaxy S6 and iPhone 6 was 20 milliseconds. You'd need a high speed camera to notice that difference.
Other Android phones performed slowly enough to readily notice the delay. Google's Nexus 6 took almost four times as long to open the file, for example, highlighting that stock Android isn't necessarily fast.
A second test, of camera load time, similarly pitted different apps against each other, but still required a slow motion camera to observe any difference between iPhone 6 and two phones from LG and Samsung; the three opened their camera apps within 10 milliseconds of one another. Again, Nexus 6 brought up the rear, more than twice as slow as iPhone 6 and the higher-end phones from LG and Samsung.
In a video gaming performance test that profiled three minutes of Asphalt 8, iPhone 6 achieved the same frame rate as Galaxy S6. Because it is slower at its native resolution, Galaxy S6 plays most games at a lower, non-native resolution to keep up. Yet even with more cores, more RAM and a faster-clocked Application Processor, and even running at that reduced resolution, Galaxy S6 couldn't claim a speed advantage.
Samsung's more expensive Galaxy S6 doesn't beat iPhone 6 in gaming
A fourth test used an app to transcode a 1080p movie to 480p, but this task wasn't even performed on an iPhone 6 because the benchmark app used was Android-only. The authors stated that "this real-world scenario paints an accurate picture of CPU performance," but the VidTrim app they used actually relies on specialized ARMv7 NEON hardware acceleration rather than reflecting general CPU performance.
Samsung's "first place," more expensive hardware doesn't deliver a real world advantage
In their "real world" tests, Tom's Guide found no discernible differences between Samsung's Galaxy S6 and iPhone 6, despite Samsung's device being priced higher and shipping seven months later with an "octa core" CPU (vs. Apple's dual core A8) and packing three times as much system RAM using faster DDR4 chips. Is Google's Android 5.0 really squandering resources so badly that an extra 2GB of faster DDR4 system memory and a faster clocked CPU with eight cores can't show any tangible performance improvement over last year's iPhone 6?
Android 5.0 Lollipop was supposed to usher in much faster app performance via ART "ahead of time" compilation (described as "Google's 2-year-long ongoing secret project, which aims to boost the performance of our Android devices") by replacing Android's Dalvik "just in time" virtual machine. Yet a year's worth of new hardware upgrades and a new version of Android have done nothing to help Samsung beat the phone Apple introduced last year.
Even worse, at its native resolution (where the device typically runs, outside of lower resolution video games), Samsung's engineering choices mean that the Galaxy S6 delivers noticeably worse performance. GFX Bench shows that Samsung's "Exynos 7" powered Galaxy S6 drops to 15 fps, just 78 percent of the frame rate of iPhone 6 Plus, in the same native resolution test.
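Those two figures pin down the comparison; a quick back-of-the-envelope check, using only the numbers cited above, shows what frame rate they imply for iPhone 6 Plus:

```python
# GFX Bench onscreen (native resolution) figures cited above.
s6_fps = 15.0    # Galaxy S6 frame rate at its native 1440p resolution
ratio = 0.78     # that rate as a fraction of iPhone 6 Plus's rate

# Implied iPhone 6 Plus onscreen frame rate.
iphone6plus_fps = s6_fps / ratio
print(f"{iphone6plus_fps:.1f} fps")  # ~19.2 fps
```

In other words, iPhone 6 Plus renders the same scene at roughly 19 fps on its own high-resolution screen, a gap large enough to notice in actual gameplay.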
Artificial tests to fraudulently push Samsung into "first place"
While ignoring GFX Bench numbers detailing raw GPU performance at both native and 1080p resolutions, Tom's Guide also included some GPU benchmarks of its own. As is typical of Android fan sites, the authors selected Futuremark's 3DMark Ice Storm Unlimited. Like GFX Bench, it includes rendering tests of the GPU.
But the real reason Android fans like 3DMark Ice Storm Unlimited is that it makes Android look like a winner with zero regard for reality. In addition to GPU tests, it incorporates physics tests that use a software library optimized for Android but neither used on nor optimized for iOS, consistently returning low iPhone scores that amount to a false fail.
This is readily apparent in the Tom's Guide article, which notes that iPhone 6 tied or beat the fastest Android devices (despite their faster clock rates, extra RAM and additional cores, above) in playback of a real video game, yet somehow comes up at the bottom of the rankings in 3DMark Ice Storm Unlimited, below even the Nexus 6 that delivered the worst real game performance and also flunked all of the other "real world" tests in the same article.
3DMark Ice Storm Unlimited contradicts real world gaming performance
Shouldn't the site's editor have noticed that?
When you see Futuremark 3DMark Ice Storm Unlimited test results appearing in a review or "shootout," you know it's a staged effort to make iPhones look bad and to make Android devices that flunk real tests (like the Nexus 6) look better than they actually are. It does a great job of presenting phony numbers that Android fans love.
Benchmarking to find a flattering score
Another artificial benchmark included in the article is Geekbench 3, a test we've frequently used too, both on mobile devices and the version that runs on desktop Macs and PCs. For mobile devices, Geekbench includes two primary scores: one for single core performance, and a multicore score that tests the performance of all available cores working together.
The single core score shows how powerful the CPU is at basic, routine tasks, because there are relatively few cases where a smartphone user (or the operating system, working with apps) can effectively get all device cores working in tandem, perfectly, the way a benchmark app does.
The two benchmark scores are somewhat similar to measuring what kind of horsepower a vehicle delivers at city and highway speeds versus on a test track, where you're measuring theoretical performance rarely encountered in actual driving conditions.
Apple optimizes iPhones to work well in single core operations and to perform even better whenever it's possible to get both cores working. Because of the energy drain, most of the time the system seeks to turn off unused cores to preserve battery life. Apple's A8 has two relatively large cores, a design its ARM CPUs have stuck close to ever since the first iPods, because it is a great balance of performance and efficiency.
Samsung uses ARM's 8-core "big.LITTLE" design. Four of those cores are muscle cores intended to share general purpose tasks, while the other four are lightweight cores designed to coast along with low power consumption. That means using all 8 cores at once makes little sense, apart from posting an artificial benchmark score for a short period of time.
However, the 8 core design also trades away the ability to use most of the chip most of the time, because it devotes its surface area to more, smaller cores. Because most tasks can only use a single core at once, Apple's A8 can devote half of its CPU area, one of its two large cores, to a single core task. Samsung's design not only leaves one set of four cores idle, but also splits the remaining in-use area four ways, delivering a much smaller engine for most of the tasks a smartphone user actually performs in the real world.
So while "8 cores" might sound impressive, it doesn't mean the chip can do four times (or even twice) the work of a dual core chip. In fact, in single core operations, Apple's A8 in iPhone 6 is actually 40 percent faster than the Exynos 7 in Samsung's Galaxy S6, according to Geekbench 3.
Tom's Guide didn't explain any of that, and didn't even cite the single core score. Instead, it focused on a single number: the multicore score. According to Geekbench 3 records, Galaxy S6 is 36 percent faster than iPhone 6 in multicore scores. That means iPhone 6 is 40 percent faster at doing everyday things, while Samsung's Galaxy S6 is 36 percent faster at running this type of full-tilt benchmark.
While using both cores allows the A8 chip to ramp up its Geekbench score performance by 79.4 percent (not quite doubling its score by keeping both cores busy), turning on all the cores on the Exynos 7 results in a 240 percent increase. That's not even close to a fourfold increase, let alone eight, but in both cases, it's clear that Apple is optimizing its performance so that iPhone users get the fastest response most of the time in real conditions running real apps, while Samsung is optimizing for benchmark scores under conditions that will rarely be encountered in real world use.
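Those percentages can be reproduced from representative Geekbench 3 scores of the period. The figures below are illustrative round numbers chosen to match the ratios cited above, not exact published results:

```python
# Approximate Geekbench 3 scores (illustrative values consistent with the
# percentages cited in the text, not exact published figures).
iphone6_single, iphone6_multi = 1610, 2889      # Apple A8, 2 cores
galaxy_s6_single, galaxy_s6_multi = 1150, 3910  # Exynos 7420, 8 cores

# Single-core: how much faster is the A8 at everyday, one-core tasks?
single_core_lead = iphone6_single / galaxy_s6_single - 1
print(f"A8 single-core lead: {single_core_lead:.0%}")    # ~40%

# Multicore scaling: the score gained by lighting up every core at once.
a8_scaling = iphone6_multi / iphone6_single - 1
exynos_scaling = galaxy_s6_multi / galaxy_s6_single - 1
print(f"A8 multicore gain: {a8_scaling:.1%}")            # ~79.4% from 2 cores
print(f"Exynos 7 multicore gain: {exynos_scaling:.0%}")  # ~240% from 8 cores
```

The scaling figures make the tradeoff plain: the chip with the stronger individual core wins everyday tasks, while the chip with more cores only pulls ahead when a benchmark saturates all of them.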
Samsung has a long history of working to cheat on benchmarks in a variety of ways to impress people who only look at certain numbers without really considering what those figures actually mean. But with the development of its Exynos "octa core" chips, Samsung has created an entire product optimized for scores, rather than for actual use.
A Geekbench score radically higher than Geekbench's own records
Tom's Guide not only selectively picked one number to proclaim Samsung had "crushed the competition" in benchmarks (after failing to even best last year's iPhone in a series of "real world" tests), but also obtained a score for the Galaxy S6 that is wildly higher than those recorded by Geekbench from its other users.
The site reported a multicore score of 5,283, but the Geekbench Browser says that phone (Samsung Exynos 7420 Galaxy S6, 1500 MHz, 8 cores) actually averages 3,925. That's a tremendous difference for the same benchmark being run on the same phone, and it's completely out of line with the much smaller discrepancies in the scores the site reported for other vendors' phones.
Tom's Guide's Geekbench score for Galaxy S6 is not what real users are seeing
The Tom's Guide Geekbench score for iPhone 6, for example, is within 38 points of the average score reported by the Geekbench Browser. The score Tom's Guide published for Galaxy S6 is 1,358 points higher than average, or over a third higher than what actual users were reporting. That should raise a flag, because there typically isn't anywhere near that much difference between scores run on the same hardware.
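The size of that flag is easy to quantify from the two Galaxy S6 scores quoted above:

```python
# Multicore scores quoted above: Tom's Guide's published figure vs. the
# average recorded by the Geekbench Browser for the same hardware.
s6_published = 5283
s6_browser_avg = 3925

gap = s6_published - s6_browser_avg
print(f"{gap} points, {gap / s6_browser_avg:.0%} above the user average")
# → 1358 points, 35% above the user average
```

A single run landing a third above the population average for identical hardware is exactly the kind of outlier that warrants a second look before it anchors a "fastest smartphone" headline.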
Engineering vs benchmark optimization
However, even on the same hardware a benchmark can show a big difference if the device is optimized in some way. For example, Samsung has squeezed extra "performance points" out of the same device in the past by turning off battery and thermal safeguards whenever the device ran a benchmark app. If people actually used their devices in that "benchmark mode" all the time, their phones would overheat and die rapidly.
Conversely, Apple's new power saving mode in iOS 9 offers to extend battery life by scaling down how fast the processor runs. Other factors can also affect benchmark scores: hardware accelerated codecs; full disk encryption (which, under Android, slashes memory performance by 50 to 80 percent if left turned on full time as it is on iOS); and Apple's Metal (which increases effective CPU performance by tasking the GPU more efficiently, with less overhead).
While celebrating the unusual, artificial score of its Galaxy S6, Tom's Guide didn't publish any data on how long the device could sustain that benchmark before killing its battery. The site also included separate Basemark OS II scores for system and memory, both of which delivered wildly different results that contradicted the actual performance the writers observed in their real world tests. A fourth test benchmarked WiFi throughput, without saying much about its methodology.
In a bizarre summary, the site lauded the "DDR4 memory and speedy UFS storage" used in the Samsung Galaxy S6, even though its Basemark OS II memory-specific test showed that LG's G4 beat it, despite using slower DDR3 memory and not using the "speedy" new UFS. Do the tests mean anything, or are they just for creating nice charts used to adorn a preconceived notion that the technologies Samsung is advertising are actually contributing anything in the real world?
Additionally, Tom's Guide didn't note that its benchmarks were comparing apples and oranges. iPhone 6 provides full disk encryption by default, while Samsung's Galaxy S6 (and other Android phones) ships with encryption turned off, because FDE under Android significantly taxes memory performance. Running benchmarks with a core security feature turned off delivers a flattering number, but it doesn't actually make Galaxy S6 "faster" in any meaningful sense. It's just less secure.
Rather than proving that Samsung's Galaxy S6 was the "fastest smartphone," the report really only demonstrated that fancy specs and new tech, arriving half an annual cycle ahead of Apple's last phone, couldn't deliver real performance gains, even if the authors could tease out benchmark numbers that contradicted reality.
Further, the site not only dismissed real benchmarks in favor of phony ones, but also completely ignored the fact that iPhone 6 is optimized with a series of benefits not observable in the simplistic "open a file" tests orchestrated by a supposedly technically savvy web site. Those include the fact that Apple's app platform has now fully migrated to 64-bit (a requirement for App Store approval), and that graphics apps and games are incorporating support for Metal.
Metal's advantage is hard to compare against Android in part because most GPU tests don't measure both CPU and GPU activity at once (GFX Bench, for example, focuses on GPU performance by rendering a staged 3D scene, with no CPU-bound user interaction), and in part because fewer real world apps and games are even available for Android. That includes the movie and photo editing tools Apple used to demonstrate Metal.
The fact that Apple's iOS platform has long attracted a wide variety of exclusive apps undermines the value of "faster hardware" on other platforms anyway. Back in 2012, 4G LTE Android phones were clearly faster than 3G-only iPhones, but Apple's stronger ecosystem still attracted the most premium buyers. Today there is no technical deficit among iPhones, and Apple is now leading CPU and GPU design.
Tom's Guide wasn't embarrassed to crown Samsung's flagship as "fastest" despite its carrying a list price $100 higher than the LG G4 that tied or beat it in most tests. Instead, the site said Samsung "blew away the field," when its own data shows that really wasn't the case.