A8’s CPU: What Comes After Cyclone?

Despite the importance of the CPU in Apple’s SoC designs, it remains surprising how little we know about their architectures, even years after the fact. The CPU was important enough that Apple saw the need to create its own custom design, and then delivered two architectures in the span of just two years, yet the company is not fond of talking about what it has done with those architectures. This, unfortunately, is especially the case at the beginning of an SoC’s lifecycle, and for A8 it isn’t going to be any different.

Overall, from what we can tell the CPU in the A8 is not a significant departure from the CPU in A7, but that is not a bad thing. With Cyclone Apple hit on a very solid design: use a wide, high-IPC design with low latency in order to reach high performance levels at low clock speeds. By keeping the CPU wide and the clock speed low, Apple was able to hit their performance goals without having to push the envelope on power consumption, as lower clock speeds help keep CPU power use in check. It’s all very Intel Core-like, all things considered. Furthermore, given that Cyclone was a forward-looking design with ARMv8 AArch64 capabilities and already strong performance, Apple does not face the same pressure to overhaul their CPU architecture that other current ARMv7 CPU designers do.


Close Up: "Enhanced Cyclone"

As a result, from the information we have been able to dig up and the tests we have performed, the A8 CPU is not radically different from Cyclone. To be sure there are some differences that make it clear that this is not just a Cyclone running at slightly higher clock speeds, but we have not seen the same kind of immense overhaul that defined Swift and Cyclone.

Unfortunately Apple has tightened up on information leaks and unintentional publications more than ever with A8, so the amount of information coming out of Apple about this new core is very limited. In fact this time around we don’t even know the name of the CPU. For the time being we are calling it "Enhanced Cyclone" (it’s descriptive of the architecture), but we’re fairly certain that it does have a formal name within Apple to set it apart from Cyclone, a name we hope to discover sooner rather than later.

In any case, one of the things we do know about Enhanced Cyclone is that, unlike the GPU Apple chose for A8, the CPU has seen a significant reduction in die size going from the 28nm A7 to the 20nm A8. Chipworks’ estimates put the die size of Cyclone at 17.1mm² versus 12.2mm² for Enhanced Cyclone. On a relative basis this means that Enhanced Cyclone is 71% the size of Cyclone, which even after accounting for less-than-perfect area scaling still means that Enhanced Cyclone is a relatively bigger CPU composed of more transistors than Cyclone was. It is not dramatically bigger, but it’s bigger to such a degree that it’s clear that Apple has made further improvements over Cyclone.
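To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. It takes the two Chipworks area estimates quoted above and compares the observed shrink against an idealized shrink in which area scales with the square of the node name. That idealized model is an assumption for illustration only; real 28nm to 20nm scaling is less than perfect, so the implied growth it produces should be read as an upper bound.

```python
# Die-area estimates quoted in the text (Chipworks), in mm^2.
CYCLONE_AREA_MM2 = 17.1
ENHANCED_CYCLONE_AREA_MM2 = 12.2

# Nominal process nodes. Treating the node name as a linear feature size is
# an idealization; actual area scaling between nodes is worse than this.
OLD_NODE_NM = 28.0
NEW_NODE_NM = 20.0


def relative_area(new_area: float, old_area: float) -> float:
    """Observed size of the new core as a fraction of the old one."""
    return new_area / old_area


def ideal_area_scale(new_node: float, old_node: float) -> float:
    """Area fraction an unchanged design would occupy under ideal scaling."""
    return (new_node / old_node) ** 2


observed = relative_area(ENHANCED_CYCLONE_AREA_MM2, CYCLONE_AREA_MM2)
ideal = ideal_area_scale(NEW_NODE_NM, OLD_NODE_NM)

# If scaling were ideal, this ratio would be the growth in "design size".
# Because real scaling is imperfect, the true growth is smaller than this.
implied_growth_upper_bound = observed / ideal

print(f"Observed area ratio:        {observed:.0%}")
print(f"Ideal-shrink area ratio:    {ideal:.0%}")
print(f"Implied growth (upper bnd): {implied_growth_upper_bound:.2f}x")
```

The observed ratio lands well above the ideal-shrink ratio, which is the basis for concluding that the core grew rather than simply shrinking along with the process.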

The question of the moment is what Apple has put their additional transistors and die space to work on. Some of that has no doubt gone to the memory interface; as we saw earlier, L3 cache access times are nearly 20ns faster in our benchmarks. But if we dig deeper, things start to become very interesting.

Apple Custom CPU Core Comparison
| | Apple A7 | Apple A8 |
|---|---|---|
| CPU Codename | Cyclone | "Enhanced Cyclone" |
| ARM ISA | ARMv8-A (32/64-bit) | ARMv8-A (32/64-bit) |
| Issue Width | 6 micro-ops | 6 micro-ops |
| Reorder Buffer Size | 192 micro-ops | 192 micro-ops? |
| Branch Mispredict Penalty | 16 cycles (14 - 19) | 16 cycles (14 - 19)? |
| Integer ALUs | 4 | 4 |
| Load/Store Units | 2 | 2 |
| Addition (FP) Latency | 5 cycles | 4 cycles |
| Multiplication (INT) Latency | 4 cycles | 3 cycles |
| Branch Units | 2 | 2 |
| Indirect Branch Units | 1 | 1 |
| FP/NEON ALUs | 3 | 3 |
| L1 Cache | 64KB I$ + 64KB D$ | 64KB I$ + 64KB D$ |
| L2 Cache | 1MB | 1MB |
| L3 Cache | 4MB | 4MB |

First and foremost, in much of our testing Enhanced Cyclone performs very similarly to Cyclone. Accounting for the fact that A8 is clocked at 1.4GHz versus 1.3GHz for A7, in many low-level benchmarks the two perform as if they are the same processor. Based on this data it looks like the fundamentals of Cyclone have not been changed for Enhanced Cyclone. Enhanced Cyclone is still a very wide six micro-op architecture, and branch misprediction penalties are similar, so it’s likely we’re looking at the same pipeline length.

However, from our low-level tests two specific features stand out: integer multiplication and floating point addition. When it comes to integer multiplication, Cyclone had a single multiplication unit and a multiply took four cycles to execute. On Enhanced Cyclone those operations are now measuring in at three cycles. But more surprising is the total integer multiplication throughput rate; integer multiplication performance has now more than doubled. While this doesn’t give us enough data to completely draw out Enhanced Cyclone’s integer pathways, all of the data points to Enhanced Cyclone doubling up on its integer multiplication units, meaning Apple’s latest architecture now has two such units.

Meanwhile floating point addition shows similar benefits, though not as great as integer multiplication. Throughput is such that there appears to still be three FP ALUs, but like integer multiplication the instruction latency has been reduced. Apple has managed to shave off a cycle on FP addition, so it now completes in four cycles instead of five. Both of these improvements indicate that Enhanced Cyclone is not identical to Cyclone – the additional INT MUL unit in particular – making them very similar but still subtly different CPU architectures.
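The difference between a latency improvement and a throughput improvement can be illustrated with a deliberately simplified cycle-count model. The sketch below is not a description of Apple's actual pipeline: it assumes each multiply unit is fully pipelined (one new operation accepted per cycle), ignores every other pipeline resource, and uses the latencies and unit counts inferred in the text above. The names `MulConfig`, `dependent_chain_cycles` and `independent_stream_cycles` are hypothetical helpers for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MulConfig:
    name: str
    latency_cycles: int  # cycles from issue until the result is available
    units: int           # multiply units, each assumed fully pipelined


def dependent_chain_cycles(cfg: MulConfig, n_ops: int) -> int:
    """Each multiply consumes the previous result, so latency dominates
    and extra units cannot help."""
    return n_ops * cfg.latency_cycles


def independent_stream_cycles(cfg: MulConfig, n_ops: int) -> int:
    """Independent multiplies: every unit accepts one op per cycle, and the
    last op still needs its full latency to drain."""
    issue_cycles = math.ceil(n_ops / cfg.units)
    return issue_cycles + cfg.latency_cycles - 1


cyclone = MulConfig("Cyclone", latency_cycles=4, units=1)
enhanced = MulConfig("Enhanced Cyclone", latency_cycles=3, units=2)

N = 1000
for label, fn in (("dependent chain", dependent_chain_cycles),
                  ("independent stream", independent_stream_cycles)):
    old, new = fn(cyclone, N), fn(enhanced, N)
    print(f"{label:<18} {old:>5} -> {new:>5} cycles ({old / new:.2f}x)")
```

In this model a dependent chain speeds up only by the latency ratio, while a stream of independent multiplies approaches a 2x speedup from the second unit. That is why a throughput measurement, rather than a latency measurement, is what points to a second multiplier.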


Apple iPhone Performance Estimates: Over The Years

Outside of these low-level operations, most other aspects of Enhanced Cyclone seem unchanged. L1 cache remains at 64KB I$ + 64KB D$ per CPU core, where it was most recently doubled for Cyclone. For L2 cache Chipworks believes that there may be separate L2 caches for each CPU core, and while L2 cache bandwidth is looking a little better on Enhanced Cyclone than on Cyclone, it’s not a “smoking gun” that would prove the presence of separate L2 caches. And of course, the L3 cache stands at 4MB, with the aforementioned improvements in latency that we’ve seen.

To borrow an Intel analogy once more, the layout and performance of Enhanced Cyclone relative to Cyclone is quite similar to Intel’s more recent ticks, where smaller feature improvements take place alongside a die shrink. In this case Apple has their die shrink to 20nm; meanwhile they have made some small tweaks to the architecture to improve performance across several scenarios. At the same time Apple has made a moderate bump in clock speed from 1.3GHz to 1.4GHz, but it’s nothing extreme. Ultimately, while two CPU architectures do not constitute a pattern, if Apple were to implement tick-tock then this is roughly what it would look like.

Moving on, after completing our low-level tests we also wanted to spend some time comparing Enhanced Cyclone with its predecessor on some high level tests. The low-level tests can tell us if individual operations have been improved while high level tests can tell us something about what the performance impact will be in realistic workloads.

For our first high level benchmark we turn to SPECint2000. Developed by the Standard Performance Evaluation Corporation, SPECint2000 is the integer component of their larger SPEC CPU2000 benchmark. Designed around the turn of the century, officially SPEC CPU2000 has been retired for PC processors, but with mobile processors roughly a decade behind their PC counterparts in performance, SPEC CPU2000 is currently a very good fit for the capabilities of Cyclone and Enhanced Cyclone.

SPECint2000 is composed of 12 benchmarks which are then used to compute a final peak score, though in our case we’re more interested in the individual results.

SPECint2000 - Estimated Scores
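| Benchmark | A8 | A7 | % Advantage |
|---|---|---|---|
| 164.gzip | 842 | 757 | 11% |
| 175.vpr | 1228 | 1046 | 17% |
| 176.gcc | 1810 | 1466 | 23% |
| 181.mcf | 1420 | 915 | 55% |
| 186.crafty | 2021 | 1687 | 19% |
| 197.parser | 1129 | 947 | 19% |
| 252.eon | 1933 | 1641 | 17% |
| 253.perlbmk | 1666 | 1349 | 23% |
| 254.gap | 1821 | 1459 | 24% |
| 255.vortex | 1716 | 1431 | 19% |
| 256.bzip2 | 1234 | 1034 | 19% |
| 300.twolf | 1633 | 1473 | 10% |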

Keeping in mind that A8 is clocked 100MHz (~7.7%) higher than A7, all of the SPECint2000 benchmarks show performance gains above and beyond the clock speed increase, indicating that every benchmark has benefited in some way. Of these benchmarks MCF, GCC, PerlBmk and GAP in particular show the greatest gains, at anywhere between 20% and 55%. Roughly speaking anything that is potentially branch-heavy sees some of the smallest gains while anything that plays into the multiplication changes benefits more.
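The clock-speed adjustment described above is simple enough to sketch directly. The snippet below takes the estimated SPECint2000 scores from the table, divides out the 1.4GHz versus 1.3GHz clock difference, and reports what is left over per benchmark. It also computes a geometric mean across the 12 results, which is how SPEC combines per-benchmark ratios into a composite. Since the inputs here are estimates and no composite is published in this article, that last figure is illustrative only.

```python
import math

A8_CLOCK_GHZ = 1.4
A7_CLOCK_GHZ = 1.3

# Estimated SPECint2000 scores from the table: benchmark -> (A8, A7).
SCORES = {
    "164.gzip": (842, 757),
    "175.vpr": (1228, 1046),
    "176.gcc": (1810, 1466),
    "181.mcf": (1420, 915),
    "186.crafty": (2021, 1687),
    "197.parser": (1129, 947),
    "252.eon": (1933, 1641),
    "253.perlbmk": (1666, 1349),
    "254.gap": (1821, 1459),
    "255.vortex": (1716, 1431),
    "256.bzip2": (1234, 1034),
    "300.twolf": (1633, 1473),
}


def geometric_mean(values):
    """Geometric mean, computed in log space for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


clock_ratio = A8_CLOCK_GHZ / A7_CLOCK_GHZ  # roughly a 7.7% clock advantage

# Gain remaining once the clock speed difference is divided out.
per_clock_gain = {
    name: (a8 / a7) / clock_ratio - 1.0
    for name, (a8, a7) in SCORES.items()
}

composite_a8 = geometric_mean(a8 for a8, _ in SCORES.values())
composite_a7 = geometric_mean(a7 for _, a7 in SCORES.values())

for name, gain in sorted(per_clock_gain.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {gain:+.1%} per clock")
print(f"Geometric-mean ratio (A8/A7): {composite_a8 / composite_a7:.3f}")
```

Every per-clock figure comes out positive, which matches the observation that all twelve benchmarks gain more than the clock bump alone would explain.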

MCF, a combinatorial optimization benchmark, ends up being the outlier here by far. Given that these are all integer benchmarks, it may very well be that MCF benefits from the integer multiplication improvements the most, as its performance comes very close to tracking the 2X increase in multiplication throughput. This also bodes well for any other kind of work that is similarly bounded by integer multiplication performance, though such workloads are not particularly common in the real world of smartphone use.

Our other set of comparison benchmarks comes from Geekbench 3. Unlike SPECint2000, Geekbench 3 is a mix of integer and floating point workloads, so it will give us a second set of eyes on the integer results along with a take on floating point improvements.

Geekbench 3 - Integer Performance
| Benchmark | A8 | A7 | % Advantage |
|---|---|---|---|
| AES ST | 992.2 MB/s | 846.8 MB/s | 17% |
| AES MT | 1.93 GB/s | 1.64 GB/s | 17% |
| Twofish ST | 58.8 MB/s | 55.6 MB/s | 5% |
| Twofish MT | 116.8 MB/s | 110.0 MB/s | 6% |
| SHA1 ST | 495.1 MB/s | 474.8 MB/s | 4% |
| SHA1 MT | 975.8 MB/s | 937 MB/s | 4% |
| SHA2 ST | 109.9 MB/s | 102.2 MB/s | 7% |
| SHA2 MT | 219.4 MB/s | 204.4 MB/s | 7% |
| BZip2Comp ST | 5.24 MB/s | 4.53 MB/s | 15% |
| BZip2Comp MT | 10.3 MB/s | 8.82 MB/s | 16% |
| Bzip2Decomp ST | 8.4 MB/s | 7.6 MB/s | 10% |
| Bzip2Decomp MT | 16.5 MB/s | 15 MB/s | 10% |
| JPG Comp ST | 19 MP/s | 16.8 MP/s | 13% |
| JPG Comp MT | 37.6 MP/s | 33.3 MP/s | 12% |
| JPG Decomp ST | 45.9 MP/s | 39 MP/s | 17% |
| JPG Decomp MT | 89.3 MP/s | 77.1 MP/s | 15% |
| PNG Comp ST | 1.26 MP/s | 1.14 MP/s | 10% |
| PNG Comp MT | 2.51 MP/s | 2.26 MP/s | 11% |
| PNG Decomp ST | 17.4 MP/s | 15.1 MP/s | 15% |
| PNG Decomp MT | 34.3 MP/s | 29.6 MP/s | 15% |
| Sobel ST | 71.7 MP/s | 58.1 MP/s | 23% |
| Sobel MT | 137.1 MP/s | 112.4 MP/s | 21% |
| Lua ST | 1.64 MB/s | 1.34 MB/s | 22% |
| Lua MT | 3.22 MB/s | 2.64 MB/s | 21% |
| Dijkstra ST | 5.57 Mpairs/s | 4.04 Mpairs/s | 37% |
| Dijkstra MT | 9.43 Mpairs/s | 7.26 Mpairs/s | 29% |

Geekbench’s integer results are overall a bit more muted than SPECint2000’s, but there are still some definite high points and low points among these benchmarks. Crypto performance is among the lesser gains, while Sobel and Dijkstra are among the largest at 21% and 37% respectively. Interestingly in the case of Dijkstra, this does make up for the earlier performance loss Cyclone saw on this benchmark in the move to 64-bit.

Geekbench 3 - Floating Point Performance
| Benchmark | A8 | A7 | % Advantage |
|---|---|---|---|
| BlackScholes ST | 7.85 Mnodes/s | 5.89 Mnodes/s | 33% |
| BlackScholes MT | 15.5 Mnodes/s | 11.8 Mnodes/s | 31% |
| Mandelbrot ST | 1.18 GFLOPS | 929.4 MFLOPS | 26% |
| Mandelbrot MT | 2.34 GFLOPS | 1.85 GFLOPS | 26% |
| Sharpen Filter ST | 981.7 MFLOPS | 854 MFLOPS | 14% |
| Sharpen Filter MT | 1.94 GFLOPS | 1.7 GFLOPS | 14% |
| Blur Filter ST | 1.41 GFLOPS | 1.26 GFLOPS | 11% |
| Blur Filter MT | 2.78 GFLOPS | 2.49 GFLOPS | 11% |
| SGEMM ST | 3.83 GFLOPS | 3.44 GFLOPS | 11% |
| SGEMM MT | 7.48 GFLOPS | 6.4 GFLOPS | 16% |
| DGEMM ST | 1.87 GFLOPS | 1.68 GFLOPS | 11% |
| DGEMM MT | 3.61 GFLOPS | 3.14 GFLOPS | 14% |
| SFFT ST | 1.77 GFLOPS | 1.59 GFLOPS | 11% |
| SFFT MT | 3.47 GFLOPS | 3.18 GFLOPS | 9% |
| DFFT ST | 1.68 GFLOPS | 1.47 GFLOPS | 14% |
| DFFT MT | 3.29 GFLOPS | 2.93 GFLOPS | 12% |
| N-Body ST | 735.8 Kpairs/s | 587.8 Kpairs/s | 25% |
| N-Body MT | 1.46 Mpairs/s | 1.17 Mpairs/s | 24% |
| Ray Trace ST | 2.76 MP/s | 2.23 MP/s | 23% |
| Ray Trace MT | 5.45 MP/s | 4.49 MP/s | 21% |

While the low-level floating point tests we ran earlier didn’t show as significant a change in the architecture’s floating point performance as they did for integer, our high level benchmarks show that floating point tests are actually faring rather well. This goes to show that not everything can be captured in low level testing, especially less tangible aspects such as instruction windows. More importantly though, this shows that Enhanced Cyclone’s performance gains aren’t just limited to integer workloads but cover floating point as well.
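As a quick sanity check on that claim, the single-threaded Geekbench 3 floating point rows can be run through the same clock adjustment used for SPECint2000. The sketch below is a hypothetical helper, not part of Geekbench: it parses the rate strings as they appear in the table, normalizes the K/M/G prefixes (needed because some rows mix GFLOPS and MFLOPS), and reports the gain left after dividing out the clock difference. Because it works from the rounded table values, its percentages can differ by a point from the published column.

```python
A8_CLOCK_GHZ = 1.4
A7_CLOCK_GHZ = 1.3

PREFIX_SCALE = {"K": 1e3, "M": 1e6, "G": 1e9}

# Single-threaded FP results from the table: name -> (A8, A7).
ST_RESULTS = {
    "BlackScholes": ("7.85 Mnodes/s", "5.89 Mnodes/s"),
    "Mandelbrot": ("1.18 GFLOPS", "929.4 MFLOPS"),
    "Sharpen Filter": ("981.7 MFLOPS", "854 MFLOPS"),
    "Blur Filter": ("1.41 GFLOPS", "1.26 GFLOPS"),
    "SGEMM": ("3.83 GFLOPS", "3.44 GFLOPS"),
    "DGEMM": ("1.87 GFLOPS", "1.68 GFLOPS"),
    "SFFT": ("1.77 GFLOPS", "1.59 GFLOPS"),
    "DFFT": ("1.68 GFLOPS", "1.47 GFLOPS"),
    "N-Body": ("735.8 Kpairs/s", "587.8 Kpairs/s"),
    "Ray Trace": ("2.76 MP/s", "2.23 MP/s"),
}


def parse_rate(text: str) -> tuple[float, str]:
    """Split a string like '1.18 GFLOPS' into (1.18e9, 'FLOPS')."""
    value, unit = text.split()
    return float(value) * PREFIX_SCALE[unit[0]], unit[1:]


def speedup(a8_text: str, a7_text: str) -> float:
    """A8/A7 ratio after normalizing both rates to the same base unit."""
    a8, a8_unit = parse_rate(a8_text)
    a7, a7_unit = parse_rate(a7_text)
    if a8_unit != a7_unit:
        raise ValueError(f"unit mismatch: {a8_unit} vs {a7_unit}")
    return a8 / a7


clock_ratio = A8_CLOCK_GHZ / A7_CLOCK_GHZ
fp_per_clock_gain = {
    name: speedup(a8, a7) / clock_ratio - 1.0
    for name, (a8, a7) in ST_RESULTS.items()
}

for name, gain in sorted(fp_per_clock_gain.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<15} {gain:+.1%} per clock")
```

All ten single-threaded rows retain a positive gain after the clock adjustment, consistent with the conclusion that the floating point side benefits from more than the frequency bump.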

Overall, even without a radical change in architecture, thanks to a combination of clock speed increases, architectural optimizations, and memory latency improvements, Enhanced Cyclone as present in the A8 SoC is looking like a solid step up in performance from Cyclone and the A7. Over the next year Apple is going to face the first real competition in the ARMv8 64-bit space from Cortex-A57 and other high performance designs, and while it’s far too early to guess how those will compare, at the very least we can say that Apple will be going in with a strong hand. More excitingly, most of these performance improvements build upon Apple’s already strong single-threaded IPC, which means that in those stubborn workloads that don’t benefit from multi-core scaling Apple is looking very good.

531 Comments

  • CalaverasGrande - Wednesday, October 1, 2014 - link

    there is more than one iphone competitor with no micro SD.
    That is just a silly argument. But you are just arguing silly specs. Apple has always lagged behind the bleeding edge. Both on computers and iOS devices. They throw a few nice flourishes on top such as retina or touch ID, but the underlying tech has almost always lagged behind the bleeding edge. As the author calls out.
  • dmacfour - Wednesday, October 1, 2014 - link

    And it's only bad for people that have some sort of instinctual need to be on the bleeding edge.
  • kidsafe - Thursday, October 2, 2014 - link

    Are you done?
  • TruthLoader - Thursday, October 2, 2014 - link

    I'm terribly sorry I did forget to correct some typos, nonetheless, here we go (corrected version):

    Did you really forget to mention one of Apple's new key features, introduced the first time with this new iPhone iteration, a capability prominently displayed by the new
    iPhone 6+ and best described by the words of Apple's CEO:

    Dear iSheep,

    I am delighted you guys already noticed our brand-new "iBend" feature. We have intentionally kept quiet to preserve the big surprise now unveiled on behalf of our beloved
    iSheep. Let me share the following core principles, which were of particular importance throughout the design and development process:

    1) Enhance our iSheep's ability to enjoy a panoramic perspective, to be able to make "Panoramas" without moving the iPhone or needing any third party software.

    2) We wanted to compete with curved screen models from LG, Motorola and Samsung, mainly offered in their domestic markets.

    3) This is our answer to the curved screen displays offered by LG and Samsung, especially the new Samsung Galaxy Note Edge and the LG G Flex:
    http://www.theverge.com/2014/9/3/6097297/samsung-g...
    http://www.theverge.com/2013/10/27/5036288/lg-g-fl...

    4) It is our firm belief and intention to surprise Samsung and LG by showing that we are capable of having an edged display in our phones without actually having one, all for
    the purpose of trashing their new curved display phones and offering you a new, well hidden, feature.

    5) Last but not the least, we want to sell more replacement screens (remember, screen replacement prices were already provided before our new iPhone launch event took place
    (in anticipation of it:), of course that's a feature, feel free to exchange displays now:)).

    I am sure some of you iTards might be aware of some articles stating that although our new phones cost about 200$ to 250$ to manufacture (now the old ones cost even less),
    http://recode.net/2014/09/23/teardown-shows-apples...
    http://news.investors.com/technology-click/092314-...
    http://www.techtimes.com/articles/16347/20140926/i...

    we are selling them at a huge premium, which means we make a lot of money and I get to enjoy a lot of additional bonifications (indeed, my 15th luxury home has an indoor pool filled
    with 100$ bills, hence I'm able to take a bath without suffocating).

    More money leads to more attractive innovations like this special iBend (Registered Trademark, Patent Pending) feature you guys will be blessed with, as usual.
    Soon we will launch a new iDevice with an additional "S" in its name, it will offer a whole plethora of new features you will be able to make use of, like the possibility to bend it back and forth to form an S shape. ("iS", Patent Pending)

    I sincerely believe you iSheep are happy with our new iBend 6 Plus, however please let me take the opportunity to thank you all for being such a giant hoard of ignorant,
    blind and mindless suckers whose whole purpose in life consists of buying our new iDevice/iCrap (Registered Trademark, Patent Pending) for a very high premium while wasting
    their valueless time waiting in the iQueue just to brag about which poor soul enriched me first.
    Always remember and never forget, the only thing premium about apple is price, everything else pales in comparison.

    We Own you.

    Yours Sincerely
    Tim Crook
  • Kidster3001 - Thursday, October 2, 2014 - link

    Wow, just wow. I agree with most of what you say but you are just going to start fights the way you put it all down. You're not helping.

    BTW, you mention iPhone Galaxy. I agree, the new iPhone resembles recent Galaxy phones very much in physical form. You should take a look at the Galaxy Alpha though. It looks almost identical to an iPhone 5 with the chamfered edges. pretty sad imo.
  • DudeDoe - Monday, October 13, 2014 - link

    Not that everyone else have already call it.... But, plain and simple: No one is forced to buy A or B. If you don't like it, or don't have the means, don't.
    Respect the decision and opinion of the others.
    Or as someone else had pointed: A) The ones that have the means, they truly have the choice, they can either buy it (because they like the style, the tech, or simply because of the 'status factor'), or they can buy a 'dumb phone' instead (because they don't care, or don't have the need).
    B) The ones that don't have the means. Well those don't have much of a choice and have to live with what is possible... and accept that, and not coming after the others because "he/she can't have what he/she really want"
  • Musikus - Monday, October 13, 2014 - link

    Lots of words to say lots of lies. How much does Samsung pay for these lies? Shame on you, you have no honour and no guts!
  • Pandian - Tuesday, October 14, 2014 - link

    Apple's hardware division - so well integrated with its software division that we do not distinguish the two as we would with most others - makes a strong profit on its devices. Starting with iPhones, iPads, iPods, etc., their profit margin on the hardware seems beyond reason, yet the plastic phones with equivalent or inferior build make MORE profit.

    None of these companies can make such quality devices without the WTO allowing slave labour in China, India, African nations, etc.,to compete at level terms with the labour force of the "developed" nations; the same WTO contract that USA, China, UK, Germany, India, Brazil, Australia and other nations from every dimension of the social or economic space signed!

    That made the 19 year old, 16 hour/day worker from China/India/rest of Asia/Africa AT PAR with highly educated and qualified workers from Germany, UK, USA, India, China, Western European nations, etc., workers who work in less enslaving conditions! If the iPhone 4,5 & 6 series, as well as the HTC, Samsung and such companies' products, were made in Japan, Germany, USA, UK, France, etc., they will cost more than $9000 to $25000 to make! So, the going prices for the hardware, not just from Apple, but the entire spectrum, is a great deal for the consumer.

    Apple's software make huge profit - brain power is tough to quantify!

    The people who steal our money most are the service providing "middlemen"! That includes the companies that allow us to USE these toys! The phone plans, the billing of both parties for the same call - minutes are erased from the caller and the receiver in the USA for the same call! Not one PAC has been formed to fight this.

    These non-producing middlemen include the telephone, cellphone and the CABLE companies! Add the satellite companies if their plans go thru'.

    People drill liquid 1m or 2000m from the surface, refine and sell them for great profits, because the fluid powers ours locomotives. The same companies prevent alternate sources of fuel for the same use! We are so used to it that when the new set of companies do the same, we are numb to the stabs!

    I pay $250 to $800 upfront for a device in the USA, and use it as long as possible, years! Much more of my money is taken from me in much smaller installments every month, adding up to $240+ per family per month, just for phones! Cable and broadband adds another $200+ in most households! THAT is a car payment!

    While the newer smartphones allow me to do more - play more games, be entertained with video of various forms such as games, stupid cats, etc. (paying more there), enjoy the social behaviour of human collective without being social by just staring into a 4-6 inch screen, the phones are much smaller and better than the first simple cellphones! Their primary function is still to be able to make quality phone calls! And, texts, when important. Their super-smart powers are seen when used at trade (stocks), hospitals, and now 24/7 health monitoring! Same device - simple or complex use, still cheap at the physical level; buy it cheaper with a plan that does not suit you, you are shredding your cash.

    So, Apple or Samsung can gouge me for 100% profit on their quality hardware! I am bleeding into a shock state from the "nickel and dime" hemorrhaging of my other services - the phone plans, the contracts, the over the limits, etc.! The cable companies lay down the hardware still poorly to supply broadband, and channel programs that they do not create! There goes my money!
  • MacDaddy100 - Saturday, October 18, 2014 - link

    It's obvious you don't have much experience in technology, you can tell you've been sucked into the Android/ Samsung marketing telling you what you need in a phone.

    It seems you sold on specs and specs only, It's sad that Android phone have to put such large specs, faster GHZ, More RAM just to keep up with the iPhone, depending on which benchmarks you read, at times the iPhone is faster, at times Android is faster, but overall pretty even, that just shows how inefficient Android is, Needs double the specs to keep up.

    You obviously like car analogies, Its like you think a 1000hp Ford Focus will out race a 500hp Porsche 911 on a race track, just cramming horsepower doesn't make it a all-around better.

    Its amazing how much Android keeps copying iPhone features every single year, And Android profits keep sinking FAST, just look at Samsung's recent quarter, complete backslide.

    Why is Android flagship phones still using 20 year old 32 bit technology?

    Its amusing to watch Fandroids brag about their pretty dancing wallpapers, can't you see that Googles precious Green Robot and Samsung marketing machine has you sucked in.
