Sensible Scaling: OoO Atom Remains Dual-Issue

The architectural progression from Apple, ARM and Qualcomm has been towards wider, out-of-order cores, to varying degrees. With Swift and Krait, Apple and Qualcomm both went wider. From Cortex A8 to A9 ARM went OoO, and then from A9 to A15 ARM introduced a significantly wider architecture. Intel bucks the trend a bit by keeping the overall machine width unchanged with Silvermont. This is still a 2-wide architecture.

At the risk of oversimplifying the decision here, Intel had to weigh die area, power consumption and the risk of making Atom too good when it decided to keep Silvermont’s design width the same as Bonnell’s. A wider front end would require a wider execution engine, and Intel believed it didn’t need to go that far (yet) in order to deliver really good performance.

Keeping in mind that Intel’s Bonnell core is already faster than ARM’s Cortex A9 and Qualcomm’s Krait 200, if Intel could get significant gains out of Silvermont without going wider - why not? And that’s exactly what’s happened here.

If I had to describe Intel’s design philosophy with Silvermont it would be sensible scaling. We’ve seen this from Apple with Swift, and from Qualcomm with the Krait 200 to Krait 300 transition. Remember the design rule put in place back with the original Atom: for every 2% increase in performance, the Atom architects could at most increase power by 1%. In other words, performance can go up, but performance per watt cannot go down. Silvermont maintains that design philosophy, and I think I have some idea of how.
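The 2-for-1 rule is easy to state as arithmetic. Here is a minimal sketch in Python of how a proposed design change could be checked against it; the function names and the sample percentages are made up for illustration and are not Intel's figures or tooling.

```python
def within_power_budget(perf_gain_pct: float, power_gain_pct: float) -> bool:
    """True if a change obeys the rule: at most 1% more power per 2% more performance."""
    # The rule caps the allowed power increase at half the performance increase.
    return power_gain_pct <= perf_gain_pct / 2.0


def perf_per_watt_ratio(perf_gain_pct: float, power_gain_pct: float) -> float:
    """Performance per watt after the change, relative to before (1.0 = unchanged)."""
    return (1 + perf_gain_pct / 100) / (1 + power_gain_pct / 100)


# Hypothetical feature: +10% performance for +5% power (exactly at the limit).
print(within_power_budget(10, 5))
print(perf_per_watt_ratio(10, 5))

# Hypothetical feature: +10% performance for +6% power (over the limit).
print(within_power_budget(10, 6))
```

Any change that passes the first check also leaves performance per watt higher than before, which is why the rule guarantees efficiency never regresses.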

Previous versions of Atom used Hyper Threading to get good utilization of execution resources. Hyper Threading had a power penalty associated with it, but the performance uplift was enough to justify it. At 22nm, Intel had enough die area (thanks to transistor scaling) to just add in more cores rather than rely on HT for better threaded performance so Hyper Threading was out. The power savings Intel got from getting rid of Hyper Threading were then allocated to making Silvermont an out-of-order design, which in turn helped drive up efficient use of the execution resources without HT. It turns out that at 22nm the die area Intel would’ve spent on enabling HT was roughly the same as Silvermont’s re-order buffer and OoO logic, so there wasn’t even an area penalty for the move.

The Original Atom microarchitecture

Describing it as a 2-wide architecture is a bit misleading, as the combination of the x86 ISA and treating many x86 ops as single operations down the pipe makes Atom physically wider than its block diagram would otherwise lead you to believe. Remember that with the first version of Atom, Intel enabled the treatment of load-op-store and load-op-execute instructions as single operations post decode. Instead of these instruction combinations decoding into multiple micro-ops, they are handled like single operations throughout the entire pipeline. This continues to be true in Silvermont, so the advantage remains (it also helps explain why Intel’s 2-wide architecture can deliver comparable IPC to ARM’s 3-wide Cortex A15).
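To see why keeping load-op-store instructions whole matters on a 2-wide machine, here is a toy Python model that only counts issue slots. Everything in it is an assumption for illustration: the instruction mix is invented, the split into three micro-ops (load, operate, store) is a simplification, and dependencies, latencies and memory behavior are ignored entirely.

```python
import math

ISSUE_WIDTH = 2  # the article describes both Bonnell and Silvermont as 2-wide

# Hypothetical instruction stream: "simple" is a register-only op,
# "load-op-store" reads memory, operates on the value and writes it back.
PROGRAM = ["load-op-store", "simple", "load-op-store", "simple"]


def ops_kept_whole(instr: str) -> int:
    """Atom-style handling: every instruction travels down the pipe as one operation."""
    return 1


def ops_split(instr: str) -> int:
    """Assumed alternative: a load-op-store breaks into load, op and store micro-ops."""
    return 3 if instr == "load-op-store" else 1


def issue_cycles(program, ops_per_instr, width=ISSUE_WIDTH) -> int:
    """Minimum cycles needed to issue every operation, ignoring all hazards."""
    total_ops = sum(ops_per_instr(instr) for instr in program)
    return math.ceil(total_ops / width)


print(issue_cycles(PROGRAM, ops_kept_whole))  # 4 operations over 2 slots per cycle
print(issue_cycles(PROGRAM, ops_split))       # 8 micro-ops over 2 slots per cycle
```

In this simplified picture the same four instructions need half as many issue cycles when the memory-touching ones stay whole, which is the sense in which the machine behaves wider than its 2-wide label suggests.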

While Silvermont still only has two x86 decoders at the front end of the pipeline, the decoders are more capable. While many x86 instructions will decode directly into a single micro-op, some more complex instructions require microcode assist and can’t go through the simple decode paths. With Silvermont, Intel beefed up the simple decoders to be able to handle more (not all) microcoded instructions.

Silvermont includes a loop stream buffer that can be used to clock gate fetch and decode logic in the event that the processor detects it’s executing the same instructions in a loop.


Silvermont’s execution core looks similar to Bonnell before it, but obviously now the design supports out-of-order execution. Silvermont’s execution units have been redesigned to be lower latency. Some FP operations are now quicker, as well as integer multiplies.

Loads can execute out of order. Don’t be fooled by the block diagram: Silvermont can issue one load and one store in parallel.



  • Hector2 - Friday, May 17, 2013 - link

    There are only 3 companies right now left in the world who have the muscle and volume to afford high tech fabs -- Intel, Samsung & TSMC. And Intel has about a 2 year lead. That means not just higher performance and lower power than before, but lower cost. Making the chips smaller multiplies the number of chips on a single, fixed-cost wafer and lowers costs. If the chip area is 1/2, the costs to make it are about 1/2 as well. 22nm tech gives Intel faster chips with less power than their competition. 14nm hits it out of the park.
  • BMNify - Wednesday, June 5, 2013 - link

    You're absolutely wrong about "lower cost". x86 requires more die area. The process is more volatile (more failed wafers).

    If we combine the 2 above factors with better performance, lower power consumption and toss in a lack of experience we get GT3e. A technological marvel that few (OEMs) want.
  • BMNify - Wednesday, June 5, 2013 - link

    Spot on Krysto - It's Intel's process advantage that is shining through. Soon they'll hit the point of diminishing returns and/or the rest of the market will catch up/get close enough. When I see AMD at 32nm (Richland) having lower power draw at idle than Intel at 22nm (Ivy Bridge) I wonder how special their "secret sauce" actually is.

    How long can Intel loss-lead? Probably as long as Xeon continues to make up for it, but ARM is getting into the server market now too (looking forward to AMD and Calxeda ARM SoCs for the server market). Should be interesting in 3-5 years.
  • TheinsanegamerN - Monday, August 26, 2013 - link

    only issue, though, is when you put that richland chip under load. all of a sudden, intel is using much less power.
  • t.s. - Monday, May 6, 2013 - link

    "The mobile market is far more competitive than the PC industry was back when Conroe hit. There isn’t just one AMD, but many competitors in the SoC space that are already very lean fast moving. There’s also the fact that Intel doesn’t have tremendous marketshare in ultra mobile."

    Well, with their 'strategy' back then when facing AMD (, they surely'll win. :p
  • nunomoreira10 - Monday, May 6, 2013 - link

    It's kinda suspicious that there are many comparisons against ARM but none against AMD Jaguar or even Bobcat.
    Jaguar will probably be a much better tablet CPU and GPU, while Intel competes in the phone market.
  • Khato - Monday, May 6, 2013 - link

    Which AMD Jaguar/Bobcat SKU runs at 1.5 watts? They aren't included in the comparison because they're a markedly higher power level.
  • nunomoreira10 - Monday, May 6, 2013 - link

    they will both be used on fan-less tablet designs...
  • extide - Tuesday, May 7, 2013 - link

    Totally different markets. Jaguar/Bobcat will likely line up next to low end Core/Haswell, not an Atom/Silvermont
  • Penti - Tuesday, May 7, 2013 - link

    Both will sadly be way too underpowered when it comes to the GPU, and that matters greatly on general OSes and applications, like running a desktop OS X or Windows (or GNU/Linux) machine. You won't really be able to game on them at all, as it's not smartphone games people want to run. GPGPU won't really be fast enough for anything, and we are talking about ~100-200 GFLOPs of GPU power on the AMD side for what is essentially a full-blown computer.

    Intel is clearly targeting the phone market. Something AMD/ATI divested from years back with their mobile GPU tech going to Qualcomm (Adreno, which isn't Radeon-based) and Broadcom. ATIs/AMDs mobile GPU-tech was before that previously licensed to or used together with the likes of Intel (PXA/XScale – not integrated though), Samsung and Freescale among others. Their technology already is the mainstay of the mobile business and was departed from the company but in effect their technology know how was successful in the market without their leadership so why would they compete with that, of course they wouldn't.

    AMD simply has not and will not likely any time soon invest in an alternate route to dominate their own part of the smartphone/ARM-tablet market while Intel has with integrated designs replacing the custom ARMv5TE design. AMD going after ARM-business is different since they will license the core and their manufacturer GloFo already does manufactures and even offers hard macros for ARM-designs that they sell a bunch of to other customers already. It's also going after other embedded fields and the emerging ARM-server/appliance space all without designing custom cores.

    While PXA (Intel) was quite successful in the market, moving to x86 and doing away with stuff like ARM-based network processors, raid-processors allows Intel to focus on delivering great support for modern ISA across all sorts of devices, while it didn't make it into phones (until lately) like PXA which continued to power Blackberrys under Marvell, was the main Windows Mobile platform for years after Intels departure and so on it was able to become a multimediaplatform, and a widely adopted chip for embedded use, driving NAS-devices and the like. Thanks to the Intel purchase of Infineons Wireless portfolio including many popular 3G radios/modems and them forming a new wireless division their actual business and sales in the mobile market is also much higher than when they still had their custom PXA/XScale lineup. Plus they couldn't have competed with their XScale lineup without designing new ARM-ISA compatible cores/designs to be able to match Cortex A8, A9, A7, A15, Krait 600 etc. Plus puts them in a much better place to be a wireless/terminal supplier when they can support customers who want advanced wireless modems/baseband, Application processors, bt, wifi etc. While Nvidia will have Tegra 4i with integrated modem AMD couldn't offer anything similar as they have no team capable of producing radio baseband. Having modern compilers and x86-ISA sure makes it convenient now for Intel, as well as integrating their own GPU, just licensing ARM Ltd designs wouldn't have put them in a better position to continue their presence in the mobile field. They have basically developed and scaled their desktop GNU/Linux drivers in the Linux Kernel, added mobile features and so on years before they put the hardware and can leverage that software in mobile platforms (Android) but it makes sense and they don't have to rely on IP cores and third party drivers for graphics with the coming Bay Trail. They couldn't have shared that much tech if they were anything else then x86. 
    Of course AMD won't be in the same place, and scaling down a GPU designed for thousands of stream processors and Windows/OS X drivers to put it into phones is not the same. It would be awful if it is just scaled down to fit the power usage; even if Nvidia has a kinda custom mobile GPU it's still worse than the competitors which have no presence in desktop computing. Drivers for QNX, Android/Linux, iOS etc. are not the same as with Windows either. It takes a long time to start over when they did away with an okay solution (z460), and they haven't but others have and that's fine; there is more competition here than elsewhere. x86 is no stopper for Intel.
