Qualcomm this month demonstrated its 48-core Centriq 2400 SoC in action and announced that it had started to sample its first server processor with select customers. The live showcase is an important milestone for the SoC because it proves that the part is functional and is on track for commercialization in the second half of next year.

Qualcomm announced plans to enter the server market more than two years ago, in November 2014, but the first rumors about the company’s intentions to develop server CPUs emerged long before that. As one of the largest designers of ARM-based SoCs for mobile devices, Qualcomm was well prepared to move beyond smartphones and tablets. However, while developing a custom ARMv8 processor core and building a server-grade SoC is hard enough, building an ecosystem around such a chip is even more complicated in a world where ARM-based servers are still used only in isolated cases. From the very start, Qualcomm has been serious not only about the processors themselves but also about the ecosystem and third-party support (Facebook was one of the first companies to back Qualcomm’s server efforts). In 2015, Qualcomm teamed up with Xilinx and Mellanox to ensure that its server SoCs are compatible with FPGA-based accelerators and data-center connectivity solutions (the fruits of this partnership will likely emerge in 2018 at the earliest). It then released a development platform featuring its custom 24-core ARMv8 SoC and made it available to customers and to partners such as ISVs and IHVs. Earlier this year the company co-founded the CCIX consortium to standardize a cache-coherent interconnect for special-purpose data-center accelerators and to make certain that its processors can support them. Taking into account all the evangelization and preparation work that Qualcomm has disclosed so far, it is evident that the company is very serious about its server business.

From the hardware standpoint, Qualcomm’s initial server platform will rely on the company’s Centriq 2400-series family of microprocessors, which will be made using a 10 nm FinFET fabrication process in the second half of next year. Qualcomm does not name the exact manufacturing technology, but the timeframe points to either Samsung’s performance-optimized 10LPP or TSMC’s CLN10FF (keep in mind that TSMC has a lot of experience fabbing large chips, and a 48-core SoC is not going to be small). The key element of the Centriq 2400 will be Qualcomm’s custom ARMv8-compliant 64-bit core code-named Falkor. Qualcomm has yet to disclose more information about Falkor, but the important thing here is that this core was purpose-built for data-center applications, which means that it will likely be faster than the cores used inside the company’s mobile SoCs when running appropriate workloads. Qualcomm currently keeps the particulars of its cores under wraps, but it is logical to expect the company to increase the frequency potential of the Falkor cores (versus its mobile cores), add support for an L3 cache, and make other tweaks to maximize performance. The SoCs support neither simultaneous multi-threading nor multi-socket configurations, hence boxes based on the Centriq 2400 series will be single-socket machines able to handle up to 48 threads. The core count is an obvious promotional point that Qualcomm is going to use against competing offerings, and it will naturally capitalize on the fact that it takes two Intel multi-core CPUs to offer the same number of physical cores. Another advantage of the Qualcomm Centriq over rivals could be the integration of various I/O components (storage, network, basic graphics, etc.) that are currently handled by a PCH or other supporting chips, but that is something the company has yet to confirm.
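Because there is no SMT, the number of hardware threads a Centriq 2400 box exposes to the operating system should equal its physical core count. As a minimal sketch (assuming a standard Linux sysfs layout; the script is purely illustrative and not tied to any particular platform), the following snippet counts sockets, physical cores, and hardware threads: a single-socket, 48-core, non-SMT part should report 1/48/48, whereas a dual-socket Hyper-Threaded Xeon system would show twice as many threads as cores.

```python
#!/usr/bin/env python3
"""Count sockets, physical cores, and hardware threads via Linux sysfs.

Illustrative sketch only: it assumes the standard
/sys/devices/system/cpu/cpuN/topology layout and enumerates online CPUs.
"""
import glob


def topology(base="/sys/devices/system/cpu"):
    packages, cores, threads = set(), set(), 0
    for topo in glob.glob(base + "/cpu[0-9]*/topology"):
        with open(topo + "/physical_package_id") as f:
            pkg = f.read().strip()
        with open(topo + "/core_id") as f:
            core = f.read().strip()
        packages.add(pkg)            # one entry per socket
        cores.add((pkg, core))       # one entry per physical core
        threads += 1                 # one logical CPU per topology directory
    return len(packages), len(cores), threads


if __name__ == "__main__":
    sockets, phys_cores, hw_threads = topology()
    print(f"sockets={sockets} cores={phys_cores} threads={hw_threads}")
```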

From the platform point of view, Qualcomm follows ARM’s guidelines for servers, which is why machines running the Centriq 2400-series SoC will be compliant with ARM’s Server Base System Architecture (SBSA) and Server Base Boot Requirements (SBBR). The former is not a mandatory specification, but it defines an architecture that developers of OSes, hypervisors, software and firmware can rely on. As a result, servers compliant with the SBSA promise to support more software and hardware components out of the box, an important consideration for high-volume products. Apart from giant cloud companies like Amazon, Facebook, Google and Microsoft that develop their own software (and that are evaluating Centriq CPUs), Qualcomm targets traditional server OEMs like Quanta or Wiwynn (a subsidiary of Wistron) with the Centriq, and for these companies broad software compatibility matters a lot. That said, Qualcomm’s primary targets are the large cloud companies, and server makers do not have Centriq samples yet.
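In practice, SBBR compliance means the platform boots through standard UEFI firmware and describes itself to the OS via ACPI (and SMBIOS), so a generic ARM server distribution can enumerate the hardware without board-specific patches. As a rough illustration of that idea (not an official compliance check; the sysfs path assumes a booted Linux system and may require elevated privileges to read), the snippet below lists the ACPI tables the firmware handed to the kernel:

```python
#!/usr/bin/env python3
"""List the ACPI tables a booted Linux kernel received from firmware.

Illustrative sketch only: it simply reads /sys/firmware/acpi/tables and is
not a formal SBBR/SBSA compliance test.
"""
import os

TABLE_DIR = "/sys/firmware/acpi/tables"


def list_acpi_tables(path=TABLE_DIR):
    try:
        return sorted(os.listdir(path))   # e.g. DSDT, FACP, APIC, MCFG
    except (FileNotFoundError, PermissionError):
        return []                         # no ACPI exposed, or no access


if __name__ == "__main__":
    tables = list_acpi_tables()
    if tables:
        print("ACPI tables from firmware:", ", ".join(tables))
    else:
        print("No ACPI tables visible; the platform may use a device tree, "
              "or the directory requires elevated privileges")
```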

During the presentation, Qualcomm demonstrated Centriq 2400-based 1U 1P (single-socket) servers running Apache Spark, Hadoop on Linux, and Java: a typical set of server software. No performance numbers were shared, and the company did not open up the boxes so as not to disclose any further details about the platform (e.g., the number of DDR memory channels, the type of cooling, supported storage options, etc.).
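Qualcomm did not share the demo code, but to give a sense of the workload class shown, a minimal, generic Spark job such as the word count sketched below is representative (the HDFS input path is a placeholder; this is an illustration, not Qualcomm’s actual demo):

```python
from pyspark.sql import SparkSession

# Generic word-count job; the HDFS path is a placeholder, and the script is
# only illustrative of the workload class, not Qualcomm's actual demo.
spark = SparkSession.builder.appName("wordcount-demo").getOrCreate()
sc = spark.sparkContext

counts = (sc.textFile("hdfs:///data/sample.txt")     # read lines from HDFS
            .flatMap(lambda line: line.split())      # split lines into words
            .map(lambda word: (word, 1))             # pair each word with a 1
            .reduceByKey(lambda a, b: a + b))        # sum the pairs per word

for word, count in counts.take(10):                  # show a small sample
    print(word, count)

spark.stop()
```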

Qualcomm intends to start selling its Centriq 2400-series processors in the second half of next year. It typically takes server platform developers about a year to polish their designs before they can ship them, so normally it would make sense to expect Centriq 2400-based machines to emerge in the second half of 2018. But since Qualcomm wants to address operators of cloud data centers first, and since companies like Facebook and Google design and build their own servers, those customers do not have to extensively validate the machines across a broad range of applications; they only need to make sure that the chips run their own software stacks, which could shorten the usual gap.

As for the server world outside of the cloud companies, it remains to be seen whether the broader industry will bite on Qualcomm’s server platform, given the lukewarm welcome ARMv8 servers have received in general. For these markets, performance, compatibility, and longevity are all critical factors when adopting a new platform.


Source: Qualcomm

Comments

  • Wilco1 - Sunday, December 18, 2016

    No, wrong again. ARM does design and license everything required to make a SoC. On top of that, ARM does physical IP for lots of processes and sells pre-hardened cores that are optimized for a specific process and ready for use with minimal integration. So except for actually making chips, that's as much as Intel does. Licensees decide how much additional work they want to do: some take ARM's cores as-is, some do fine tuning, others design their own cores.

    It's true that with mobile Atom, Intel was a lot more behind on the uncore than the core. But they were late with everything: uncore, SoC, decent GPUs, faster CPUs, radio, mobile process, etc. Apparently, before canning it all, they finally finished their very first SoC with an on-chip radio earlier this year - not exactly a barn burner at 1.5 GHz on TSMC 28 nm! I told people years ago that SoFIA would be beaten by faster, smaller, and cheaper Cortex-A53s. So no, I don't know the way Intel thinks; pretty much every move they made didn't make sense to me. Plenty of people fell for the marketing, though.
  • deltaFx2 - Sunday, December 18, 2016

    @Wilco1: Developing any CPU isn't easy; to first order, the ISA doesn't make things easier; x86 has more legacy verification, but it's small compared to the effort of verifying any CPU. The thing about ARM, though, is that ARM doesn't actually build anything. Even in its simplest avatar, ARM leaves the integration to the vendors. It's not an apples-to-apples comparison. In the same span in which it did three Atom cores, Intel produced Nehalem/Westmere/Sandy Bridge/Ivy Bridge/Haswell/Broadwell/Skylake/Kaby Lake. It's a question of focus, I think, more than anything else. Intel neglected the Atom line initially, only to realize late in the game that the cellphone market was getting real. But by then it was too late.
  • name99 - Sunday, December 18, 2016

    I am not making the numbers up.
    The Samsung number comes from their talk at Hot Chips 2016. The Nehalem number comes from a talk given to Stanford EE380 soon after Nehalem was released.
  • name99 - Sunday, December 18, 2016

    The slides for the talk are here. The video seems to have disappeared (it was once public).
    The slides refer to 5+ design years, but in the talk he said the time kept growing and was at around 7 years in 2010.
    http://web.stanford.edu/class/ee380/Abstracts/1002...
  • Kevin G - Monday, December 19, 2016

    @name99

    Is this the video you were referring to?

    https://www.youtube.com/watch?v=BBMeplaz0HA

    That video was filmed in 2006 and was uploaded in June of 2008, both prior to Nehalem being released. It does have plenty of insights into how the industry was working during that time frame.
  • name99 - Monday, December 19, 2016

    Obviously it's not the talk I was referring to! Look at the slides. The talk was given by Glenn Hinton in Feb 2010.
  • Kevin G - Monday, December 19, 2016

    @name99

    Let's try these droids:

    http://www.yovisto.com/video/17687
  • name99 - Monday, December 19, 2016

    Well done! Kevin G wins the internets for today!
  • chlamchowder - Sunday, December 18, 2016

    In response to (b), buyers are not captive. But no other chip maker can offer competitive per-thread performance, on any architecture.

    On multithreaded performance, AMD tried with Bulldozer/Piledriver (at a lower price), and IBM tried with POWER8 (more performance, but more heat). But Intel is still dominating servers.
  • Michael Bay - Monday, December 19, 2016

    You`re talking to intel-hating fruit fanatic, what did you expect?
