Over the past 15 years, chip design giant Arm has been on quite a journey within the infrastructure space, Mohamed Awad, SVP and GM of Arm’s infrastructure business, tells DCD.
Officially born in 1990 as a joint venture between Acorn Computers, Apple, and VLSI Technology, Arm – an acronym of Advanced RISC Machines – began licensing its processor IP in 1993.
A $32 billion takeover by SoftBank followed in 2016, with Cambridge-based Arm being delisted from the London Stock Exchange. It went public again in 2023, though SoftBank remains its biggest shareholder.
When it comes to chip architecture, Arm now dominates the market, but that wasn’t always the case. Traditionally, Intel had the server market cornered with its x86 chips, but as hyperscalers increasingly look to develop their own CPUs, Arm’s offering has grown in popularity because it allows companies to design chips customized to their exact needs.
Arm’s initial focus was on offering general-purpose computing solutions and ensuring its software allowed for frictionless deployment, but in 2018, the company shifted focus, launching its Neoverse-based CPUs.
Designed specifically for data center, Edge, and HPC use cases, Neoverse allowed Arm to pivot from simply providing the basic building blocks for infrastructure solutions – or offering cores designed for some other purpose, like mobile, that could then be put together into a system-on-chip (SoC) – to developing technology specifically for infrastructure.
Arm divides its Neoverse offering into three separate series, though Awad says it primarily breaks down into two main product lines: the V series, which focuses on high-performance, general-purpose compute, and the N series, which focuses on the server market. The company also offers an E series for Edge compute.
Arm’s V series – the V2 specifically – is what AWS Graviton4, Google Axion, and Nvidia Grace are based on, while Microsoft’s Cobalt chip is a product of the N series.
While Awad jokes that AI is taking “a lot of oxygen out of the room” at present, he says HPC is still “grinding away,” and Arm is continuing to see quite a bit of traction with such workloads.
“The V series is very popular in the HPC space, and Graviton, Grace, and Axion are all based on the V2 series, while I think the N series is also popular, but more with cloud-type workloads or networking use cases,” Awad says. “Our N series has more of a focus on efficiency. It’s great for workloads where you’ve got lots of parallel workloads happening simultaneously.”
When it comes to the N series, the focus is often on the server market – “when you talk about data centers, it’s certainly a very important part of the story” – but Awad notes that in recent years, transformations across other component parts of the industry have moved the conversation away from being solely focused on compute. Products such as DPUs have entered the market and, because of the connectivity challenges associated with them, forced changes in areas like networking.
“Networking is evolving, storage is changing in a meaningful way, even wireless base stations are changing, so, from an infrastructure perspective, we’ve made sure that our N Series and V Series of products exist across that entire spectrum,” Awad says.
“More and more of the ecosystem or the infrastructure is moving – and this has been a transition that’s been going on for a decade – to a kind of software-defined infrastructure that has a level of commonality across that entire compute fabric.”
Consequently, when it comes to architecture, Awad says that understanding these changes is incredibly important, because organizations increasingly want to add acceleration across the fabric – be it networking, storage, or servers – to keep pace.
Changing the game
When Neoverse was first released, Awad says it was about allowing companies to leverage Arm’s heritage of performance and efficiency to provide immediate TCO gains within the context of a software ecosystem that “wasn’t perfect back five or six years ago, but was good enough for them to get going.”
However, in the last couple of years, Arm decided it wanted to go further than just providing an architectural license that would still require a lot of work and investment to build something on, and has evolved Neoverse in order to meet the changing needs of its customers.
“Firstly, we decided to say, ‘okay, we’re going to hand you the CPU and the interconnect, and you can stitch it together,’” Awad says. “That’s a lot less work than just handing over an architectural spec, but we were aware it still required a meaningful amount of work.
“Then we took that approach to compute subsystems, where we actually take the CPUs and the interconnect and stitch them all together ourselves. This means we can hand over an integrated, verified subsystem that can then very quickly be customized for specific use cases, allowing you to optimize the silicon around your data center without enormous amounts of investment.
“That has been a game changer, and you can see it with the likes of Microsoft Cobalt, where, very quickly, the team has been able to bring a solution to market by leveraging our compute subsystems.”
As a result, from Awad’s perspective, the last five years have been “nothing short of amazing.”
He says: “It started with AWS, with Graviton, but today, every major hyperscaler, whether that’s Google, or Microsoft, or OCI, they are all building or deploying silicon based on Arm in some way.”
Awad says that currently, AWS has tens of thousands of customers running on Graviton, noting that more than 50 percent of the compute deployed by the company in the last two years has been Arm-based. He says that while there are arguably a number of different reasons why that might be, it is Arm’s power efficiency and low barrier to entry – “simplifying what it takes to go build silicon” – that have really set it apart.
“The world, moving forward, it really is a systems-level world, and whether you’re talking about one of those hyperscalers, or you’re talking about a technology provider like Nvidia, [everything they’re building] is based on Arm because [with AI workloads] the primary workload starts on the CPU and is then offloaded onto the accelerator, whether it’s homegrown or not.
“It’s actually a very interesting position that we’re in. We’ve gone from 15 years ago, trying to catch up with the modern workloads, to now, in many cases, seeing those key AI workloads actually being optimized for Arm first… which is a great place to be in.”
The next big thing in silicon
When it comes to addressing the main challenges facing today’s data centers – in particular AI data centers – Awad says the line between power and performance has become “so blurry” given the scale that some of the AI systems are operating at.
“The two go hand in hand… if you have unlimited power, you can have unlimited performance, it’s just about how much silicon you want to throw at it.
“The reality is… power equals performance at the end of the day. However, right now, the thirst for performance is so great, it’s not like saving power actually means consuming less power, it just means adding more compute. That’s the world that we’re living in today.
“Now, that’s going to change at some point, but today, it’s really just about getting as much compute as you can, so wherever you can save power, it becomes incredibly important. And frankly, tearing the barriers down so that you can cram more performance into a given power envelope, that’s really what it’s all about.”
And from Arm’s perspective, these are all things which you just can’t do with off-the-shelf silicon.
“Infrastructure itself is evolving in a massive way right now, whether that’s AI or otherwise, and more and more of these big hyperscalers are looking at their data center infrastructure – which, in the main, has been general-purpose, commodity, off-the-shelf legacy architectures or CPUs – and realizing that while that used to be good enough, it isn’t any more.
“Instead, it’s all about making sure that everything is optimized, as that’s the only way that they’re going to get the sort of efficiency that they need or scale to the level of performance they want.”
In 2025, the majority of investment in infrastructure is tied up with a very small number of technology providers, most of whom are working with Arm in what Awad describes as a “meaningful way.” He argues that this is because, when it comes to providing ever-growing levels of compute in today’s power-constrained environment, customizable silicon designed around your data center or software stack wins out over the approach of a decade ago, where you would “take whatever I had off the shelf and just try to cobble something together, which technically works, but is certainly not going to be as efficient.”
Awad says Nvidia’s Grace Hopper system is one of the best examples of this. Historically, a number of GPUs would be placed on a PCIe bus and connected to a general-purpose x86 processor – “which works, but you’re limited in terms of the memory you have access to, and the connectivity between those devices based on that PCIe interface.” Grace Hopper, and by extension Grace Blackwell, instead creates a coherent memory domain between the CPU and the accelerators, providing a much higher-speed link between the two devices.
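To make that contrast concrete, here is a minimal CUDA sketch – a hypothetical toy workload, not code from Nvidia or Arm – showing the explicit staging copies the PCIe-attached model requires versus a single shared allocation. On a hardware-coherent system such as Grace Hopper, the CPU and GPU share one address space over NVLink-C2C; `cudaMallocManaged` is used here as a portable approximation of that model.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Toy kernel: scale every element of an array.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;

    // Classic PCIe-attached model: allocate on the host, stage a copy
    // across the bus, compute, then copy back. Capacity is bounded by
    // the GPU's own memory and bandwidth by the PCIe link.
    float *host = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;
    float *dev;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(dev, n, 2.0f);
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // Coherent model: one allocation visible to both CPU and GPU, with
    // no explicit staging copies. Managed memory approximates what
    // hardware coherence provides natively on a Grace Hopper-class part.
    float *shared;
    cudaMallocManaged(&shared, n * sizeof(float));
    for (int i = 0; i < n; ++i) shared[i] = 1.0f;
    scale<<<(n + 255) / 256, 256>>>(shared, n, 2.0f);
    cudaDeviceSynchronize();                 // make the GPU's writes visible
    printf("shared[0] = %f\n", shared[0]);   // CPU reads the result directly
    cudaFree(shared);
    free(host);
    return 0;
}
```

On a fully coherent system, kernels can in principle dereference even ordinary host allocations, removing the staging step entirely – which is the point Awad is making about memory capacity and device-to-device connectivity no longer being bottlenecked by the PCIe interface.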
For the myriad companies that have emerged in recent years professing a desire to take on the incumbents and do “the next amazing thing with silicon,” Awad says there are three things that set Arm apart: the performance and efficiency of its designs, being part of a broad ecosystem that people can easily leverage, and giving customers the freedom to customize and augment the hardware with capabilities like acceleration, interfaces, or connectivity.
He adds that the company’s success has endured because of its reputation for working with so many hyperscalers, making new customers feel secure that what Arm has to offer will be here for the long term, as so many big names have staked everything on building hardware based on its IP.
However, while Awad acknowledges that innovation takes time, he says that the industry is already starting to see it emerge, largely driven by the promise of having infrastructure that is optimized from the point where data is onboarded all the way through to the data center.
“The dream is to get to the point where workloads easily migrate from one end of that infrastructure to the other, based on where it can be processed most efficiently or where the compute capacity exists,” he says. “That level of sophistication is only going to happen with AI.
“We’re marching towards that. I’m not saying it’s going to happen in 2025 but that is certainly where the world is going to head long term, and it’s the only way you can really bring out the necessary levels of efficiency and performance from every ounce of compute, because at the end of the day, technology that’s sitting idle isn’t doing any money any favors.”
This feature first appeared in the DCD AI Hardware Supplement.