For the uninitiated, the term ‘supercomputer’ is likely to conjure images of a machine akin to the Colossus computer used by British codebreakers at Bletchley Park during the Second World War.
And while some modern-day systems are undoubtedly impressive in size – Elon Musk’s efforts to develop his own Colossus cluster have thus far seen him deploy more than 200,000 GPUs in Tennessee – for most systems, visually at least, the reality is far more underwhelming.
Despite the best efforts of British Airways to keep us grounded, DCD traveled to the High-Performance Computing Center (HLRS) in Stuttgart, Germany, to visit Hunter, the first of two HPE supercomputers ordered by the University of Stuttgart to take HLRS up to exascale level.
Brought online in January 2025, the €15 million ($17.5m) HPE Cray EX4000-based system comprises 512 AMD Epyc ‘Genoa’ processors, with the CPUs grouped into 256 nodes, each equipped with 768GB of DDR5-4800 memory. However, unlike other HPC systems housed at HLRS, Hunter is also powered by GPUs: 752 AMD Instinct MI300A accelerated processing units (APUs) across 188 liquid-cooled nodes.
Officially launched in December 2023, AMD’s Instinct MI300A combines 24 Zen 4 CPU cores with a GPU accelerator and 128GB of high-bandwidth memory (HBM3) in a single package. This architecture lets the CPU and GPU access the same memory directly, which AMD says removes the need to copy data between separate CPU and GPU memory pools and speeds up HPC workloads.
AMD has also used a 3D stacking design for the chip, allowing it to integrate several smaller silicon dies to form a larger processor.
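To give a sense of what that shared memory means in practice, the minimal HIP sketch below (illustrative only, not HLRS or AMD sample code, and assuming a standard ROCm/HIP toolchain) allocates a single buffer that both the CPU cores and the GPU side of the APU can touch, with no explicit host-to-device copies in between.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

// Scale a vector in place on the GPU portion of the APU.
__global__ void scale(double *x, double a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;
    double *x = nullptr;

    // One allocation visible to both CPU and GPU; on an MI300A both sides
    // read the same HBM3, so no hipMemcpy staging between copies is needed.
    hipMallocManaged((void**)&x, n * sizeof(double));

    for (int i = 0; i < n; ++i) x[i] = 1.0;          // written by the CPU
    hipLaunchKernelGGL(scale, dim3((n + 255) / 256), dim3(256), 0, 0,
                       x, 2.0, n);                   // scaled by the GPU
    hipDeviceSynchronize();

    printf("x[0] = %.1f\n", x[0]);                   // read back by the CPU: 2.0
    hipFree(x);
    return 0;
}
```

On a discrete GPU the same code still runs, with the runtime migrating or mapping pages behind the scenes; on an APU such as the MI300A there is physically only one pool of HBM3 for both sides to work from.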
Hunter has a theoretical peak performance of 48.1 petaflops (more on that later), and each of its nodes is equipped with four HPE Slingshot high-performance interconnects. It ranked 54th on the most recent edition of the Top500 list and 12th on the Green500, and delivers double the performance of its predecessor, Hawk, while consuming 80 percent less energy.
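Those headline figures also pin down the node layout. The short back-of-the-envelope sketch below simply derives the per-node counts and the aggregate memory pools from the numbers quoted above; the totals are arithmetic, not an official spec sheet.

```cpp
#include <cstdio>

int main() {
    // Figures as reported for Hunter (HPE Cray EX4000 at HLRS).
    const int cpu_sockets = 512;  // AMD Epyc 'Genoa' CPUs
    const int cpu_nodes   = 256;  // CPU nodes with 768GB DDR5-4800 each
    const int apus        = 752;  // AMD Instinct MI300A APUs, 128GB HBM3 each
    const int apu_nodes   = 188;  // liquid-cooled APU nodes

    // Derived, not quoted: per-node layout and aggregate memory.
    printf("CPUs per CPU node: %d\n", cpu_sockets / cpu_nodes);    // 2
    printf("APUs per APU node: %d\n", apus / apu_nodes);           // 4
    printf("Aggregate DDR5:    %d TB\n", cpu_nodes * 768 / 1024);  // 192
    printf("Aggregate HBM3:    %d TB\n", apus * 128 / 1024);       // 94
    return 0;
}
```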
When it comes to Hunter’s architecture, Utz-Uwe Haus, head of the HPC/AI EMEA research lab at HPE, describes the Cray EX design as “the architecture that HPE, with its great heritage, builds for the top systems.”
A single cabinet in an EX4000 system can hold up to 64 compute blades – high-density modular servers that share power, cooling, and network resources – within eight compute chassis, all of which are cooled by direct-attached liquid-cooled cold plates supported by a cooling distribution unit (CDU).
“It’s super integrated,” he says. “The back part, which is the whole network infrastructure (HPE Slingshot), matches the front part, which contains the blades.”
For Hunter, HLRS selected AMD hardware, but Haus explains that with Cray EX systems, customers can, more or less, select their processing unit of choice from whichever vendor they want, and the compute infrastructure can be slotted into the system without the need for total reconfiguration.
“Should HLRS decide at some point to swap [Hunter’s] AMD plates for the next generation, or use another competitor’s, the rest of the system stays the same. They could have also decided not to use our network – keep the plates and put a different network in, if we have that in the form factor. [HPE Cray EX architecture] is really tightly matched, but at the same time, it’s flexible,” he says.
Hunter itself is intended as a transitional system to the Herder exascale supercomputer, which is due to go online in 2027. A new data center is currently under construction at HLRS ahead of Herder’s planned deployment, as the floor of the data hall where Hunter is housed cannot be reinforced to allow the two systems to sit alongside each other.
Despite all this, the colorful and rather powerful system looks a little lost sitting in the corner of a data hall, surrounded by all the empty space that Hawk used to occupy, and the center’s Vulcan NEC cluster – a “standard PC cluster” available to the University of Stuttgart for research, visualization, AI, and big data workloads.
Simply the best
HLRS was established in 1996 as Germany’s first national high-performance computing (HPC) center, providing users across science, industry, and the public sector with access to supercomputing resources for “computationally intensive simulation projects.”
It is one of three national computing centers in Germany, the others being the Leibniz Supercomputing Centre (LRZ) in Garching, near Munich, and the Jülich Supercomputing Centre (JSC) in Jülich, North Rhine-Westphalia. There are also a number of smaller supercomputing centers in the country, which are overseen by the Gauss Alliance, a non-profit association for the promotion of science and research, and for which Prof. Dr. Michael M. Resch, director of HLRS, is a board member.
Getting a supercomputer like Hunter from concept to reality is a process that requires both planning and patience, in addition to expertise, a long-term business model, and an investment structure that includes a whole ecosystem of hardware, software, and infrastructure to go alongside it.
Resch says HLRS has been deploying systems built on AMD hardware since 2002, yet every time the center starts to discuss building its next system, it takes a vendor-agnostic approach. When asked whether Herder will also use AMD hardware, Resch says no decision has been made yet, and negotiations will be ongoing until the end of 2025.
“We are a national supercomputing center and we need to buy a new system every now and then,” he says, adding that when this happens, the center goes through a rigorous procurement process, which involves lots of long discussions with vendors.
“We went for the best solution. I was asked… whether European was important, and the answer is no. We want to have the best system, and we don’t care [where it comes from] – we are very agnostic when it comes to company names or nationality. We have around 800 users out there, and they need the best solutions.”
Resch says that in the world of HPC, the best solution means the one that offers sustained performance.
“From time to time, I get the question ‘Why is your peak performance… not as high as this or that [system]?’ And I say: ‘I don’t care.’
“Peak performance, ladies and gentlemen, is like taking a car, putting it on a plane, and then taking off and throwing the car out of the window, and then measuring the speed of the car. That is a ridiculous number, much as peak performance is ridiculous, and the colleagues here, they know that, but not everyone in the market knows that. Peak performance is not relevant; the question is ‘how much do you get of this performance?’”
Consequently, HLRS has eschewed traditional benchmarks and instead sets vendors a test that consists of running three of the center’s production codes.
“This is much more exciting,” Resch jokes.
In addition to sustained performance, cost is a big factor. As Resch says, it’s all well and good for the President of the European Commission to say there will be hundreds of billions of euros’ worth of investment in a certain area when, ultimately, they aren’t putting up the money or paying the final bill.
He explains that there’s always a push and pull between “make or buy” when it comes to weighing up these costs.
“When you have your own supercomputer, you have a huge investment. You’re spending tens of millions just to make sure that the system is up and running, and then, after five years, you spend another 10 million to buy a new one.
“But, on the other hand, the advantage is that you have a higher level of flexibility, in the sense that it’s your own decision of what happens with it. [At HLRS], we will buy a system every five years or so, [but if users] want to have another system in two years, we will say, ‘Sorry, we don’t have another system. You’ll have to wait for another three years.’ So, that level of flexibility is there.”
However, when it comes to a question of cost and profit, because HLRS is a national center, it can only charge its users what it costs for the center to run the system – “no more, no less” – as doing anything different would be illegal. HLRS is also not allowed to provide subsidized compute to any particular user or institution.
Consequently, HLRS is cheaper than cloud providers, particularly when it comes to offering compute for non-cloud-native workloads that are tightly coupled or require longer durations, as it does not have to factor expenditure recovery or profit margins into its pricing.
Furthermore, while commercial hyperscalers must provide availability guarantees of 99.99 percent as part of their service level agreements, HLRS does not have such an obligation, meaning the center does not have to spend money on UPS (uninterruptible power supply) units or generators.
“Maybe there are hospitals using [an unspecified cloud provider], so it cannot have a power outage. [The provider] must therefore have an uninterruptible power supply, have diesel engines, and have all kinds of things to make sure that it is absolutely safe.
“We don’t do that, and we can’t afford to do that because it’s public money that no government will accept. If I said: ‘We are at 99.8 percent, but for the extra 0.2 percent, I need an additional €50 million ($58m) for the power supply, for the diesel,’ the center would say ‘no, we don’t need that.’ That’s something which our customers accept.”
For this reason, Resch says HLRS will not accept requests from institutions such as hospitals, banks, or insurance companies as they cannot guarantee the level of uptime needed by such customers.
The future of HPC
While Resch said that no final decisions had been made yet regarding Herder’s likely hardware, if the center were to select AMD again, the supercomputer would be in highly esteemed company.
On the most recent edition of the Top500 list, around 34 percent of supercomputers were AMD-powered, with the chip company also responsible for powering the top two exascale systems on the list, El Capitan and Frontier, which took the first and second spots, respectively.
The June 2025 edition of the Top500 also represents the seventh consecutive list on which the world’s most powerful supercomputer has been powered by AMD.
However, despite his evasiveness about Herder, Resch did let slip during the visit the existence of a previously unannounced AMD MI600 AI chip, demonstrating just how far in advance AMD plans its hardware offerings, and that the company is already providing customers with technical details for chips that are still years away from production.
“In 2023, we signed a contract for the year 2027, and that was a risk on the vendor side and on our side.”
Resch says that as part of that agreement, AMD and HLRS agreed they would sit down in the last quarter of 2025 to discuss the path going forward, and have the chip maker disclose the specifications of the chip it’s planning to bring to market in 2027.
“Four years ahead of delivery, we don’t get detailed specifications, so now, we need to look into these detailed specifications, talk to HPE about it, and find out if this changes anything with regard to the overall system. I believe these negotiations will be very short and very nice, but right now, I cannot say in advance what the best solution is.”
Peter Rutten, global research lead on performance-intensive computing solutions at IDC, says that this is one area in which AMD has been “incredibly good” – the execution of its roadmap.
He adds: “With every new deliverable stage, AMD has basically outdone the expectations of its new generation processor, and so the market has just come to expect that what AMD does is always going to be spectacular, and they so far haven’t failed on that.”
Rutten says that while it’s not surprising AMD has continued to see success in the HPC space, it’s still a relatively new phenomenon for the company, with its traction in the market only starting around five to seven years ago.
A number of factors have contributed, and continue to contribute, to AMD’s dominance. One is cost, attributed to the company’s ability to provide high performance and efficiency, which is crucial for HPC centers with limited budgets.
“With HPC, the rule is the best performance in the most efficient way,” Rutten says. “Think of an HPC center. Think of their budgets. Think of what they’re trying to do. They’re trying to solve very tough scientific problems, but often, they have limited budgets, usually government or academia. So, how do you get the most performance with the least amount of expenditure? That is always the critical question for any HPC site, and AMD responded to that question convincingly better than their competition.”
One of the ways AMD achieved this was by being among the first to bring 7nm processors to market, combining the new process node with a redesign of its processors that made them more performant. Rutten says that, as a result, the company was able to become a market leader in the HPC space, particularly for intensive and technical engineering workloads, while also establishing itself as a player in AI.
That’s not to say other companies aren’t also seeing success in the HPC space. GPU giant Nvidia, a company that has become synonymous with AI, is currently powering four supercomputers in the top ten of the Top500, with 70 systems listed deploying the company’s H100 GPUs.
Rutten also notes that Intel, which has had a turbulent couple of years from both a financial and hardware perspective, is still very much the market incumbent, powering the third-place exascale system Aurora and providing CPUs for the largest share of the Top500, 294 systems or around 59 percent, though this number is falling.
For AMD, that figure sits at 173 systems, around 34 percent, a modest increase from 32 percent six months ago.
Additionally, a total of 237 systems on the list are using accelerator or co-processor technology, up from 210 six months ago, an approach that Rutten says has played a significant role in Nvidia’s success, particularly as an increasing number of AI supercomputers are deployed.
AI performance comes from the GPU’s ability to process data in parallel, whereas HPC workloads have traditionally relied on very fast serial processing rather than massive parallelism. That meant server nodes traditionally didn’t come with a co-processor as standard, something Rutten says has now changed because it’s been shown that HPC workloads really do benefit from the parallelization that GPUs provide.
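As a rough illustration of that distinction (a sketch under the same HIP assumptions as above, not production HPC code), the same vector update can be written as a serial CPU loop or as a GPU kernel in which each element is handled by its own thread:

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

// Serial version: one CPU core walks the array element by element.
void triad_cpu(double *a, const double *b, const double *c, double s, int n) {
    for (int i = 0; i < n; ++i) a[i] = b[i] + s * c[i];
}

// Parallel version: the loop body becomes a kernel, and every element
// is handled by its own GPU thread rather than visited in sequence.
__global__ void triad_gpu(double *a, const double *b, const double *c, double s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = b[i] + s * c[i];
}

int main() {
    const int n = 1 << 20;
    double *a = nullptr, *b = nullptr, *c = nullptr;
    hipMallocManaged((void**)&a, n * sizeof(double));
    hipMallocManaged((void**)&b, n * sizeof(double));
    hipMallocManaged((void**)&c, n * sizeof(double));
    for (int i = 0; i < n; ++i) { b[i] = 1.0; c[i] = 2.0; }

    triad_cpu(a, b, c, 3.0, n);                                    // serial on the CPU
    hipLaunchKernelGGL(triad_gpu, dim3((n + 255) / 256), dim3(256), 0, 0,
                       a, b, c, 3.0, n);                           // parallel on the GPU
    hipDeviceSynchronize();

    printf("a[0] = %.1f\n", a[0]);                                 // both versions give 7.0
    hipFree(a); hipFree(b); hipFree(c);
    return 0;
}
```

Both versions compute the same result; the difference is that the loop’s iterations are spread across thousands of GPU threads instead of being executed one after another on a single core.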
“A server with a GPU in it also needs a processor, you can’t run a GPU without a processor also being present,” he explains. “Intel was having a decent market with servers that ran on Intel CPUs and Nvidia co-processors, but that has been changing because Nvidia has seen that AMD is also developing very performant processors, which is why in recent years, we’ve seen more and more servers with AMD processors as well.
“Nvidia has been very focused on AI, but, along the way, realized that HPC was also a very attractive workload for them,” Rutten says. “Initially, that was sort of an afterthought, but then it became actually an adjacency for them to focus on, and they have – to the point where a lot of supercomputers are now Nvidia GPU accelerated.”
While Rutten argues that Nvidia isn’t doing anything totally revolutionary, rather taking what is “essentially a newer approach to HPC,” he says that the old way, with just CPUs and tightly connected server nodes, had a lot of practitioners who were very skilled in optimizing HPC environments, one of the most difficult but critical jobs in any supercomputing lab.
“There are people who have PhDs, but with these GPUs now becoming part of the supercomputers, that skill set has changed, and that traditional way of thinking is being challenged. There are now different considerations going into how you optimize a supercomputer, rather than just what was involved with a non-accelerated supercomputer.
“I don’t think we have fully gotten to the point yet where the skills to optimize an accelerated supercomputer are as advanced as the skills to optimize a classical supercomputer, although we’re getting there. But, it has been a little bit of a challenge for HPC sites to understand what they could do with GPUs and how to do it.”
This feature first appeared in DCD>Magazine #58.
Read the original article: https://www.datacenterdynamics.com/en/analysis/how-to-build-a-supercomputer/