The world of business loves a metric. One single number that can tell a story quickly and easily – even if context and nuance can be lost in the process.
For years, the data center industry focused on PUE – a simple number to tell you how efficiently a data center was operating. It served the industry well and allowed operators to quickly show the world how much better it has gotten.
But as we move to a world of liquid cooling and AI, where PUE gains are slowing, is token efficiency the only metric in town for AI-focused companies?
The death of PUE?
Power usage effectiveness (PUE) – long the ‘north star’ metric for data center efficiency – has its drawbacks, but was always an easy shorthand for the industry to show its gains.
Since it was first proposed by Christian Belady and Chris Malone and promoted by the Green Grid in 2006, the industry average for PUE has dropped from around 2.5 in 2007 to less than 1.4 today. For liquid-cooled new-builds and hyperscale facilities, PUEs are regularly said to be at or even below 1.1.
It’s popular because it is easy to calculate – simply the ratio between total facility power and power consumed by the IT load – and it provides a simple single metric that can be widely understood by non-technical people.
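As a rough illustration of how simple that ratio is, here is a minimal Python sketch. The function name `pue` and both sets of facility figures are hypothetical, chosen only to reproduce the kind of numbers quoted above; they are not measurements from any real site.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT load power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


# Hypothetical facilities; the kW figures are illustrative assumptions only.
legacy = pue(total_facility_kw=2500.0, it_load_kw=1000.0)
modern = pue(total_facility_kw=1100.0, it_load_kw=1000.0)

print(f"legacy PUE: {legacy:.2f}")
print(f"modern PUE: {modern:.2f}")
```

A PUE of 1.0 would mean every watt entering the facility reaches the IT equipment, which is why the closer a site gets to that floor, the less room there is for further gains.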
But there are few gains left to be had for any newly-built facility when it comes to PUE. Any new facility using liquid cooling will likely be around the 1.1 mark. And total-power usage effectiveness (TUE) – a metric designed to better measure data center energy efficiency – is yet to really take off as a meaningful data point at scale.
The next big metric the industry zeroed in on was water usage effectiveness (WUE), a way for data centers to measure their water efficiency. Measured in cubic meters of water per megawatt-hour of energy (m³/MWh), it compares water used for cooling with energy consumed by IT equipment.
As water use became more of an issue within the industry – driven in part by climate change but mostly as a way to show sustainability credentials amid pushback from residents and officials – WUE highlighted how thirsty facilities reliant on evaporative cooling could be in dry climates. Improvements here are still ongoing, but are often stymied by design choices made years ago.
Fully closed-loop systems only rely on a one-time injection of water, however, with the fluid continuously circulating and being reused. In theory, such systems have a WUE of almost zero. While not industry-standard, such closed-loop or otherwise waterless designs are becoming far more common as operators look to get new projects through the planning process amid increased scrutiny.
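The WUE calculation can be sketched in a few lines of Python. The function name `wue` and the annual water and energy figures below are hypothetical assumptions, picked only to contrast an evaporative design with a closed-loop one that needs little more than an initial fill.

```python
def wue(cooling_water_m3: float, it_energy_mwh: float) -> float:
    """Water usage effectiveness in cubic meters per MWh of IT energy."""
    if it_energy_mwh <= 0:
        raise ValueError("IT energy must be positive")
    return cooling_water_m3 / it_energy_mwh


# Hypothetical annual figures; both sites are assumed to run the same IT load.
it_energy_mwh = 10_000.0
evaporative = wue(cooling_water_m3=18_000.0, it_energy_mwh=it_energy_mwh)
closed_loop = wue(cooling_water_m3=50.0, it_energy_mwh=it_energy_mwh)

print(f"evaporative WUE: {evaporative:.3f} m3/MWh")
print(f"closed-loop WUE: {closed_loop:.3f} m3/MWh")
```

Under these assumed numbers the closed-loop site lands within rounding distance of zero, which is the point: once a design gets there, the metric has little left to show.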
So what’s the next metric that matters?
Time for tokens?
Amid an AI boom and a decreased focus on sustainability during a time of right-wing populism, is a new, overly simple metric set to step up for the data center industry to focus on? Enter, tokens.
During a recent event held in Buffalo, New York, Schneider Electric executives referenced tokens dozens of times, highlighting how the likes of hyperscalers, neoclouds, and any enterprise or operator looking to deploy high-end AI hardware are focused on eking the most tokens per watt and/or dollar out of their GPUs.
In AI parlance, tokens are small units of data that can be processed by models like ChatGPT or Claude. These transformer models take inputs such as text, images, audio, or video, and break that data down into tokens for processing – both training and inference. The faster tokens can be processed, the faster models can learn and respond.
When training models reach billions or trillions of tokens and have millions of customers making inference requests, token efficiency can have a large impact on the time and cost (and energy) required to train a model or process large numbers of requests.
Nvidia often talks about the cost of tokens and efficiency gains when promoting its latest hardware. AI firms often charge by the token, so better token efficiency means better margins on all those expensive GPUs and accompanying data center capacity.
Tokens have become the shorthand for how effective models and the hardware running them are at making their owners money. People love a simple metric, and “better token efficiency = more money” is about as simple as it can get.
PUE and WUE tapped out?
During the presentation, Schneider outlined the importance of tokens per watt as a metric (tokens divided by watts consumed, with higher numbers being better) as well as cost per token (total cost of compute, energy etc. divided by number of tokens produced, lower is better).
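A minimal Python sketch of those two ratios follows. It assumes the tokens figure in tokens per watt is a throughput in tokens per second, which the presentation's definition does not spell out; the `Deployment` class, its field names, and every number in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    tokens_per_second: float  # assumed throughput for the tokens-per-watt numerator
    avg_power_watts: float    # average power drawn while producing those tokens
    total_cost_usd: float     # compute, energy etc. over the billing period
    tokens_produced: float    # tokens produced over the same period


def tokens_per_watt(d: Deployment) -> float:
    """Tokens divided by watts consumed; higher is better."""
    return d.tokens_per_second / d.avg_power_watts


def cost_per_token(d: Deployment) -> float:
    """Total cost divided by tokens produced; lower is better."""
    return d.total_cost_usd / d.tokens_produced


# Two hypothetical deployments with made-up figures.
a = Deployment("A", 20_000.0, 10_000.0, 500.0, 1_000_000_000.0)
b = Deployment("B", 45_000.0, 15_000.0, 600.0, 2_000_000_000.0)

for d in (a, b):
    per_million = d.total_cost_usd / (d.tokens_produced / 1_000_000)
    print(f"{d.name}: {tokens_per_watt(d):.1f} tokens/W, ${per_million:.2f} per million tokens")
```

In this made-up comparison, deployment B draws more power and costs more in total, yet comes out ahead on both ratios, which is the kind of result that makes the metrics attractive to operators.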
While a better tokens per watt number means an operator is at least operating its hardware and software efficiently, it doesn’t take into account much else. A data center with a good tokens per watt score powered via cheap natural gas might look good on paper, but it is not exactly environmentally friendly.
But as the industry sees fewer and fewer gains to be had on measures like PUE and WUE, token efficiency might be all that’s left where companies can see meaningful gains.
During its presentation, Schneider also outlined PUE and WUE scores for two theoretical data centers in Dallas, Texas, and Paris, France, highlighting the differences depending on air-cooled versus liquid-cooled and open-loop versus closed-loop water systems.
In all the scenarios, PUE for these theoretical new builds would be 1.15 or below, with most lower than 1.1. And WUE scores for the closed-loop liquid-cooled facilities would essentially be zero, as they only need to be filled once and the fluid replaced very occasionally.
In the real world, there are many more variables, but in Schneider’s theoretical scenario, adopting liquid cooling and closed-loop water systems on the most energy-efficient kit means the only metric that still shifted meaningfully was token efficiency, which varied significantly between scenarios. And if tokens mean dollars, that is the metric many operators will focus on when building out facilities.
“I don’t think anybody has tried,” Manish Kumar, Schneider Electric’s EVP for secure power and data centers, tells DCD when asked about the need for a simple metric that combines tokens with sustainability. “We actually are deeply thinking about it; it’s something we need to look at.”
Kumar notes there has probably been a “changing of the narrative” as the business world lessens focus on sustainability and places more focus on tokens.
“The more you reduce the tokens per watt, the more competitive your business becomes,” he says. “So maybe it will not be reviewed through the lens of only sustainability, but also as a pure economics and competitiveness.”
But he says the company “still thinks” energy efficiency is important in data centers, given the amount of electricity data centers are going to consume.
“I don’t know where we will settle. Tokens per dollar will continue to be a focus area, but sooner or later [something will come].”
Number get bigger
While understanding token efficiency is important, it shouldn’t be all that matters.
Some companies are pushing employees to burn through as many tokens as possible – aka tokenmaxxing – in an effort to either get staff to train systems and maybe find some genuine value, or merely show they are part of the AI boom regardless of the actual benefit. The latter – real-world instances of the ‘number get bigger’ meme – is being played out for dollars at many of the world’s biggest companies.
Nvidia CEO Jensen Huang has previously spoken of his deep concern if engineers aren’t regularly working their way through thousands of dollars’ worth of tokens every year.
Some reports suggest employees at major firms like Amazon and Meta are actively inflating AI token consumption on needless tasks to hit internal usage targets and appear higher on internal leaderboards. DCD has heard of knowledge workers at smaller companies using AI systems to do ‘important’ business tasks, such as checking train routes and generating pictures of their pets, in order to help keep those leaderboard scores where management wants them.
Despite the company being a major AI user, Uber president and COO Andrew Macdonald recently warned that the taxi giant has yet to find a link between higher AI token usage and an increase in useful consumer features. This was days after Uber CTO Praveen Neppalli Naga said the company’s engineers had already blown through its 2026 Claude Code budget by April.
Software firm ServiceNow has also reportedly blown through its annual Claude budget allocation in less than five months. Good news, perhaps, for those that charge by the token, but not so much for budget-constrained enterprises unless prices come down drastically.
The death of PUE has long been foretold, but the old dog keeps on ticking because it tells a simple story quickly. While it definitely helped the industry think about, and get better at, energy efficiency, some argue it was too simple and led to tunnel vision.
Token-based metrics are equally simple. And they can be tied to revenue far more directly than PUE. It’s too early to say, but let’s hope a focus on wider efficiency and sustainability isn’t left by the wayside in a tunnel-visioned pursuit of ever-better token returns and better profit margins.
Sustainability in the age of AI and neoclouds has taken enough of a knock in recent times. The industry quickly accepted natural gas as a primary power source – anyone saying it’s a short-term bridge to hydrogen or nuclear is either lying or overly optimistic – and rumors are circulating that some hyperscalers might drop their 2030 carbon neutrality pledges. Let’s not get over-zealous on producing infinite tokens unsustainably in the short-term hunt for a dollar.
Read the original article: https://www.datacenterdynamics.com/en/analysis/are-tokens-the-only-data-center-metric-that-matter-in-the-age-of-ai/












