The Government Communications Headquarters – better known as GCHQ – is one of the most mysterious government departments in the UK.
The agency, which handles signals intelligence and monitors communications in much the same way as the US’ NSA, has long been dramatized alongside its human intelligence counterparts MI5 and MI6 in film and television.
There is a great deal of intrigue about the agency and the work it does. After all, who doesn’t love a secret?
Among those ‘secrets’ is the digital infrastructure GCHQ relies on to conduct its operations. While unable to get into the nitty-gritty of the exact technology storing the country's confidential information, DCD was able to corner GCHQ’s ex-CTO Gaven Smith at the October DTX event in London and subject him to a light interrogation.
Smith’s sheer joy in his years at GCHQ – a total of 30, with the last seven as chief technology officer – was plain to see. Since leaving, he has held positions as a non-executive director at Beyond Blue Limited and Interrupt Labs, as well as a part-time professorship at the University of Manchester.
“I think it's the best job in the country for an engineer – which is controversial with a lot of other engineers,” he says. “Running the technology half of GCHQ is a massive privilege, and you are conscious that today’s Alan Turing could work in the organization that you are responsible for.”
Turing looms large over GCHQ and the technology it uses. The agency was originally established after the First World War, then known as the Government Code and Cypher School. But it is perhaps best known for the work it did at Bletchley Park during the Second World War.
Dramatized in the 2014 film The Imitation Game, staff including Turing used the Colossus computer – a machine weighing several tons and made up of thousands of vacuum tubes – to analyze vast sets of communications data from the Nazi military and help crack their codes. That work is estimated to have shortened the war by several months.
Since the war was won, the technology underpinning GCHQ has evolved significantly, and the word “computer” barely has the same definition.
Supercomputers and data centers
Today, GCHQ utilizes a mixture of on-premises data centers and cloud services. “GCHQ still has lots of different data centers – the biggest one is at headquarters – and these are big operations,” says Smith.
The headquarters in question is affectionately known as The Doughnut. GCHQ had been housed in two sites on opposite sides of Cheltenham since 1952, but in 1997 decided to redevelop its accommodation, signing a contract with IAS Limited in June 2000 for the new headquarters in an agreement set to last thirty years.
The move to the new location faced significant backlash and the National Audit Office conducted a review of the move in 2003 which found that, while GCHQ had budgeted £40 million ($51.93m) for the technical transition, its estimated cost rose to £450 million ($584.22m) – today that would be worth around £891.65 million ($1.15bn).
The data center that lies under GCHQ has long been the subject of speculation and rumors. While Smith confirmed its existence and said that its data halls are home to supercomputers, he declined to share detailed information on the compute power and hardware that lies in that facility.
“It’s a mixture of things. It has lots of very boring computers – similar to what you’d see in any data center – but there are also supercomputers,” Smith says. “GCHQ does use supercomputers – some of the problems it has to solve require that kind of compute.”
He offers Colossus at Bletchley Park as a comparison: “Essentially, the computing estate does the modern equivalent, plus a bunch of things that help analysts make sense of data,” he explains. “Sometimes those are really simple analytics – searching for something, or sometimes you are looking for patterns in data.”
In a follow-up email to GCHQ from DCD, the agency responded with a blanket “neither confirm nor deny” response to all queries, including regarding the compute capacity that lies in the bowels of the building.
While hard facts are difficult to come by, speculation abounds. In a 2014 book – Shaping British Foreign Defense Policy in the Twentieth Century – R.J. Aldrich, a professor of International Security at the University of Warwick, noted: “The exact size and type of these computers are secret, but GCHQ is rumored to have several machines each with a storage capacity of 25 petabytes (25,000 terabytes) equipped with over 20,000 cores to provide rapid parallel processing. Such computers are required for only a few specialist scientific tasks: simulating complex weather systems, mapping the human genome, designing nuclear weapons, and of course cryptography – the science of making and breaking ciphers.”
That prediction was made over a decade ago. In 2014, the most powerful recorded supercomputer was the Tianhe-2 which had a performance of 33.86 petaflops on the Linpack benchmark. At the time of publication, the number one spot is held by El Capitan, with 1.742 exaflops of performance. With or without confirmation, it’s likely the compute power GCHQ has at its disposal has also grown exponentially in the last 10 years.
The quantity of data that GCHQ handles has also increased. In 2016, the Investigatory Powers Bill – later rebranded as the Investigatory Powers Act 2016 (IPA) – came into force, and made significant changes to the pool of data GCHQ could access, and how.
“GCHQ has access to a lot of data – it's authorized to do that,” Smith says. “There are rules – it's got to be necessary, it’s got to be proportionate, and it’s got to be authorized.”
The Investigatory Powers Act brought together the powers of law enforcement and the security and intelligence agencies to obtain communications and data about communications. Within strict regulatory boundaries, the agencies can effectively pool their resources, though access to this data and communications is, of course, still restricted.
Smith explains that access to that information is subject to a “double lock”: every warranted access is signed off by the Secretary of State and has to be approved by an independent Judicial Commissioner from the Investigatory Powers Commissioner’s Office.
For an interception warrant to be issued, access to the data must be in the interests of national security, the economic well-being of the UK, or the prevention or detection of serious crime, and the data accessed must be proportionate to the need.
According to Smith, even that process in itself can lead to significant amounts of data but “the policy, the ethics, and the legalities are super important to everybody that works there. We’re all acutely aware of our responsibilities. It's an intrusive power.”
Despite this, the IPA – and GCHQ’s past behavior – has been controversial.
In 2013, reports based on documents revealed by whistleblower Edward Snowden emerged that GCHQ, along with the US National Security Agency (NSA), had cracked much of the online encryption used to protect people's personal data, online transactions, and emails.
The agencies were accused of using “covert measures” to shape international encryption standards, using supercomputers to break encryption with “brute force,” and collaborating with technology companies and Internet service providers to insert backdoors. There was allegedly a GCHQ team dedicated to finding access to encrypted traffic on Hotmail, Google, Yahoo, and Facebook.
While these accusations precede the IPA, the IPA in itself has been criticized as enabling more violations of privacy.
Liberty Human Rights group took a case to the UK High Court against the IPA in 2017, arguing that it intruded upon the private life of individuals and interfered with the rights of journalists and lawyers to communicate confidentially with sources and clients.
The court ruled in favor of the IPA, upholding its legality in 2019. However, following Liberty’s appeal in 2022, it was revealed that MI5 had unlawfully mishandled personal data, leading to a separate case. In 2023, the Court of Appeal ruled in Liberty’s favor, finding that sharing data from bulk personal datasets with overseas states was unlawful.
In 2024, some safeguards were added to protect journalists from having confidential journalistic material accessed by state bodies “easily.”
Regardless of the ethics and legality of the act and mass surveillance, when it comes to data storage, the answer is predictably complicated.
“It’s a mixture of things,” says Smith of the agency’s data storage capabilities. “It's a phenomenally complex infrastructure under GCHQ – made up of hundreds of thousands of systems. Some of this is cloud-based, and some are on-prem. It is large scale.”
GCHQ, along with MI5 and MI6, signed up to be an Amazon Web Services (AWS) customer in October 2021 with plans to host “top secret material” on the cloud platform.
At the time of the announcement, the data was said to be held in AWS data centers in the UK, and the arrangement was expected to make internal data sharing easier, including enabling the agencies to search each other's databases faster.
Response to the contract was mixed, with many expressing concern over the advisability of putting such sensitive data into the hands of a US-based private company. At the time, Gus Hosein, executive director of Privacy International, told the Financial Times: “If this contract goes through, Amazon will be positioned as the go-to cloud provider for the world’s intelligence agencies. Amazon has to answer for itself which countries’ security services it would be prepared to work for.”
AWS, which has held contracts with the CIA since 2013, notably scored a huge contract with the Australian Government in 2024 to develop a data center that would host “top secret” data in the country.
Smith notes that moving to the cloud is not a perfect solution in all circumstances. “Lots of organizations are trying to get out [of on-prem], but when you run highly classified infrastructure, you’ve got to put it in secure spaces,” he says.
As for the department’s use of cloud computing, Smith emphasizes the importance of trust in your provider. “When it comes to the use of the public Internet, clearly we are going to follow the National Cyber Security Centre (NCSC) guidance about doing that in the same way any government department does,” he says. “You have to know where your data is, you have to be happy with the end-user licensing, and that tends to mean using known services that you have a degree of trust with – for example, Amazon, Microsoft, and Google.”
Follow-up questions for GCHQ regarding its use of cloud computing were similarly met with “neither confirm nor deny” responses.
Future technologies
GCHQ is already an established user of AI – something the agency has been relatively open about.
“There used to be a thing about ‘how to catch a terrorist,’ and it talks a lot about the analytic processes that go into finding patterns in data and looking for the known unknowns and the unknown unknowns in data. Of course, increasingly, that's about AI,” says Smith.
In 2021, the agency released a paper titled Pioneering a New National Security: The Ethics of Artificial Intelligence, in which it laid out its AI strategy, noting that “an increasing use of AI will be fundamental to GCHQ’s mission of keeping the nation safe.”
According to that strategy, AI is mostly used to deal with large amounts of data and solve “well-defined, narrow problems” that are too time-consuming to be handled by a human alone.
Some of this will be in tackling cyber security breaches through the NCSC. AI will be able to identify malicious software by analyzing activity on networks and devices at scale, identifying patterns, and then updating its “known patterns” of malicious activity.
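The loop described above – match events against known-bad patterns, then fold newly observed patterns back into the signature set – can be sketched loosely as follows. Everything here is hypothetical and invented for illustration (the signatures, event fields, and promotion rule are not GCHQ's or the NCSC's actual tooling):

```python
from collections import Counter

# Hypothetical known-bad traffic signatures (invented for illustration).
KNOWN_BAD = {"beacon:443:60s", "dns-tunnel:long-txt"}

def detect(events, known_bad=KNOWN_BAD, threshold=3):
    """Flag events matching known-bad patterns, then promote patterns
    seen >= threshold times on already-flagged hosts to new signatures."""
    # Step 1: signature match against the current known-bad set.
    flagged = [e for e in events if e["pattern"] in known_bad]
    # Step 2: look at other activity from the hosts that were flagged.
    bad_hosts = {e["host"] for e in flagged}
    co = Counter(e["pattern"] for e in events
                 if e["host"] in bad_hosts and e["pattern"] not in known_bad)
    # Step 3: naively "learn" patterns that co-occur with bad traffic.
    new_sigs = {p for p, n in co.items() if n >= threshold}
    return flagged, known_bad | new_sigs
```

Real systems would weigh far richer features at much larger scale, but the shape – detect on known patterns, then update the pattern set from what detection surfaces – is the same feedback loop the strategy describes.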
What differs in the agency’s use of new and emerging technologies, however, is its level of transparency. The strategy recalls Winston Churchill’s description of the veterans of Bletchley Park as “geese that laid the golden eggs and never cackled.” The work of Bletchley Park remained a secret for decades, but Smith says there is an increasing attitude of openness.
“GCHQ has been open about its use of AI, which I think is a good thing,” he argues. “We should be talking about the things we do so that people understand. One of the things I was responsible for [as CTO] was getting us onto the Internet. So you can see there are around 60 GitHub software development projects, and I think that's important for a bunch of reasons.”
Smith argues that increasing transparency actually improves the code itself: “It’s a sort of retention tool for software developers. You write better code if your peers are going to review it and publish it to the Internet.”
Additionally, if people then download and use that software for other purposes, Smith says it is a way of introducing “cybersecurity and resiliency” where it may have previously been lacking.
This attitude extends beyond AI, and into the world of quantum computing. “One of my first public speeches was about quantum computing,” recalls Smith, adding that it was probably what brought him to public consciousness. “I did a speech at a quantum showcase, and I spoke about what GCHQ is doing. It basically said: we didn’t talk about Colossus for 50 years, we didn’t talk about [encryption algorithm] RSA and [its inventor] Clifford Cocks for 25 years, so let’s talk about quantum now. Let’s admit that quantum is a really important issue for us now – and most important is being quantum safe.”
In October 2024, the UK government opened the National Quantum Computing Centre (NQCC), located in Harwell, Oxfordshire. The site is set to house 12 quantum computers, with the likes of Infleqtion and Rigetti known to be involved.
Quantum is mostly in the remit of the NCSC, which is looking at quantum-safe algorithms that can resist quantum machines powerful enough to crack RSA encryption, which has become the standard method of protecting data.
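To see why a sufficiently powerful quantum machine threatens RSA: the private key falls out immediately once the public modulus is factored, and while factoring is infeasible classically at real key sizes, Shor's algorithm would make it tractable. A toy illustration with the classic textbook modulus n = 3233 (trial division works here only because the numbers are tiny):

```python
from math import isqrt

def factor(n: int) -> tuple[int, int]:
    """Return (p, q) with p <= q, by trial division.
    Instant for a toy modulus; hopeless for a real 2048-bit one."""
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    raise ValueError(f"{n} is prime")

p, q = factor(3233)       # 3233 = 53 * 61 (textbook-RSA modulus)
phi = (p - 1) * (q - 1)   # 3120
d = pow(17, -1, phi)      # private exponent for public e = 17
```

Everything after `factor` is trivial arithmetic, which is the point: the entire security of RSA hangs on the factoring step, and that is precisely the step Shor's algorithm attacks – hence the push toward quantum-safe algorithms now, before such machines exist.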
“Nobody really knows when that will be,” admits Smith. “It might be in five, 10, or 15 years, but it will certainly be in our professional lifetimes. So the most important work GCHQ is doing is to support the cybersecurity effort around quantum.”
Beyond that, the agency is investing in quantum computing for data analysis. Quantum computers, still under development, come in many forms and rely on a variety of technologies. It is not yet clear which type, or types, will become dominant in the market, so GCHQ is hedging its bets.
“There is some early-stage research going on – this is where the National Security Strategy Investment Fund comes in,” Smith says. “There’s quite a lot [of quantum technology] on our doorstep in the UK, which has a really exciting quantum computing ecosystem. But I don’t think anyone has decided which quantum computer is going to be the solution yet, so you’ve got to look at them all.”
Smith recalls that Sir Patrick Vallance, when he was the government's chief scientific adviser, used to call this attitude “optionality.”
“I would call it spread betting,” he says. “Nobody knows which technology will win out, but it matters so much for us to use these capabilities to keep us safe, and organizations like GCHQ have to be at the cutting edge of that to use it and advise the government.
“I just wouldn’t want to pick the wrong horse.”
Read the original article: https://www.datacenterdynamics.com/en/analysis/handling-a-nations-confidential-data/