What Is an AI Factory, Actually?

It is not a technical term. It is three claims bundled together, and understanding which claim someone is making changes how you think about the next decade of technology.

For fifteen years, the tech industry told us our data lived in “the cloud,” a word chosen, as media scholar Tung-Hui Hu has documented, precisely because it makes infrastructure invisible. The cloud is “silent, in the background, and almost unnoticeable,” Hu writes. “Atmospheric and part of the environment.”

Then, in March 2026, journalist Matteo Wong drove through southwest Memphis to visit what he described as a facility where Elon Musk intends to “build a god.” Thirty-five natural-gas turbines roared behind the white walls of a complex bigger than a dozen football fields. If run at full capacity for a year, this single facility, xAI’s Colossus, would consume as much electricity as 200,000 American homes.

Nobody calls that a cloud.

Jensen Huang, NVIDIA’s CEO, has a different word for it. Since GTC 2022, he has been calling these facilities “AI factories.” The distinction matters because “AI factory” is not a technical term. It is three claims bundled together: a technical claim about what these buildings are, an economic claim about what they produce, and a political claim about who controls them. Understanding which claim someone is making when they use the phrase changes how you think about the next decade of technology.

Where the term comes from

NVIDIA coined the phrase at GTC 2022, timed to the launch of the Hopper H100 GPU. By GTC 2026, Huang had formalized the economics: an AI factory’s revenue is a function of tokens per watt multiplied by available gigawatts. The term existed earlier in a softer form: in 2020, HBS professors Marco Iansiti and Karim Lakhani used “AI factory” in the Harvard Business Review to describe an organizational model, not a building. No formal IEEE or ISO definition exists. The European Union adopted the term for its EuroHPC program, establishing 19 “AI Factories” across Europe, though these are quite different from what NVIDIA means by the phrase.
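Huang’s framing reduces to simple arithmetic, and it is worth seeing the structure of it. The sketch below is illustrative only: the function name and every number in it are hypothetical placeholders; the one thing taken from the keynote is the “tokens per watt times available gigawatts” shape of the formula.

```python
# Sketch of Huang's "revenue = tokens per watt x available gigawatts" framing.
# Every number below is a hypothetical placeholder, not a vendor figure.

SECONDS_PER_YEAR = 365 * 24 * 3600

def annual_token_revenue(tokens_per_joule: float,
                         capacity_gw: float,
                         utilization: float,
                         usd_per_million_tokens: float) -> float:
    """Tokens/year = energy drawn per year x tokens per joule;
    revenue = tokens/year x price per token."""
    watts = capacity_gw * 1e9
    joules_per_year = watts * utilization * SECONDS_PER_YEAR
    tokens_per_year = joules_per_year * tokens_per_joule
    return tokens_per_year / 1e6 * usd_per_million_tokens

# Hypothetical 100 MW facility, 60% utilized, 2 tokens per joule,
# selling output at $0.50 per million tokens:
revenue = annual_token_revenue(2.0, 0.1, 0.6, 0.50)
```

Under these made-up inputs the facility grosses on the order of $2 billion a year, the kind of payback math a multi-billion-dollar build-out has to clear. The point is the structure, not the numbers: revenue scales linearly with both energy efficiency (tokens per watt) and available power (gigawatts), which is exactly why those two quantities anchor the pitch.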

What actually makes it different

Traditionally, a data center is closer to a warehouse. It stores data and serves it back when requested. Performance is measured in uptime and throughput. An AI factory is closer to a production line. It takes in electricity, data, and chips, and produces intelligence: models during training, tokens during inference. The physical differences follow:

                  Data center                AI factory
Purpose           Store and retrieve data    Produce intelligence
Power density     5–15 kW per rack           40–120+ kW per rack
Cooling           Air cooling                Liquid cooling
Networking        Standard Ethernet          InfiniBand, low-latency fabric

These are not cosmetic differences. Liquid cooling exists because GPUs generate so much heat at these power densities that air cannot remove it fast enough. InfiniBand exists because training a large model requires thousands of GPUs to communicate at microsecond-level latency to stay synchronized. The facility is overwhelmingly optimized for a single purpose: converting electricity into intelligence as efficiently as possible.
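A first-order estimate shows why air gives out at these densities. The calculation below is a textbook sensible-heat estimate, not a data-center design rule; the physical constants are standard values, the rack wattages come from the table above, and the 15 K temperature rise is an assumed figure for illustration.

```python
# Why air cooling fails at AI-factory rack densities: the airflow needed
# to carry away rack heat at a given inlet-to-outlet temperature rise.
# Standard sensible-heat estimate; constants are textbook air properties.

AIR_DENSITY = 1.2           # kg/m^3, near room temperature
AIR_HEAT_CAPACITY = 1005.0  # J/(kg*K)

def airflow_m3_per_s(rack_kw: float, delta_t_k: float = 15.0) -> float:
    """Volumetric airflow needed to remove rack_kw of heat at a
    delta_t_k rise between cold-aisle intake and hot-aisle exhaust."""
    mass_flow = rack_kw * 1000 / (AIR_HEAT_CAPACITY * delta_t_k)  # kg/s
    return mass_flow / AIR_DENSITY

legacy_rack = airflow_m3_per_s(10)    # a 10 kW data-center rack
ai_rack = airflow_m3_per_s(100)       # a 100 kW AI-factory rack
```

A 10 kW rack needs roughly half a cubic meter of air per second; a 100 kW rack needs ten times that, through the same physical footprint. Water carries heat orders of magnitude more effectively per unit volume, which is why the industry moved the coolant into the rack rather than the air around it.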

The economic claim

When Huang says “compute equals revenues, compute equals GDP,” he is making a specific argument: that the output of AI factories behaves like an industrial commodity. The numbers suggest scale: a 100-megawatt AI factory costs roughly $4 billion to build. NVIDIA’s Blackwell B200 chips sell for $30,000 to $40,000 each. The average enterprise AI budget has grown from $1.2 million per year in 2024 to $7 million in 2026, with inference accounting for 85% of that spending.

But here is where the factory metaphor strains. In a traditional commodity market, oversupply drives prices down. More steel means cheaper steel. Tokens do not work this way. A token from Claude Opus costs roughly sixty times more than a token from DeepSeek. Same unit of measurement, wildly different value. The value is not in the token; it is in the capability of the model that produced it. Tokens are less like steel and more like wine: same volume, completely different product.
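The spread follows directly from the list prices cited in the notes, roughly $25 and $0.42 per million output tokens. Treat these as a snapshot; published rates change often.

```python
# Price spread between a frontier model and a budget model, per the
# per-million-output-token rates cited in the notes (snapshot figures).
opus_usd_per_m = 25.00      # Claude Opus, $ per million output tokens
deepseek_usd_per_m = 0.42   # DeepSeek V3, $ per million output tokens

ratio = opus_usd_per_m / deepseek_usd_per_m  # same unit, ~60x the price
```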

This means AI factories do not have standardized output. Two facilities with identical hardware running different models produce tokens of different value. The quality, reliability, and capability of a token depend entirely on which model produced it, not which factory made it. The factory metaphor, which implies uniform, commoditized production, obscures this.

The political claim

This is the claim that nobody states explicitly but that the numbers suggest.

The United States holds an estimated 75% of tracked global AI compute capacity. China holds about 15%. Europe, home to ASML, currently the sole supplier of the machines required to manufacture advanced chips, has roughly 5%. A single US hyperscaler campus can rival the EU’s entire sovereign supercomputing investment through 2027.

The chip supply chain is even more concentrated. TSMC in Taiwan manufactures the vast majority of the world’s most advanced AI chips. ASML in the Netherlands makes the lithography machines needed to produce them. NVIDIA designs the dominant GPU architectures. This three-node dependency, Netherlands, Taiwan, United States, largely determines who can build AI factories and where.

NVIDIA uses “AI factory” to mean: privately owned, commercially operated, producing tokens for revenue. The EU uses it to mean: publicly funded, sovereignty tool, ensuring national AI capability. Same term, opposite political valence.

What the metaphor reveals, and where it breaks

Where “cloud” hides materiality, “factory” displays it. A factory has inputs, outputs, costs, waste, and owners. It sits on land, draws from the grid, and heats the air. By choosing the word “factory,” the industry, perhaps inadvertently, made its infrastructure visible again. And visibility is the precondition for accountability.

But the metaphor has limits. A factory implies standardized output, predictable economics, and a stable production process. AI factories have none of these. Their output varies by model. Their economics are subsidized: OpenAI generated $3.7 billion in revenue in 2024 while spending roughly $8.7 billion, about $2.35 for every dollar earned. Their hardware loses its competitive edge for frontier training within 18 to 24 months as new architectures arrive, though older GPUs can be cascaded to inference workloads.
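The loss ratio follows directly from the 2024 figures in the text:

```python
# Back-of-envelope check on the subsidy claim, using the OpenAI 2024
# figures cited in the text: $3.7B revenue against roughly $8.7B spent.
revenue_bn = 3.7   # 2024 revenue, $B
spend_bn = 8.7     # approximate total 2024 spend, $B

spend_per_dollar = spend_bn / revenue_bn  # dollars spent per dollar earned
```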

And not every workload needs a factory. The push toward smaller, efficient open-weight models that run on consumer hardware represents a counter-trend worth watching.

The question this leaves open

The term “AI factory” is just over four years old. In 2026 alone, hyperscaler capital expenditure on AI infrastructure is projected to exceed $600 billion, rivaling the inflation-adjusted cost of the entire US interstate highway system built over four decades.

The spending is real. The infrastructure is real. But the framework for thinking about it has not caught up. “AI factory” is useful shorthand, but it papers over the hard questions: who owns the output, who sets the price, who decides access, and what happens when a country’s AI capability depends entirely on infrastructure it does not control.

These are not utility questions or commodity questions. They are dependency questions. And for anyone building a business, a product, or a country’s technological future on AI, understanding what an “AI factory” actually is, not the marketing term, but the economic and political reality underneath it, is the starting point for answering them.

Notes

1. Tung-Hui Hu, "A Prehistory of the Cloud" (MIT Press, 2015).
2. Matteo Wong, "Inside the Dirty, Dystopian World of AI Data Centers," The Atlantic, April 2026 issue (published online March 2026).
3. NVIDIA GTC 2022 keynote blog, "Turning Data Centers into AI Factories."
4. Jensen Huang, GTC 2026 keynote; Computerworld, "Huang talks up tokenomics," March 2026.
5. Iansiti & Lakhani, Harvard Business Review, 2020.
6. EuroHPC AI Factories page; EU Council press release, January 16, 2026.
7. Semianalysis; JLL data center cost guides; Tom's Hardware, "The data center cooling state of play," 2025.
9. Semianalysis GPU price index and cost modeling.
10. CNBC, Jensen Huang on B200 pricing, March 2024.
11. AnalyticsWeek, "Inference Economics," 2026; Oplexa, "AI Inference Cost Crisis 2026."
12. Claude Opus at $25/M output tokens; DeepSeek V3 at $0.42/M. Ratio ~60x.
13. Epoch AI GPU cluster tracker (74.5% US, 14.1% China, 4.8% EU by cluster performance), late 2025 estimates. Dataset covers 10–20% of total global AI compute.
14. Microsoft Wisconsin: $7B campus investment. EuroHPC: ~€10B including member state co-funding through 2027.
15. Chris Miller, "Chip War" (2022); TSMC market share from Reuters; ASML monopoly per ASML filings.
16. OpenAI 2024 financials: $3.7B revenue, ~$5B losses. The Information, Fortune.
17. Goldman Sachs AI spending projections; FHWA interstate highway cost ~$634B in 2024 dollars.