Alchemy

Part III · The Physics of Thought

III.B — The Room to Run

6 min read · 1,075 words

The Landauer floor established in the previous section is far below current practice—six orders of magnitude below. Seth Lloyd calculated that the ultimate physical limits lie thirty orders of magnitude beyond current systems.1 But these numbers belong to physics, not economics. The relevant ceiling is set by engineering constraints that bind long before thermodynamics does—and by economic constraints that bind before engineering. The question is whether the headroom translates into cheaper computation or merely into more expensive capability.

The more useful benchmark is biological. The human brain performs on the order of 10¹⁶ to 10¹⁸ operations per second while consuming roughly 20 watts—an energy cost per operation on the order of 10⁻¹⁵ to 10⁻¹⁷ joules. A modern GPU performs 10¹⁴ to 10¹⁵ floating-point operations per second while consuming 300 to 700 watts—an energy cost per operation on the order of 10⁻¹² to 10⁻¹¹ joules. The comparison is imprecise; synaptic events and floating-point multiplications are different physical processes. But the order-of-magnitude gap is real: evolution has achieved efficiencies that silicon has not approached, and the brain proves that computation at 10⁻¹⁵ joules per operation is physically realizable.
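The arithmetic behind those figures is a single division: sustained power over throughput. A minimal sketch in Python, using only the power and throughput ranges quoted above (the pairings of range endpoints are illustrative, not measurements):

```python
# Back-of-the-envelope check of the energy-per-operation figures quoted above.
# joules per operation = watts / (operations per second)

def joules_per_op(watts: float, ops_per_second: float) -> float:
    """Energy per operation in joules, given sustained power and throughput."""
    return watts / ops_per_second

# Human brain: ~20 W at 10^16 to 10^18 operations per second.
print(joules_per_op(20, 1e16))   # ~2e-15 J/op
print(joules_per_op(20, 1e18))   # ~2e-17 J/op

# Modern GPU: 300 to 700 W at ~10^14 floating-point operations per second.
print(joules_per_op(300, 1e14))  # ~3e-12 J/op
print(joules_per_op(700, 1e14))  # ~7e-12 J/op
```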

A millionfold gap sounds impossible to close, but it represents only about twenty doublings of efficiency. The historical trajectory delivered roughly one doubling every eighteen months for half a century—a pace that, before it slowed, would have closed the gap in roughly three decades.2 The pace has slowed as transistors approach atomic scales and as the dominant energy costs have shifted from switching logic to moving data—shuttling bits between memory and processor, keeping caches coherent, driving signals across chips and boards. But the headroom exists. There is no known physical barrier preventing silicon or its successors from eventually approaching brain-level efficiency.
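The "twenty doublings" figure follows directly from the size of the gap; a quick check of the arithmetic, under the stated assumption of one doubling every eighteen months:

```python
import math

# A millionfold gap expressed as doublings, and the time those doublings
# would take at the historical pace of one doubling per eighteen months.
gap = 1e6
doublings = math.log2(gap)   # ~19.9 doublings
years = doublings * 1.5      # ~30 years at 18 months per doubling

print(f"{doublings:.1f} doublings, ~{years:.0f} years at the historical pace")
```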

The economic question is whether this headroom translates into abundance.

It does not—at least not automatically. Headroom describes the distance between current practice and physical possibility. It does not describe the relationship between efficiency gains and demand growth. If efficiency improves by a factor of two and demand grows by a factor of ten, total energy consumption rises fivefold even as cost per operation falls.
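The arithmetic is worth making explicit, because it recurs throughout this section: total energy is demand times energy per operation, so efficiency gains and demand growth pull in opposite directions. A minimal sketch with normalized units:

```python
# Total energy = operations demanded * energy per operation.
# A 2x efficiency gain against 10x demand growth still yields a 5x rise.
baseline_ops = 1.0                      # normalized demand
baseline_j_per_op = 1.0                 # normalized energy per operation

new_ops = baseline_ops * 10             # demand grows tenfold
new_j_per_op = baseline_j_per_op / 2    # efficiency doubles

baseline_energy = baseline_ops * baseline_j_per_op
new_energy = new_ops * new_j_per_op

print(new_energy / baseline_energy)     # 5.0: consumption rises fivefold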

The scaling laws that have driven capability improvement in foundation models are empirical regularities, not physical necessities. But so far they have shown no sign of saturating. Larger models, trained on more data with more compute, continue to outperform smaller ones across a wide range of tasks. The capability threshold for competitive relevance keeps rising—last year’s frontier model is this year’s commodity. Each generation of models consumes the efficiency gains of the previous generation and demands more. GPT-3 required approximately 1,300 MWh to train;3 subsequent generations have reportedly required an order of magnitude more, and the next generation will require more still. The cost per token falls; the number of tokens required to reach the frontier rises faster.

This dynamic has a name in other contexts: Jevons paradox. When efficiency improvements reduce the cost of a resource, consumption often increases enough to offset the savings. Coal consumption did not fall when steam engines became more efficient; it rose, because more applications became economical. The same logic applies to computation. As inference becomes cheaper, more tasks become worth automating; as training becomes cheaper, more experiments become worth running; as models become more capable, more ambitious applications become feasible. The efficiency gains are real, but they are absorbed by demand rather than banked as savings.


The physical evidence is already visible. The International Energy Agency estimates that data centers and their transmission networks consumed approximately 460 terawatt-hours of electricity in 2022 and could exceed 1,000 terawatt-hours by 2026, driven in significant part by the growth of AI workloads.4 A thousand terawatt-hours is roughly the annual electricity consumption of Japan. This is not a projection of what might happen if current trends continue; it is a near-term forecast based on infrastructure already under construction or announced.
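For scale, the two endpoints imply a steep compound growth rate. This is arithmetic on the cited figures, not a separate forecast:

```python
# Implied compound annual growth from ~460 TWh in 2022 to ~1,000 TWh in 2026,
# using the IEA figures quoted above.
start_twh, end_twh = 460, 1000
years = 2026 - 2022

growth_rate = (end_twh / start_twh) ** (1 / years) - 1
print(f"~{growth_rate:.0%} compound annual growth")  # roughly 21% per year
```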

The physical constraints detailed in Part II—interconnection queues, transformer lead times, permitting battles—determine who can build capacity and how fast; the gap between announced capacity and deliverable capacity is measured in years, and it is this gap, not the software, that decides who can scale. The algorithms continue to improve. The chips continue to shrink. The cost per operation continues to fall. And total energy consumption continues to rise, because the frontier absorbs the gains faster than the gains accumulate. The room to run is real, but running requires power, and power requires physical infrastructure that cannot be conjured by software.


The point is not that computation will become expensive in absolute terms. It will likely become cheaper per operation for decades to come, as the historical trajectory suggests. The point is that the relationship between efficiency and abundance is not automatic. Information may be copied cheaply; the bits that encode a trained model can be duplicated at negligible cost. But computation is purchased at the price of irreversibility, and the price, while falling, is not zero. The copy is downstream of an original expenditure that cannot be avoided. Every query to a language model, every frame rendered by a video generator, every decision made by an autonomous system requires joules, and those joules must come from somewhere.

This creates a tension that will shape the economics of the emerging regime. On one hand, the declining cost of computation makes intelligence more abundant and more accessible. Tasks that once required human expertise can be performed by machines at a fraction of the cost, and the range of such tasks is expanding. On the other hand, the physical basis of that intelligence—the energy, the chips, the cooling, the infrastructure—remains scarce and subject to constraints that do not yield to Moore’s Law. The cloud is not ethereal; it is grounded in concrete and copper and silicon, and the companies that control that ground will have advantages that do not reduce to software.

The next section turns from efficiency to structure. It introduces a concept that makes the thermodynamic cost of intelligence explicit: the idea that a trained model is not merely a file to be copied but a structure whose value derives from the irreversible search process that produced it. The copy is cheap; the search was not.