Thank you for tuning in to week 210 of the Lindahl Letter. A new edition arrives every Friday. This week the topic under consideration is “AI Is Burning Through Graphics Cards.”
Generational wealth is being invested in data centers for AI. The spending is so pervasive that it shows up on the nightly news, and municipalities are wrestling with the power demands. The clock is ticking on graphics cards being used for AI inference. The current generation of GPUs was never designed to run around the clock under inference loads. These chips were originally built for bursts of rendering, not continuous model execution at scale. What we are seeing now is an industry trying to stretch gaming hardware into a role it was never meant to fill. The result is heat, power consumption, and a countdown set by inevitable wear.
Each graphics card has a limited operational lifespan. These are not like bricks being used to build a house; they are just expensive computer hardware. The more intensive the workloads, the shorter that lifespan becomes. Fans fail, thermal paste dries out, and the silicon itself begins to degrade. Inference tasks, particularly when stacked across large fleets of GPUs, magnify this effect. The relentless pace of AI workloads accelerates the failure curve, turning once-premium cards into temporary consumables. I’m genuinely curious about what will happen to all of them at the end of this cycle. A secondary market does exist for these used devices, and companies like Iron Mountain will help data centers with secure disposal.
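To make that failure-curve idea concrete, here is a minimal sketch in Python using a Weibull survival model, a standard way to describe wear-out. The lifetime parameters are illustrative assumptions, not measured data; the only point is that raising the duty cycle pulls failures forward.

```python
import math

def weibull_survival(t_years, scale, shape=2.0):
    """Fraction of a GPU fleet still healthy at time t.

    Weibull survival: S(t) = exp(-(t / scale)^shape).
    shape > 1 models wear-out (failure rate rises with age).
    """
    return math.exp(-((t_years / scale) ** shape))

# Illustrative assumption: a card that lasts ~6 years under bursty
# gaming loads wears out in ~2.5 years under 24/7 inference.
scenarios = {"bursty rendering": 6.0, "continuous inference": 2.5}

for label, scale in scenarios.items():
    for t in (1, 2, 3):
        alive = weibull_survival(t, scale)
        print(f"{label}: {alive:.0%} of cards surviving at year {t}")
```

Under these toy numbers, roughly half the continuously loaded cards are gone by year two, while the bursty fleet is still largely intact.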
By most reasonable estimates, there are now between 3.5 and 4.5 million NVIDIA data-center GPUs actively deployed in production environments. Hyperscalers such as Meta, Microsoft, and Google each operate hundreds of thousands of units, while smaller data centers fill out the rest of the global total. Each GPU represents a remarkable amount of compute density, but also a constant thermal and economic liability. Even with optimized cooling, sustained inference loads drive high thermal stress and power draw that shorten component life. These systems were never meant to run 24 hours a day, 365 days a year.
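For a sense of scale, here is a back-of-envelope power estimate against those fleet numbers. The 700 watts per GPU roughly matches NVIDIA’s published TDP for an H100 SXM module; the PUE overhead factor is a labeled assumption.

```python
# Back-of-envelope power draw for the deployed fleet described above.
# Assumptions: 700 W sustained per GPU (about NVIDIA's stated TDP for
# an H100 SXM) and a PUE of 1.3 for cooling and facility overhead.
FLEET_LOW, FLEET_HIGH = 3_500_000, 4_500_000
WATTS_PER_GPU = 700
PUE = 1.3  # power usage effectiveness (facility overhead multiplier)

for fleet in (FLEET_LOW, FLEET_HIGH):
    gigawatts = fleet * WATTS_PER_GPU * PUE / 1e9
    terawatt_hours = gigawatts * 24 * 365 / 1000
    print(f"{fleet:,} GPUs -> {gigawatts:.1f} GW continuous, "
          f"~{terawatt_hours:.0f} TWh per year")
```

That works out to roughly 3 to 4 gigawatts of continuous draw, which is exactly the kind of load that has municipalities paying attention.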
Under heavy duty cycles, many GPUs experience significant degradation within one to three years of continuous operation. Warranties often match that window, which reflects a design expectation rather than a coincidence. Silicon aging and persistent thermal cycling both take their toll. Even when the hardware technically survives longer, it becomes economically obsolete as new architectures quickly double efficiency and throughput. The pace of improvement ensures that by 2027 or 2028, most of today’s fleet will be retired, resold, or relegated to low-priority inference tasks. Right now, TSMC would have to fabricate the chips needed to replenish this fleet of GPUs, which would be outrageously expensive. Both NVIDIA’s and TSMC’s manufacturing teams could be looking at a huge impending need for production or a shift to a new type of technology.
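The economic-obsolescence point is just compounding. A minimal sketch, assuming efficiency doubles on a two-year architecture cadence as the paragraph above suggests; that doubling cadence is the assumption doing all the work here.

```python
# If each new architecture roughly doubles inference throughput per
# watt on a ~2-year cadence (assumption), a card's relative operating
# cost per unit of inference grows against the state of the art.
BASE_YEAR = 2025

for year in range(BASE_YEAR, BASE_YEAR + 5):
    generations = (year - BASE_YEAR) / 2
    relative_cost = 2 ** generations  # cost vs. a current-gen card
    print(f"{year}: a {BASE_YEAR} card costs ~{relative_cost:.1f}x "
          "per inference vs. the newest generation")
```

By 2028 a card bought today runs at nearly three times the relative cost per inference, which is why the retirement window lands where it does even for hardware that still powers on.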
That replacement cycle has massive implications. The cost of refreshing millions of GPUs every few years is enormous, and the environmental impact of manufacturing and disposing of that much silicon is even harder to ignore. As AI inference continues to scale, this churn becomes unsustainable. Companies are already exploring purpose-built accelerators, ASICs, and FPGAs that can deliver better efficiency and longer service life. These designs aim to handle continuous inference without the same thermal or aging limitations that plague graphics cards.
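To put a rough dollar figure on that churn, here is a minimal sketch. The unit price is an assumption in line with widely reported H100 street prices, and the refresh period uses the optimistic end of the one-to-three-year window discussed earlier; all inputs are illustrative.

```python
# Annualized cost of refreshing the fleet, under labeled assumptions.
FLEET_SIZE = 4_000_000   # midpoint of the 3.5M-4.5M estimate above
UNIT_COST_USD = 27_500   # assumed blended price per data-center GPU
REFRESH_YEARS = 3        # optimistic end of the degradation window

total = FLEET_SIZE * UNIT_COST_USD
annualized = total / REFRESH_YEARS
print(f"Full refresh: ${total / 1e9:.0f}B; "
      f"annualized over {REFRESH_YEARS} years: ${annualized / 1e9:.0f}B/yr")
```

Even under these charitable assumptions, the industry is looking at something on the order of $100 billion per refresh cycle, before counting the environmental cost of manufacturing and disposal.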
Sustainability will define the next phase of AI infrastructure. The transition away from general-purpose GPUs is underway, but what comes after silicon remains uncertain. Research into photonic computing, quantum processors, and neuromorphic architectures offers glimpses of what a post-GPU world might look like. Each of these alternatives seeks to break free from the limits of traditional chips while extending useful lifespans. The next leap in AI hardware will not be measured by sheer speed, but by how well it can endure the relentless demands of inference at scale.
What’s next for the Lindahl Letter? New editions arrive every Friday. If you are still listening at this point and enjoyed this content, then please take a moment and share it with a friend. If you are new to the Lindahl Letter, then please consider subscribing. Make sure to stay curious, stay informed, and enjoy the week ahead!
Links I’m sharing this week!