How AI Datacenters Eat the World

[00:00:00] Speaker 1: There's a seismic shift happening in the datacenter industry. A shift that's so fast and of such proportion, it's not only changing datacenters, but entire industries. And you don't have to take my word for it, let me show you. This is Temple, a town about one hour outside of Austin, Texas. But we are not here for the city and its historical Santa Fe Railroad Depot. We are here to look at a random field in the industrial area of town. To be precise, right now you are looking at pictures back from August of 2021. Fast forward one year to August of 2022 and the entire field is gone, replaced by a giant construction site. It's clear that Meta, the company that bought the land, has big plans. And they mean business. Just four months later, now in December of 2022, we can see that the construction of a datacenter is well underway. But then something strange happens. Another five months later, in April of 2023, our curious field in Temple looks like this. All the previous construction gone, razed to the ground. Meta just deleted their entire datacenter halfway through its construction. An estimated $70 million just gone. Wasted, it seems. But why? Was there a problem during construction? Or maybe they lacked the proper permits to continue? The truth is, none of that has anything to do with it. In this video, we will not only figure out why Meta made such a radical decision, but we will take a look at how a modern datacenter works and explore why the entire industry is changing so rapidly. Because if this trend continues, AI datacenters will start to eat the world. So, what happened in Temple? In order to understand what's going on in our field in Texas, and the entire datacenter industry in general, we have to understand how a datacenter works. Because a modern datacenter is much more than just a shed with some computers inside. But for a long time, that's basically what a datacenter was. 
Early datacenters were often located inside business properties, where a few rooms, for example in the basement, were converted to server rooms housing the IT equipment. With the rise of the internet, datacenters became more and more important, and they grew in size. Gone were the days of being an afterthought in a basement somewhere. Datacenters became massive construction projects, and fueled the rise of Web 2.0. They basically turned into the factories of internet companies, such as Google, Microsoft, Meta, or Amazon. Their primary function was to host and distribute data, thus the name, datacenter. Right now you're watching a YouTube video, which means it's stored inside a Google datacenter somewhere. And that somewhere is hopefully not too far from you, because you want fast and uninterrupted access. When it comes to content like YouTube, Netflix, or basically anything else in the cloud that you want to access, a datacenter not only has to provide a massive amount of data storage, but also a lot of network bandwidth and good latency. And that's where location comes into play. Datacenters, as we know them, are more location-sensitive than you might think. Great examples of this are communities like Ashburn or Sterling, which are located right next to Dulles International Airport near Washington, D.C. This small area contains a lot of datacenters because it combines a prime location, close to major population centers, with massive networking infrastructure that grew each time a new datacenter was built there. It's a perfect networking hub. If you are providing cloud services or are streaming content to your customers, that's where you want your datacenter to be located. The bigger the fiber network and the closer to your customer's physical location, the better for a traditional datacenter. But all of this is changing with AI, which is difficult to see at first, because from the outside, an AI datacenter looks very much like a traditional one. 
It's a large, mostly flat, storage-type building with power and cooling infrastructure surrounding it. But that's quite literally where the similarities end. If all you take away from this video is that AI datacenters are nothing like traditional datacenters, it's already a win. Calling it an AI datacenter might even be a bit misleading, but it has become the established name. In my opinion, a more fitting name would be AI supercomputer, because that's what it actually is. Let me explain. From a high-level overview, a datacenter has four main components. Compute, connectivity, cooling, and power. If we use these four areas to compare a traditional to an AI datacenter, the differences quickly become apparent, which brings us one step closer to solving our Temple, Texas mystery. Let's start with connectivity, because we just talked about how important location is for a traditional datacenter. But location literally doesn't matter for an AI datacenter, at least not in the way it does for a traditional one. There are two things an AI datacenter can be used for. First, it can train a large language model, which is simply called training. And second, it can use that pre-trained model to generate output, which is called running inference. A training cluster is more or less a closed system. It literally doesn't matter where you place it, at least not in the sense of being close to customers, because there are no customers accessing data. It still has networking, though, and there are efforts to connect large training facilities with each other over massive fiber lines, with the goal of conducting large-scale training runs across multiple AI datacenters. But that's not the same type of network access and cross-provider routing a traditional datacenter requires. But what about inference? If you are asking ChatGPT a question, you directly communicate with the datacenter that runs the inference. 
While that is true, inference also doesn't require the same networking as a traditional datacenter because it's not latency-sensitive. The compute part, basically calculating the answer, can take multiple seconds. Even if you add 500 milliseconds of latency on top, which is a lot, it doesn't change the experience. A chatbot is not a latency-sensitive application as long as it's limited by compute. It also doesn't require a lot of bandwidth. Even considering new applications like image and video generation, the compute times still outweigh any network connection when it comes to response time. This might change in the future once AI becomes more responsive. But for now, neither training nor inference has strong latency or bandwidth requirements. At least not consumer-facing. Netflix streaming thousands and thousands of 4K movies at the same time is in a whole different ballpark. Just as a video call or a video game has much tighter latency requirements. For AI datacenters, it's not an important factor. And the differences only get bigger when we look at compute. As I've said, AI datacenters are actually more like supercomputers. Their only goal is to deliver as much computational performance as efficiently as possible. And in order to increase compute efficiency for AI workloads, you have to increase density. Which starts at the chip level. If we look at the number one provider of AI compute, NVIDIA, we can see that ever since Volta, NVIDIA's first Tensor Core GPU, the performance and power consumption of each GPU generation have skyrocketed. While Volta had an almost tame TDP of only 250 watts, Ampere, its successor, raised it to 400 watts. Next, Hopper increased the TDP to 700 watts and NVIDIA's newest generation, Blackwell, is reaching 1,000 watts for a single GPU. A GB200 Superchip, which combines two Blackwell GPUs with an NVIDIA Grace CPU, has a whopping 2,700 watt TDP for a single board. And this trend will continue. 
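The per-generation TDP figures and the latency argument above can be put into numbers. The following is a minimal sketch, not anything shown in the video: the TDP values are the ones quoted in the transcript, while the 3-second compute time is an assumed stand-in for "multiple seconds" and the function name network_share is made up for illustration.

```python
# TDP figures as quoted in the transcript (watts per GPU).
tdp_watts = {"Volta": 250, "Ampere": 400, "Hopper": 700, "Blackwell": 1000}

# Growth factor from each generation to the next.
generations = list(tdp_watts)
for prev, cur in zip(generations, generations[1:]):
    growth = tdp_watts[cur] / tdp_watts[prev]
    print(f"{prev} -> {cur}: {growth:.2f}x")

# Growth across all four generations.
overall_growth = tdp_watts["Blackwell"] / tdp_watts["Volta"]
print(f"Volta -> Blackwell: {overall_growth:.1f}x")


def network_share(compute_s: float, network_s: float) -> float:
    """Fraction of the total response time that is network latency."""
    return network_s / (compute_s + network_s)


# Assumption: 3 s of compute stands in for "multiple seconds" of inference.
share = network_share(compute_s=3.0, network_s=0.5)
print(f"Network share of response time: {share:.0%}")
```

Under these assumptions, even the generous 500 milliseconds of extra latency accounts for only about one seventh of the total response time, which is the sense in which a compute-limited chatbot is not latency-sensitive.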
NVIDIA already announced GPUs that consist of up to four reticle-sized chips. That's twice as much silicon as Blackwell. And even with the increased efficiency of more advanced process nodes in mind, the first 2,000 watt GPU isn't too far away. The compute density is massively increasing at the chip level. But it doesn't stop there. Not only is each new chip offering much-increased compute performance, the number of GPUs in a single server rack is increasing at the same time. When you are building a modern AI data center, you have to build for efficiency. Every watt not spent on actual compute is wasted. And while optical interconnects are great over long distances and honestly, there is no other option over a certain distance, they need optical transceivers and retimers which require a lot of power. For that reason, you want to use as much copper as possible. NVIDIA's GB200 NVL72 compute rack contains over 5,000 wires and 2 miles of copper. If NVIDIA had used optics instead, it would have consumed 20,000 watts more than the current copper-based NVLink solution. But copper is really only viable at rack scale. Even within a single data center, you have to switch to optics at some point. That's why you want as many GPUs in a single rack as possible, so you can connect as many of them as possible using copper. Compute density is the holy grail when it comes to AI data centers. You want your GPUs to use as much silicon as possible, have as many GPUs on a single board as possible, and as many of these in a single rack as possible. That's why the power requirements for a single rack are continuing to grow. The best way to see just how different traditional and AI data centers are is to look at how much compute and, as a direct effect, how much energy a single rack in each of these data center types is using. 
If you pick a random server rack in a traditional data center outside of hyperscalers like Google, Meta, and so on, you'd be hard-pressed to find one that uses more than 10 kW. The typical rack power consumption is in the range of 3 to maybe 7 kW. Everything above 10 kW per rack is already considered high-performance for a traditional data center. And while hyperscalers are building racks in the 15 to 20 kW range, even that doesn't compare to racks used for AI compute. The NVIDIA GB200 NVL72 we just talked about, which is NVIDIA's fastest rack-scale solution, has four power shelves that provide 33 kW each. That's a total of 132 kW for a single rack, more than 10 times what would be considered a high-performance setup in a traditional data center, and 30 to 40 times the rack power of a standard run-of-the-mill server rack. We aren't talking about small differences here. It's night and day. I wasn't kidding when I said AI data center is a somewhat misleading name, because these numbers even trump supercomputers. If it were possible, AI hyperscalers would build a gigawatt rack, because density is king. As you can imagine, this massive increase in compute and power density also directly affects cooling. Traditional data centers with a lower critical IT power need smaller cooling solutions. Makes sense. Until recently, almost all data centers were air-cooled. But this is changing. Data centers that run AI compute are quickly transitioning to liquid cooling. There are three specific reasons for this, all related to density. This is AMD's MI300X AI accelerator. One GPU ready to be installed in a server blade. But about 90% of its volume is taken up by a massive heatsink. The small PCB below all that metal is the actual GPU. Unlike consumer GPUs, server GPUs don't have individual fans. They just come with massive heatsinks and are cooled by industrial, high-performance fans that cool every component on a single blade. 
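The rack power comparison above is simple arithmetic and can be checked directly. This is a small sketch using only the figures quoted in the transcript; the variable names are made up for illustration.

```python
# Figures quoted in the transcript; names are illustrative only.
POWER_SHELVES = 4
KW_PER_SHELF = 33

# Total power available to one GB200 NVL72 rack.
nvl72_rack_kw = POWER_SHELVES * KW_PER_SHELF

# Traditional data center reference points, in kW per rack.
typical_rack_kw_low, typical_rack_kw_high = 3, 7
high_performance_rack_kw = 10

ratio_vs_high_performance = nvl72_rack_kw / high_performance_rack_kw
ratio_vs_typical_high = nvl72_rack_kw / typical_rack_kw_high
ratio_vs_typical_low = nvl72_rack_kw / typical_rack_kw_low

print(f"NVL72 rack power: {nvl72_rack_kw} kW")
print(f"vs. 10 kW high-performance rack: {ratio_vs_high_performance:.1f}x")
print(f"vs. typical 3-7 kW rack: {ratio_vs_typical_high:.0f}x to "
      f"{ratio_vs_typical_low:.0f}x")
```

With these inputs the AI rack draws about 13 times the power of a 10 kW high-performance rack, and somewhere between roughly 19 and 44 times that of a typical 3 to 7 kW rack. The "30 to 40 times" quoted above corresponds to a standard rack of about 3.3 to 4.4 kW.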
Switching to liquid cooling drastically reduces the physical footprint of each GPU because a liquid cooler is much more compact than the massive heatsinks required for air cooling. It quite literally increases density because it allows you to pack more hardware into a single server blade and rack because less space is wasted on heatsinks. Of course, liquid cooling requires a lot of additional infrastructure in and around the data center. But that's outside of the rack scale where density doesn't matter anymore. The second aspect is cooling performance. A liquid coolant can absorb about 4,000 times more energy per unit of volume than air. If you have to remove a lot of heat because you have to cool lots of GPUs in a very dense setup, it's the only option. Super high-density designs are only feasible with liquid cooling. And while there are some Blackwell implementations that still use air cooling, next-gen AI accelerators will almost exclusively use liquid cooling. Google, for example, switched to liquid cooling for their in-house high-performance TPUs a long time ago. But there's a third, sometimes overlooked aspect to liquid cooling. Running silicon at lower temperatures not only increases its lifespan, but also increases energy efficiency. If you just run a single GPU, there's not much to it. But if you run 100,000 GPUs, the savings add up. And that energy can be used for more important things, like more compute. Of course, liquid cooling is something you have to plan from the very beginning. An air-cooled data center is designed very differently from a liquid-cooled one. It's not your average desktop PC where you can just upgrade an air cooler to a water cooler. It completely changes the layout of the data center. You need to include water pipes from the rack to the building level and install massive cooling towers. 
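The "about 4,000 times" figure above can be sanity-checked by comparing how much heat one cubic meter of water and one cubic meter of air absorb per degree of warming. This is a rough sketch, not a calculation from the video: the density and specific heat values are approximate room-temperature textbook numbers assumed here, and real coolants and operating conditions differ.

```python
# Approximate room-temperature properties (assumed textbook values).
WATER_DENSITY = 997.0   # kg/m^3
WATER_CP = 4180.0       # J/(kg*K), specific heat capacity
AIR_DENSITY = 1.2       # kg/m^3 at roughly sea-level pressure
AIR_CP = 1005.0         # J/(kg*K)


def volumetric_heat_capacity(density: float, specific_heat: float) -> float:
    """Heat absorbed per cubic meter per kelvin of warming, in J/(m^3*K)."""
    return density * specific_heat


water = volumetric_heat_capacity(WATER_DENSITY, WATER_CP)
air = volumetric_heat_capacity(AIR_DENSITY, AIR_CP)
ratio = water / air

print(f"Water: {water:,.0f} J/(m^3*K)")
print(f"Air:   {air:,.0f} J/(m^3*K)")
print(f"Water absorbs roughly {ratio:,.0f}x more heat per unit volume")
```

With these assumed values the ratio comes out at roughly 3,500, the same order of magnitude as the figure quoted above.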
But not only is it worth it when you strive for the highest amount of compute density, there's simply no viable alternative if you want to stay competitive. Now that we've covered connectivity, compute, and cooling, let's talk about power. But not at the rack level, we've already discussed that. I'm talking about power at the level of the entire facility, which has become the number one measure when we talk about data center size. It's not the actual size of the building we're talking about. It's the total power capacity of the data center, also called critical IT power. Traditional retail data centers often provide less than 10 megawatts of critical IT power. Even the larger wholesale data centers, like the cluster of data centers around Dulles airport near Washington, D.C., are only in the 10 to 30 megawatt range. Modern hyperscaler data centers from the likes of Microsoft, Google, Amazon, and Meta, and I'm still talking about traditional data centers that actually host data, can reach 40 to 100 megawatts of critical IT power. But they all pale in comparison to the critical IT power of AI data centers. There are multiple AI data centers with critical IT power of over 200 megawatts. Microsoft, for example, operates two 300-megawatt AI data centers for OpenAI. And this is just the beginning. AI campuses with one gigawatt of critical IT power are already under construction. All of this is further amplified by the fact that while a traditional data center has fluctuating power demand based on usage patterns and rarely runs at full power, AI data centers are more or less constantly running at close to full load. They not only have this massive critical IT power, they actually use it. With the power requirements of AI data centers, we are talking about direct access to major high-voltage power lines. And because server racks don't run on high voltage, which is over 100 kV, you need transformers to step down the voltage. 
First to medium voltage and then to low voltage, which for data centers is usually 415 volts. With a power consumption that rivals large cities, these massive AI data centers require a lot of transformers. So many, in fact, that the order books already have a backlog. Transformers, which previously were mostly bought by governments to serve large cities and industrial centers, are suddenly in high demand because of AI. Another interesting difference between traditional and AI data centers is the idea of backup power. For a traditional data center, loss of power is a major critical failure point. That's why they need a system that ensures uninterrupted power supply. For the very short term, that means batteries, which have to bridge the time until the emergency generators come online. But AI data centers have such a massive power demand, they need a large number of generators, which not only have to be bought, but also require the proper permits. This not only adds a lot of additional costs, but it takes time to set up. And since time to market is crucial in the AI race, AI data centers often have a very limited UPS system. In this case, the data center just stops working if the main power source fails. Which, funny enough, isn't that big of a deal for training runs, as the GPUs already introduce somewhat frequent failures. When the power goes out, you can simply continue the training run when it comes back. And I'm really just scratching the surface here. No matter if compute, connectivity, cooling or power, there's so much more depth to it. Most of what we just covered is based on the amazing Datacenter Anatomy series from SemiAnalysis, which covers every aspect of a modern AI data center in great detail. If you want to know how the power stage of a massive AI data center really works, or what's required to make a data center ready for liquid cooling, I highly recommend that you check out the articles I've linked in the video description below. 
I've been a SemiAnalysis subscriber since long before we started collaborations like this one, and it's definitely worth the money. But the secret is that like 80% of a SemiAnalysis article is not behind a paywall. That's how they got me. For every topic I researched, I found a super detailed article from SemiAnalysis that actually explained how the industry works in a digestible way. Even without a subscription, it's a top-tier resource. And with a subscription, it only gets better. I mean it in the sincerest way. If you are even somewhat interested in the semiconductor space and AI data centers, check out the links and read the articles. It's so worth it. Now that we've learned all about AI data centers and how they are nothing like traditional data centers, let's get back to our mystery field in Temple, Texas. Why did Meta start building a new data center only to tear it down halfway through construction? As a quick refresher, construction started in mid-2022 and progressed until at least the end of 2022. But by April of 2023, the entire construction site was flattened. What could have happened during that time which led to Meta making such a radical move? The brainiacs among you might have already figured out that this timeline aligns almost perfectly with the release of ChatGPT in November of 2022. So, is that our answer? Did Meta start building a traditional data center and realize halfway through the construction that it was outdated? Well, kind of. But it's even more radical than that. The initial construction site was for Meta's tried-and-tested H-type data center. It's called that because the final shape of the data center looks like the letter H. If we actually take a closer look at the satellite images, we can see that the initial build would have looked like an H if it had been finished. Many of these H-type data centers are used in a more traditional data center role, filled with CPUs and hard drives. 
But while they were designed for maximum energy efficiency, Meta's H-type data centers are already capable of running GPUs. Meta has a massive NVIDIA Hopper-based AI cluster that combines 100,000 H100 GPUs across several of the same H-type data centers. So Meta was already building an AI-capable campus in Temple. But it wasn't offering a high enough energy density to stay competitive. That's how fast the industry is moving. Even AI data centers that are built on a very fast timeline can become outdated during construction. The story has a happy ending, at least for Meta. If we look at the most recent satellite images from 2025, we can see that our little field in Temple now houses not one, but two AI data centers. This is Meta's new high-density design, with each building providing about 85 MW for a combined critical IT power of 170 MW. The new design also has the added benefit of supporting liquid cooling, which makes the high-density layout possible in the first place. The older H design would have only supported a total of 60 MW. Too little in today's AI data center race. And it's only the beginning. This is Three Mile Island, a nuclear power plant located on the Susquehanna River outside of Harrisburg, Pennsylvania. In March of 1979, the Three Mile Island power plant became infamous when its TMI-2 reactor had a critical failure and suffered a partial meltdown. To this day, it is still the most severe nuclear accident in United States history. The core of TMI-2 has been removed from the site and the second reactor, TMI-1, was shut down in 2019 because it was operating at a loss. The entire site has since been marked for decommissioning. But that changed. Last year, in 2024, Microsoft announced a deal with Constellation Energy, the owner of the site, to restart the still-working TMI-1 reactor. The nuclear power plant is expected to resume operation in 2027 with all energy going to Microsoft for the next 20 years. 
And I'm sure you already know what Microsoft needs all that power for: to power the next generation of AI data centers. And if you think this is an extreme example, think again. Not far from Three Mile Island, only about two hours by car, is the Susquehanna Steam Electric Station, a nuclear power plant with about 2,500 megawatts of output. In 2023, Talen Energy, the operator of the power plant, started to build a massive on-site data center, which was acquired by Amazon AWS in 2024 for about $650 million. And there's only one reason to place a data center right next to a nuclear power plant: to meet the massive energy demand of AI. We can see similar moves happening across the entire industry. Meta not only rebuilt the data center on our field in Temple, they are starting to place new AI data centers in tents because it reduces construction time. And aside from compute density, being fast is important in the AI race. Of Meta's top two AI locations, the first is Prometheus in Ohio, an already existing AI cluster powered by gas turbines that's supposed to scale to over 1 gigawatt in the next year. But number one is the Hyperion supercluster, which is supposed to reach a truly impossible scale. By 2030, the site located in Louisiana is supposed to reach a combined critical IT power of 2 gigawatts with room to go to 5 gigawatts. For comparison, the country of Germany has an average power usage of about 60 gigawatts. CoreWeave, a hyperscaler, acquired and retrofitted an old crypto mining data center in Denton, Texas that was previously used to mine Bitcoin. If we take a look at satellite images, we can see a large cluster of buildings in this location. But one is not like the others. The center of the site is a massive gas-fired power plant that directly supplies power to the AI data centers. Elon Musk's xAI cluster is the largest yet, with 150,000 NVIDIA GB200. 
Data centers are growing so quickly, they have to be fed with mobile generators because the main power sources take too long to get online. Everyone is scrambling as fast as they can. What we are seeing right now is only the beginning. 300 megawatt clusters might seem big in 2025, but the first gigawatt clusters will come online next year. Right now, 200,000 GPUs are a lot for an AI cluster, but there are already plans for a million GPUs. And these won't be the same Hopper or Blackwell generations as today. Those will be upcoming GPU generations with even higher TDP numbers. The simple fact is that it doesn't matter if you believe in AGI or not. All that matters is that the major players very clearly believe it's a race to AGI and whoever gets there first takes the entire cake. And because that cake is worth trillions of dollars, they are willing to do everything in their power to get there first. And that quite literally requires power. A lot of power. Google has announced that it's funding the construction of three advanced nuclear power plants. It won't be long until most major hyperscalers will be major players in the energy business, including owning and operating multiple nuclear power plants. Next-gen AI data centers not only have nothing in common with traditional data centers, soon individual data center campuses will surpass even the power demand of megacities and huge industrial parks. And the largest AI clusters are adding power demand which rivals that of industrial nations. It's not slowing down. The race for AGI is not only about compute, it's about power, both literally and figuratively. If this trend continues, AI will become the number one consumer of energy. There is so much going on in the data center industry, it's almost impossible to follow all developments. Unlike supercomputers that are eager to get listed in the TOP500 to show off what they achieved, AI data centers are much more private. 
The AI race isn't happening out in the open, at least outside of PR announcements, if you don't know where to look. Hyperscalers don't want you or the competition to know how much actual compute they have, how much they will add over the next months and years and how competitive or rather how dense their AI data centers are and how much power all their data centers consume. But then how do we know about all these projects? How do we know about their critical IT power? How many GPUs they run? How efficient they are? And which power source they use? The answer is a combination of a large knowledge base, lots of high-quality research and actually spending the money on high-resolution satellite images. And I don't mean the kind on Google Earth that maybe gets a low-res update every half year. I mean professional satellite images. I'm not describing a fantasy of mine here. That's actually what the brainiacs of the SemiAnalysis data center team are doing. It might be a bit insane, but SemiAnalysis is tracking over 5,000 data centers worldwide. And by tracking, I don't mean a simple Excel sheet with a name and an address. I'm talking about the true Sherlock Holmes stuff. The data center model not only tracks new construction projects, but also existing data centers. High-res satellite images are analyzed in detail. And because the power stages, generators, and cooling infrastructure are visible, the data center team can actually create very detailed insights for each data center. Of course, that only works if you know what to look for. But when you do, and the SemiAnalysis data center team certainly does, a data center is like an open book. I don't think there's anything that comes even close in terms of coverage and insights to what the AI data center model from SemiAnalysis offers. If you are working in or with the industry and are interested in a highly detailed overview of the current AI data center market, you have to check out the data center model. 
Not only is it really cool, it delivers the most extensive insights available into the fast-paced race to AGI. ChatGPT was released in November of 2022, less than three years ago. Ever since, it feels like everything is speeding up. The race for AGI has created an insatiable demand for AI compute that only seems to accelerate. The data center industry is now more focused on building AI supercomputers than actual data centers. And with it comes a massive demand for power. That not only means more transformers, more generators and liquid cooling, but energy generation is more and more a focus point of hyperscalers and big tech companies. From the launch of ChatGPT to next year, AI compute will add an estimated 40 to 50 gigawatts of global power demand. These are numbers comparable to the average use of entire countries like France and Germany. And I know I'm repeating myself, but this is just the beginning. If this trend continues, it will just take a few more years before Google, Microsoft, Meta, Amazon and other hyperscalers will operate more nuclear power plants than most countries in the world and add AI data centers with critical IT power that surpasses that of most nations on a yearly basis. All of this is in hopes of being the first to achieve AGI. And if they get there, the AI power demand will surge even more. It truly starts to look like AI data centers are going to eat the world. The first bites are already visible if you know where to look. Thank you again to the entire SemiAnalysis team and especially Jeremy, who was very patient in answering all of my stupid questions. Go check out their amazing work and if you want to know when the next bites are coming, the SemiAnalysis data center model is your best weather forecast. I hope you found this video interesting and see you in the next one. Oh, and subscribe if you want to see more videos like this one.
