With the rapid rise of generative artificial intelligence, more CIOs and VPs of Infrastructure are looking to optimise their IT infrastructures to unlock the technology’s potential.
Today, 42% of IT professionals at large organisations report that they have actively deployed AI, while an additional 40% are actively exploring the technology. Yet as IT environments evolve at an unprecedented pace, many data centres are proving incapable of providing the foundation organisations need to develop and implement AI applications at scale.
This is because AI demands more resources than traditional computing, meaning the need for processing power has surged exponentially. Some estimates suggest that data centre power demand will grow 160% by 2030.
At present, data centres worldwide consume 1% to 2% of overall power, but this share is on course to rise to 3% to 4%, if not more, by the end of the decade. As a result, AI requires specialist data centre and physical infrastructure systems, including high-density racks and highly energy-efficient cooling – things that are currently beyond many organisations’ existing data centre understanding and capabilities.
At the same time, a third of organisations (33%) said limited AI skills, expertise or knowledge is hindering the successful adoption of AI in their business. More than one in five (21%) said they lack the necessary tools or platforms for developing AI models. So, what can those enterprises do to address those challenges? The answer is that they must look beyond their existing data centre infrastructure.
They need an environment that brings together the necessary ingredients – compute, storage and networking power, data intelligence and talent – to develop and implement cutting-edge generative AI models. Today, they need AI factories, specifically designed to help organisations leverage the power of AI.
But what is an AI factory, and how does it differ from the more traditional manufacturing factories we’re used to seeing? Put simply, an AI factory is a data centre that produces actionable AI. AI factories, which include servers with GPUs and high-speed networking from Nvidia, will become a significant portion of all data centres as AI takes on even greater significance in the coming years.
“Car factories take atoms and mould them into cars. An AI factory takes data and moulds it into knowledge, or predictive knowledge,” explains Michael Schulman, senior corporate communications manager at Supermicro. “It’s a factory, but it’s not a factory many people are used to.”
AI factories versus traditional data centres
With CIOs taking the lead on their organisations’ ambitious digital transformation plans, it is important they understand the benefits that AI factories can deliver if they are to unlock next-gen business value through process automation and workflow optimisation. As part of this, they must develop holistic infrastructure strategies built on unified systems, and offer service level agreements (SLAs) to their constituents that deliver results while minimising downtime, keeping energy-intensive servers cool and keeping high-performance hardware working optimally and sustainably.
“With an AI factory, you have to think a little bit differently than with an enterprise data centre,” says Schulman. “Servers are going to draw much more power than they ever did before – but many existing data centres are limited in the amount of power that they can get from their local utility, which affects what they can deliver to the servers, when scaling is needed.”
It’s not just about power demands. “Organisations must rethink their networking, because for AI training you need fast networking – hundreds to thousands of servers that need to communicate with each other. All that comes into play with liquid cooling,” says Schulman.
So how can AI factories help enterprises unlock gains at the hardware, server, rack and data centre levels? Firstly, CIOs must think beyond the server level; instead, Schulman says, organisations must think at the rack level.
“The rack is the new server,” he says. “Rather than having ‘one of these’ and ‘one of these’ and trying to hook them up and figure it out themselves, it’s much more efficient to buy a rack of servers at a time and think of that as a unit. It’s a little different, but companies like Supermicro and others can do that efficiently now.”
Indeed, components created by Supermicro and Nvidia combine to create comprehensive AI solutions for businesses. Says Schulman: “It’s better for the customer because if you get a rack full of stuff that’s already been tested with the application, you just have to plug in the power, the network, the cooling system and you’re ready to go. It’s much more efficient, and for time to production you have it all tested by an experienced vendor, rather than trying to do this yourself.
“Again, you can buy a car that’s ready to go, or you can go buy the tires and the seats and everything else and hope it all works together.”
How CIOs can make the case for infrastructure investment
Every enterprise needs an AI champion: someone who will make the case for investment and be able to lay out every aspect of the build. CIOs need to seize this role with both hands.
Liquid cooling technology
At a rack and data centre level, liquid-cooling technology is essential for optimising AI hardware performance and reducing total cost of ownership (TCO). Research suggests that more than a third of enterprises (38.3%) expect to employ some form of liquid cooling infrastructure in their data centres by 2026, up from just 20.1% as of early 2024.
That’s because air-cooling has its limits – it’s inefficient and cannot cope when servers run at higher temperatures. Servers that are application-optimised for AI, high-performance computing (HPC) and analytics require the latest in CPU and GPU technologies, which run hotter than previous generations.
Multiple CPUs and GPUs per server are needed for performance-intensive computing, driving up electricity demand at both the server and rack level. AI factories and HPC centres need to be designed for servers to work constantly, 24×7×365. This reduces the TCO but does require consideration of cooling technologies.
Supermicro works closely with a number of its technology partners, and brings to market entire clusters powered by a number of Nvidia technologies.
“Liquid cooling is going to be required, because of the progression of CPUs and GPUs – they’re getting so hot, that it must be planned in advance. You can’t just turn up the air conditioning,” says Schulman.
The good news is that liquid cooling solutions can reduce OPEX by up to 40% and allow data centres to run more efficiently with lower power usage effectiveness (PUE), enabling data centre operators to deploy the latest and highest-performance CPUs and GPUs for AI workloads and HPC.
“Our simulations show the cost is extremely minor for a liquid-cooled data centre compared to an air-cooled data centre, based on various construction models. Then over time, there’s a significant savings in OPEX because there’s less money being sent to the public utility for electricity. And that’s reflected in PUE, in how efficient the data centre is. So, all of this needs to be thought about for your AI factory,” adds Schulman.
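The link between PUE and electricity OPEX that Schulman describes can be sketched with some simple arithmetic. PUE is total facility power divided by IT power, so a lower PUE means less money spent on cooling overhead for the same compute. The load, tariff and PUE figures below are illustrative assumptions, not Supermicro data:

```python
# Illustrative annual electricity cost at two PUE values.
# All figures are assumptions chosen for the sake of the example.

IT_LOAD_KW = 1000        # assumed IT equipment draw (servers, storage, networking)
PRICE_PER_KWH = 0.12     # assumed utility tariff in USD
HOURS_PER_YEAR = 24 * 365

def annual_energy_cost(it_load_kw: float, pue: float) -> float:
    """PUE = total facility power / IT power, so facility draw = IT load * PUE."""
    facility_kw = it_load_kw * pue
    return facility_kw * HOURS_PER_YEAR * PRICE_PER_KWH

air_cooled = annual_energy_cost(IT_LOAD_KW, pue=1.6)     # assumed air-cooled PUE
liquid_cooled = annual_energy_cost(IT_LOAD_KW, pue=1.1)  # assumed liquid-cooled PUE

print(f"Air-cooled:    ${air_cooled:,.0f}/year")
print(f"Liquid-cooled: ${liquid_cooled:,.0f}/year")
print(f"Saving:        ${air_cooled - liquid_cooled:,.0f}/year")
```

Under these assumed figures the liquid-cooled facility spends roughly 30% less on electricity each year, which is in the same direction as the OPEX savings cited above.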
Creating an AI factory roadmap
So how should organisations be thinking about their approach to IT infrastructure as innovation demands ever more energy and processing power? And what might an implementation and adoption roadmap for an AI factory look like? Some organisations might want to start small but have the infrastructure ready to scale up. CIOs must also consider their SLAs – what are they promising users in terms of response times? Does the best response time require multiple high-end servers, or could they go with fewer and save power?
These are some of the tradeoffs that businesses need to weigh up. The technology within AI factories can also help shape an organisation’s implementation roadmap. For example, by using liquid cooling, businesses use less power. By not paying as much for power, they can invest in more hardware within the same power budget, helping to maximise investment.
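The power-budget tradeoff described above can also be put into numbers. If the utility feed is fixed, lower cooling overhead (lower PUE) leaves more of that budget for IT hardware. The figures below are hypothetical:

```python
# How a fixed facility power budget translates into deployable servers
# under different cooling overheads. All figures are hypothetical.

FACILITY_BUDGET_KW = 2000   # assumed total power available from the utility
SERVER_DRAW_KW = 10         # assumed draw of one GPU server at full load

def servers_within_budget(facility_kw: float, pue: float, server_kw: float) -> int:
    """Power left for IT gear is the facility budget divided by PUE;
    dividing by per-server draw gives the server count that fits."""
    it_budget_kw = facility_kw / pue
    return int(it_budget_kw // server_kw)

print(servers_within_budget(FACILITY_BUDGET_KW, pue=1.6, server_kw=SERVER_DRAW_KW))  # air-cooled
print(servers_within_budget(FACILITY_BUDGET_KW, pue=1.1, server_kw=SERVER_DRAW_KW))  # liquid-cooled
```

With these assumed numbers, the same utility feed supports roughly 45% more servers under the liquid-cooled scenario – the "more hardware within the same power budget" effect.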
“Different organisations are going to want to go different ways once they start to figure out the economics; there are choices all around with this new era of AI,” says Schulman. Ultimately, says Schulman, organisations need to have a clear understanding of how AI can benefit their business, before thinking about the infrastructure to support those plans.
“This is not about running 80 billion model parameters like ChatGPT; it’s about how the enterprise is looking at AI to improve their business,” he says. “How do enterprises move their business forward and make more money using AI and what do they have to think about to get started? Then you can figure out the AI factory piece.”
For more information please visit supermicro.com/ai