The GPU is at the centre of the GenAI universe
Generative AI is not possible without the GPU. It is the muscle behind training large language and multi-modal models. Once models are trained and moved to deployment (inference), a different configuration of compute infrastructure can be used, one that includes the CPU. However, GPUs are still very much at the heart of all stages of Generative AI. If you'd like a simple overview of the difference between a GPU and a CPU, scroll down to the NVIDIA mythbuster video from 2009.
GenAI requires a lot of GPUs. Typically they are installed in servers of 8 GPUs each, and these servers are connected together to form pods, islands and super-clusters of 256, 512, 1,024, 2,048 GPUs and upwards. This is so they can process petabytes of data and create the foundation models behind applications such as ChatGPT (OpenAI/Microsoft), Gemini (Google), Llama 3 (Meta) and many others. GPUs need homes, energy and technical expertise, and they must be deployed compliantly and used ethically. The ecosystem surrounding the GenAI GPU is extensive and complex, and it needs to be fully understood by all those who wish to adopt this technology.
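To make the scale concrete, here is a minimal Python sketch (illustrative only, not from any vendor tooling) that works out how many 8-GPU servers a pod of a given size requires; the constant and function names are assumptions for this example.

```python
# Illustrative sketch: clusters are built from servers of 8 GPUs each
# (an assumption based on the typical configuration described above).

GPUS_PER_SERVER = 8  # assumed per-server GPU count

def servers_needed(total_gpus: int, gpus_per_server: int = GPUS_PER_SERVER) -> int:
    """Return the number of servers required to host a given GPU count."""
    return -(-total_gpus // gpus_per_server)  # ceiling division

for pod_size in (256, 512, 1024, 2048):
    print(f"{pod_size:>5} GPUs -> {servers_needed(pod_size):>4} servers")
```

At these sizes the server count alone runs into the hundreds, which is why the networking, power and facilities around the GPUs matter as much as the GPUs themselves.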
The surrounding ecosystem can be categorised into four key areas, as shown above:
- Ethics: How the technology is used and for what purpose
- Energy: The type and efficiency of the power used
- Location: The geography and the type of data centre
- Technology: The configuration, networking, software, cooling and expertise
Key sub-areas emerge where the main categories overlap. This is just the tip of the iceberg.