Google accompanied the recent launch of its Gemini AI models with the latest version of its flagship tensor processing unit (TPU) for AI training and inference, in what appears to be an attempt to take on Nvidia’s own market-leading GPUs.
TPU v5p – Google’s most powerful custom-designed AI accelerator – has been deployed to power the firm’s ‘AI Hypercomputer’, a supercomputing architecture built specifically to run AI applications rather than the scientific workloads conventional supercomputers typically handle, which TPUs are poorly suited to.
The latest version of the TPU packs 8,960 chips per pod – the unit from which these systems are built – versus 4,096 in TPU v4, and offers four times the total available FLOPs per pod. Each chip in the new pods has an interconnect throughput of 4,800Gbps and carries 95GB of high-bandwidth memory (HBM), up from 32GB in TPU v4.
Nvidia H100 vs Google TPU v5p: Which is faster?
Unlike Nvidia, which sells its GPUs to other companies, Google keeps its custom-made TPUs in-house for use across its own products and services. TPUs have long powered Google services including Gmail, YouTube and Android, and the latest version was also used to train Gemini.
Google’s v5p TPUs are up to 2.8 times faster at training large language models than TPU v4, and deliver 2.1 times the performance per dollar. The intermediate version, TPU v5e, released earlier this year, offers the best value for money of the three, but at only up to 1.9 times the speed of TPU v4 it leaves v5p as the most powerful.
It’s even powerful enough to rival Nvidia’s in-demand H100 GPU, one of the best graphics cards for AI workloads. The H100 is four times faster at training workloads than Nvidia’s A100, according to Nvidia’s own data.
Google’s TPU v4, meanwhile, is estimated to be between 1.2 and 1.7 times faster than the A100, according to research Google published in April. Very rough arithmetic therefore suggests the TPU v5p is roughly 3.4 to 4.8 times faster than the A100 – on par with, or ahead of, the H100 – although more detailed benchmarking is needed before any firm conclusions can be drawn.
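For the curious, the back-of-the-envelope estimate above is just the two published speedup figures multiplied together. A minimal sketch of that calculation, using only the numbers cited in this article (vendor claims and estimates, not independent benchmarks):

```python
# Rough chained comparison of published speedup figures.
# All values come from the figures cited in this article; none are benchmarks.

v5p_vs_v4 = 2.8          # Google: TPU v5p trains LLMs up to 2.8x faster than v4
v4_vs_a100 = (1.2, 1.7)  # estimate: TPU v4 is 1.2-1.7x faster than Nvidia A100
h100_vs_a100 = 4.0       # Nvidia: H100 is up to 4x faster than A100 at training

# Chain the ratios: (v5p / v4) * (v4 / A100) = v5p / A100
v5p_vs_a100 = tuple(round(v5p_vs_v4 * r, 1) for r in v4_vs_a100)
print(v5p_vs_a100)  # -> (3.4, 4.8), straddling the H100's claimed 4.0x
```

Note the caveat: the multiplied figures come from different workloads and different sources, so the result is only a crude ballpark, not a like-for-like benchmark.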