Google’s announcement of the availability of its new Cloud TPU v5p AI accelerator provides further evidence of the growing importance that leading cloud service providers are placing on developing their own custom silicon, optimized for the specific workloads running in their data centers.
It follows the recent launches of Microsoft’s Azure Maia and Cobalt processors and AWS’s Graviton4 and Trainium2 chips, as well as Google’s own Cloud TPU v5e earlier this year.
According to Google, the Cloud TPU v5p is the company’s most powerful TPU (Tensor Processing Unit) yet. It delivers double the FLOPS (floating-point operations per second) and three times more HBM (high-bandwidth memory) than the TPU v4, and can train large language models (LLMs) 2.8X faster than its predecessor.
Google has positioned the Cloud TPU v5p as a core element of the new AI Hypercomputer architecture that the company has designed to meet the exponential performance and scaling demands of generative AI. In contrast to traditional methods that handle AI workloads with fragmented, component-level improvements, the AI Hypercomputer uses a holistic approach that integrates performance-optimized hardware, open software, leading ML frameworks, and versatile consumption models to enhance efficiency.
The introduction of the Cloud TPU v5p is the latest milestone in Google’s long-term development of in-house silicon platforms, an effort that began in 2015. One of the key benefits of custom chip design is that it gives cloud service providers like Google greater control over costs while reducing their dependence on merchant silicon suppliers for critical components. Other benefits include enhanced performance, reduced power consumption, and the ability to offer customers highly optimized solutions for specific tasks and workloads.