Nvidia's most powerful AI chip to date, the GH200 Grace Hopper Superchip, is now in full production, Nvidia announced earlier this week. The GH200 Superchip is designed to power systems that run the most complex AI workloads, including training the next generation of generative AI models.
The new chip has a total bandwidth of 900 gigabytes per second, seven times more than the standard PCIe Gen5 lanes used in today's most advanced accelerated computing systems. Nvidia says the interconnect also consumes five times less power, enabling the Superchip to handle demanding AI and high-performance computing applications more efficiently.
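The "seven times" figure can be sanity-checked against the commonly cited PCIe Gen5 numbers: a x16 Gen5 link delivers roughly 64 GB/s in each direction, about 128 GB/s bidirectional. A rough check, assuming those textbook rates (this is a back-of-the-envelope comparison, not an official Nvidia calculation):

```python
# Back-of-the-envelope check of the "seven times PCIe Gen5" claim.
# Assumption: PCIe Gen5 runs 32 GT/s per lane (~32 Gb/s, ignoring the
# small 128b/130b encoding overhead), so a x16 link moves about
# 32 * 16 / 8 = 64 GB/s per direction, ~128 GB/s bidirectional.
nvlink_c2c_gb_s = 900                    # NVLink-C2C total bandwidth (GB/s)
pcie_gen5_x16_gb_s = 32 * 16 / 8 * 2     # ~128 GB/s bidirectional

print(nvlink_c2c_gb_s / pcie_gen5_x16_gb_s)  # ≈ 7.0
```

The ratio comes out at just over 7, consistent with the article's claim.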
In particular, the Nvidia GH200 Superchip is expected to power generative AI workloads of the kind exemplified by OpenAI's ChatGPT, whose near-human ability to generate new content from prompts is now sweeping the tech industry.
"Generative AI is rapidly transforming the enterprise, unlocking new opportunities and accelerating discovery in healthcare, finance, business services and many more industries," said Ian Buck, vice president of accelerated computing at Nvidia. "With Grace Hopper Superchips in full production, global manufacturers will soon be able to provide enterprises with the acceleration infrastructure they need to build and deploy generative AI applications that employ their unique proprietary data."
One of the first systems to integrate GH200 Superchips will be Nvidia's own next-generation, large-memory AI supercomputer, the Nvidia DGX GH200. According to Nvidia, this new system uses the NVLink Switch System to combine 256 GH200 Superchips so they run as a single GPU, delivering up to 1 exaflop of performance (1 quintillion floating-point operations per second) and 144 TB of shared memory.
That means it has nearly 500 times more memory, and considerably more compute, than Nvidia's previous-generation DGX A100 supercomputer, which launched in 2020 and combined eight GPUs into a single system.
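The "nearly 500 times" figure lines up with the published specs: the 2020 DGX A100 launched with 320 GB of total GPU memory (eight 40 GB A100s), versus 144 TB of shared memory in the DGX GH200. A quick check, assuming those launch-era numbers:

```python
# Rough ratio of DGX GH200 shared memory to launch-era DGX A100 GPU memory.
# Assumption: 8 x 40 GB A100s = 320 GB in the original DGX A100 configuration.
dgx_gh200_memory_gb = 144 * 1000   # 144 TB, using decimal TB for a rough figure
dgx_a100_memory_gb = 8 * 40        # 320 GB

print(dgx_gh200_memory_gb / dgx_a100_memory_gb)  # 450.0 — "nearly 500x"
```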
The DGX GH200 AI supercomputer will also come with a complete software stack for running AI and data analytics workloads, Nvidia said. For example, the system supports Nvidia Base Command software, which provides AI workflow management, cluster management, accelerated compute and storage libraries, and network infrastructure and system software. It also supports Nvidia AI Enterprise, a software layer containing more than 100 AI frameworks, pretrained models and development tools for streamlining the production of generative AI, computer vision, speech AI and other types of models.
Constellation Research analyst Holger Mueller said Nvidia has effectively merged two truly reliable products into one by converging Grace and Hopper architectures with NVLink. The result, he said, "is higher performance and capacity, as well as a simplified infrastructure for building AI-driven applications that allows users to see and benefit from so many GPUs and their capabilities as one logical GPU."
The first customers to adopt the new DGX GH200 AI supercomputer include Google Cloud, Meta Platforms and Microsoft. Nvidia will also make the DGX GH200 design available as a blueprint for cloud service providers that want to customize it for their own infrastructure.
Girish Bablani, corporate vice president of Azure Infrastructure at Microsoft, said, "Traditionally, training large AI models has been a resource- and time-intensive task, and the potential of the DGX GH200 to handle terabytes of data sets will enable developers to conduct advanced research at a much larger scale and at a much faster pace."
Nvidia also said it will build a DGX GH200-based AI supercomputer, "Nvidia Helios," for its own internal R&D team. Helios will combine four DGX GH200 systems interconnected with Nvidia Quantum-2 InfiniBand networking. By the time it goes live at the end of this year, the system will contain a total of 1,024 GH200 Superchips.
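The Helios total follows directly from the system's composition as stated: four DGX GH200 systems at 256 Superchips each.

```python
# Helios combines four DGX GH200 systems, each containing 256 GH200 Superchips.
dgx_gh200_systems = 4
superchips_per_system = 256

print(dgx_gh200_systems * superchips_per_system)  # 1024 Superchips total
```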
Finally, Nvidia's server partners are planning to build their own systems based on the new GH200 Superchip, and among the first systems to launch is Quanta Computer's S74G-2U, which will be available later this year.
Nvidia said server partners have adopted the new Nvidia MGX server specification, which was also announced on Monday. MGX is a modular reference architecture that allows partners to quickly and easily build more than 100 versions of servers based on Nvidia's latest silicon architecture for a wide range of AI, high-performance computing and other types of workloads. By using MGX, server manufacturers can expect to reduce development costs by as much as three-quarters and cut development time by two-thirds, to about six months.