Nvidia Corp. wants to tell the world it’s at the forefront of the generative artificial intelligence movement at a time when hype over models such as ChatGPT has reached fever pitch.
Nvidia is particularly focused on making sure that anyone can access the infrastructure and software needed to train generative AI models. At its GTC 2023 conference today, the chipmaker unveiled a new AI supercomputing service to enable precisely that. It’s called Nvidia DGX Cloud, and as the name suggests, it’s a cloud-hosted version of the company’s popular DGX platform, which has become the enterprise standard for AI training.
In addition, the company announced a set of new cloud services under the banner of Nvidia AI Foundations, which can help companies use DGX Cloud to build, refine and operate customized large language and generative AI models trained on their own proprietary data, for unique, domain-specific tasks.
AI training rises to the cloud
With Nvidia DGX Cloud, users can access dedicated clusters of DGX AI supercomputing paired with Nvidia’s AI software. In other words, every enterprise can now access an AI supercomputer, eliminating the need to acquire, deploy and manage such systems on-premises.
Enterprises can now rent DGX Cloud clusters on a monthly basis, with prices for a single instance starting at $36,999 per month, and immediately scale up the development of large, multinode training workloads without needing to wait for access to the required accelerated computing resources.
In his keynote this morning, Nvidia Chief Executive Jensen Huang proclaimed that “we are at the iPhone moment of AI” as startups race to build disruptive new products and business models focused on generative AI. “DGX Cloud gives customers instant access to Nvidia AI supercomputing in global-scale clouds,” he said.
Nvidia said each DGX Cloud instance features eight Nvidia H100 or A100 80-gigabyte Tensor Core GPUs, providing a total of 640 gigabytes of GPU memory per node. Meanwhile, Nvidia’s high-performance networking capabilities enable multiple instances to act as one enormous GPU to meet the performance requirements of any project, the company said.
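As a rough illustration of the arithmetic behind those figures — eight GPUs at 80 gigabytes each — here is a small Python sketch. The sizing helper and its parameters are hypothetical, written for this article, and not part of any Nvidia tool:

```python
# Illustrative back-of-the-envelope sizing based on the figures quoted
# above: 8 GPUs per DGX Cloud instance, 80 GB of memory per GPU.
# This is not an official Nvidia capacity-planning tool.

GPUS_PER_INSTANCE = 8
GB_PER_GPU = 80


def memory_per_instance_gb(gpus: int = GPUS_PER_INSTANCE,
                           gb_per_gpu: int = GB_PER_GPU) -> int:
    """Total GPU memory available in one instance (node)."""
    return gpus * gb_per_gpu


def instances_needed(workload_memory_gb: int) -> int:
    """Smallest number of instances whose combined GPU memory covers a
    workload's memory footprint (ceiling division)."""
    per_instance = memory_per_instance_gb()
    return -(-workload_memory_gb // per_instance)


print(memory_per_instance_gb())  # 640, matching the per-node figure Nvidia quotes
print(instances_needed(2000))    # a hypothetical 2,000 GB footprint -> 4 instances
```

The second helper reflects the multinode point above: workloads larger than one node's 640 gigabytes are spread across several instances acting as one large GPU.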
Nvidia said it’s planning to work with a number of cloud infrastructure providers to offer the DGX Cloud service. Surprisingly, it has selected Oracle Corp. as its launch partner, saying its Oracle Cloud Infrastructure RDMA Supercluster provides a purpose-built RDMA framework with bare-metal compute and high-performance local and block storage that can scale superclusters to over 32,000 graphics processing units.
Microsoft Azure will also host the Nvidia DGX Cloud platform from the second quarter, and Google Cloud is also expected to offer the service “soon,” Nvidia said. Not mentioned was cloud leader Amazon Web Services Inc.
Early adopters include the American biopharmaceutical firm Amgen Inc., which said it has combined DGX Cloud with Nvidia’s BioNeMo large language model and Nvidia AI Enterprise software to accelerate drug discovery.
“With Nvidia DGX Cloud and Nvidia BioNeMo, our researchers are able to focus on deeper biology instead of having to deal with AI infrastructure and set up ML engineering,” said Peter Grandsard, executive director of research, biologics therapeutic discovery at Amgen’s Center for Research Acceleration. “The powerful computing and multinode capabilities of DGX Cloud have enabled us to achieve 3X faster training of protein LLMs with BioNeMo and up to 100x faster post training analysis with Nvidia Rapids relative to alternative platforms.”
While Nvidia AI Enterprise acts as the main software layer for Nvidia DGX Cloud training, another service called Nvidia Base Command can be used to manage and monitor individual workloads, making it simpler for users to match workloads with the resources each one requires.
The Foundations of generative AI
Nvidia AI Foundations includes the Nvidia NeMo language service and the Nvidia Picasso image, video and 3D service. The two services are designed to simplify the task of building generative AI applications for intelligent chat, content creation, digital simulations and more.
“Nvidia AI Foundations let enterprises customize foundation models with their own data to generate humanity’s most valuable resources: intelligence and creativity,” Huang said in his keynote.
The NeMo and Picasso services provide access to foundation generative AI models through simple application programming interfaces. There are six elements to each service, Nvidia said: pretrained models, frameworks for data processing, vector databases and personalization, optimized inference engines, APIs, and support from Nvidia’s experts, who will be on hand to help enterprises fine-tune their models for specific use cases.
The NeMo service can be used to customize LLMs to make them more relevant for specific tasks by defining the area of focus and adding domain-specific data and functional skills. Nvidia provides LLMs of varying sizes, ranging from 8 billion to 530 billion parameters, and promised to keep them regularly updated with additional training data. As a result, enterprises have broad options for building generative AI applications that can be augmented with their own data, Nvidia said.
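To make that workflow concrete, the sketch below assembles the kind of request body a managed customization service like NeMo might accept, pairing a pretrained base model with a domain focus and proprietary training data. The endpoint shape, field names and model identifier are invented for illustration; Nvidia’s actual API may differ:

```python
# Hypothetical sketch of a request to a managed LLM-customization service.
# All field names and the model identifier are invented for illustration
# and do not reflect Nvidia's actual NeMo service API.
import json


def build_customization_request(base_model: str,
                                focus_area: str,
                                training_files: list) -> str:
    """Assemble a JSON body that pairs a pretrained base model with
    domain-specific data, mirroring the workflow described above."""
    payload = {
        "base_model": base_model,         # e.g. an 8B- to 530B-parameter LLM
        "focus_area": focus_area,         # defines the domain-specific task
        "training_data": training_files,  # proprietary documents to add
    }
    return json.dumps(payload)


body = build_customization_request(
    base_model="nemo-8b",                 # invented identifier
    focus_area="biopharma-literature",
    training_files=["papers.jsonl"],
)
print(body)
```

The point of the sketch is the division of labor the article describes: the enterprise supplies only the focus area and data, while the service owns the pretrained model and the fine-tuning machinery.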
As for Picasso, it’s for building AI-powered image, video and 3D applications with advanced text-to-image, text-to-video and text-to-3D capabilities. Customers can use the service to train Nvidia’s Edify foundation models on their own datasets, quickly building AI apps that rely on natural text prompts to create and customize visual content, similar to OpenAI’s DALL-E. Nvidia cited dozens of potential use cases, including product design, digital twin development, storytelling and video game character creation.
The NeMo service is currently available in early access, while Picasso is in private preview. Enterprises can apply for access to either service today.