Amazon Web Services Inc. today announced the preview of a new cloud instance dubbed Amazon EC2 P4de that will enable customers to train neural networks faster.
P4de also lends itself to high-performance computing applications, or applications that usually run on supercomputers. The instance can power seismic analysis programs that researchers use to study earthquakes, computational fluid dynamics software and other demanding workloads.
Every P4de instance features eight of Nvidia Corp.’s A100 graphics cards, each with 80 gigabytes of onboard memory. According to AWS, P4de enables customers to train AI models up to 60% faster than its earlier P4d instances at a 20% lower cost. Additionally, the instance can speed up high-performance computing workloads that process large datasets.
The A100 chip on which the P4de is based was introduced by Nvidia in 2020. The chip features more than 50 billion transistors that enable it to provide 20 times more performance than Nvidia’s previous-generation graphics card. Companies can use the A100 both to train AI models and to perform inference, or the task of running a neural network in production once training is complete.
The A100 offers a range of capabilities not found in its predecessor. There’s improved support for tensors, the specialized data structures that neural networks use to store the information they process. Nvidia also added support for sparsity, an AI optimization method that improves neural networks’ performance by reducing the number of calculations necessary to generate insights.
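The sparsity support in question is the A100’s fine-grained 2:4 structured sparsity: in every group of four consecutive weights, only the two with the largest magnitudes are kept, which lets the hardware skip half of the multiplications. The following is a minimal NumPy sketch of that pruning pattern; the function name and array shapes are illustrative, not taken from any Nvidia API.

```python
import numpy as np

def prune_2_to_4(weights):
    """Illustrative 2:4 structured-sparsity pruning: in every group
    of four consecutive weights, keep the two largest magnitudes
    and zero the other two, halving the multiplications needed."""
    w = weights.reshape(-1, 4).copy()
    # Indices of the two smallest-magnitude entries in each group of four
    drop = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

rng = np.random.default_rng(0)
dense = rng.standard_normal((4, 8))   # toy weight matrix
sparse = prune_2_to_4(dense)
# Exactly half of the weights survive pruning
print(np.count_nonzero(sparse) / dense.size)
```

Because the zeros fall in a fixed 2-of-4 pattern rather than at arbitrary positions, the GPU can exploit them predictably, which is what distinguishes this scheme from unstructured pruning.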
AWS’ new P4de instances combine eight A100 chips with 96 virtual central processing units. Each instance also includes 1.1 terabytes of memory, as well as 8 terabytes of local NVMe flash storage. Local storage is the term for flash drives that are attached directly to a server and therefore provide lower latency than storage hardware located in a different part of the data center.
P4de deployments run on EC2 UltraClusters, specialized hardware environments in AWS data centers that place an emphasis on optimizing performance. Customers receive access to “petabit-scale non-blocking networking infrastructure,” according to AWS. EC2 UltraClusters also provide features designed to streamline data storage tasks.
P4de expands the already extensive lineup of compute instances in AWS’ cloud portfolio. The instance is becoming available in preview only days after AWS announced the general availability of its C7g instances, which are powered by the cloud giant’s internally developed Graviton3 processor. Graviton3 provides up to 25% higher performance than the previous-generation Graviton2 processor.
AWS has also developed custom chips for AI use cases. The cloud giant’s AWS Inferentia processor, which is available as part of its EC2 Inf1 instances, is optimized for inference tasks and can perform up to 128 trillion operations per second. Last November, AWS previewed a second AI chip dubbed AWS Trainium that’s optimized for AI training.