AWS Launches EC2 P4d Instance Powered By NVIDIA A100 Tensor Core GPU For High-Performance Computing

Source: https://blogs.nvidia.com/blog/2020/11/02/nvidia-a100-launches-on-aws/

Amazon Web Services Inc. has made its next-generation graphics processing unit-based compute instances, built for high-performance computing (HPC) and machine learning (ML) workloads, generally available to all customers.

The Amazon EC2 P4d instances are backed by NVIDIA Corp.'s newest and most powerful GPU, the A100 Tensor Core. They are designed for advanced cloud applications that demand massive amounts of computing power, such as natural language processing, seismic analysis, genomics research, and object detection and classification.


They're incredibly scalable, too. According to the AWS Chief Evangelist, the EC2 P4d instances provide up to 400 Gb/s of instance networking and support both Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect remote direct memory access (RDMA), making it possible to scale out HPC workloads and multi-node ML training.
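To make the EFA support concrete, here is a minimal boto3 sketch of how a p4d.24xlarge instance with an EFA network interface might be requested. The AMI ID and subnet ID below are placeholders, not real values, and the actual API call is left commented out since it requires AWS credentials and incurs charges:

```python
# Sketch: parameters for launching a p4d.24xlarge with an EFA interface
# via boto3's EC2 run_instances API. IDs below are placeholders.
launch_params = {
    "ImageId": "ami-0123456789abcdef0",          # placeholder: e.g. a Deep Learning AMI
    "InstanceType": "p4d.24xlarge",
    "MinCount": 1,
    "MaxCount": 1,
    "NetworkInterfaces": [{
        "DeviceIndex": 0,
        "SubnetId": "subnet-0123456789abcdef0",  # placeholder subnet
        "InterfaceType": "efa",                  # request an Elastic Fabric Adapter
    }],
}

# With AWS credentials configured, the launch itself would be:
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# response = ec2.run_instances(**launch_params)
print(launch_params["InstanceType"])
```

Note that when an EFA interface is requested, the networking configuration is supplied via `NetworkInterfaces` rather than top-level subnet/security-group parameters.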

About ten years ago, Amazon Web Services debuted its first GPU instance, powered by the NVIDIA M2050. Since then, AWS has continuously updated its stable of cloud GPU instances, including the K520 (g2), K80 (p2), M60 (g3), V100 (p3/p3dn), and T4 (g4).

With the new P4d instances, AWS paves the way for another bold decade of accelerated computing, powered by the latest NVIDIA A100 Tensor Core GPU.

https://aws.amazon.com/ec2/instance-types/p4/

To give customers enough computing power for demanding workloads, AWS deploys the P4d instances in hyper-scale EC2 UltraClusters. Each UltraCluster comprises more than four thousand A100 GPUs, supported by high-throughput, low-latency storage and petabit-scale nonblocking networking infrastructure. EC2 UltraClusters are Amazon's supercomputers in the cloud, available to data scientists, researchers, and everyday developers.

Delivering AWS's highest performance, the P4d instance is also its most cost-effective GPU-based platform for ML training and high-performance computing applications. Compared to the default FP32 precision, ML model training time can be cut by up to three times with FP16 and up to six times with TF32. Exceptional inference performance is another advantage of this update.
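As a rough illustration of why TF32 trades precision for speed: TF32 keeps FP32's 8-bit exponent (and therefore its dynamic range) but stores only 10 mantissa bits instead of 23. The sketch below shows that precision loss by masking off the low mantissa bits of a float32 value; `truncate_to_tf32` is a hypothetical helper for illustration only, and it truncates, whereas A100 hardware applies proper rounding:

```python
import struct

def truncate_to_tf32(x: float) -> float:
    """Zero out the low 13 mantissa bits of a float32 value, keeping the
    sign bit, the 8-bit exponent, and the 10 mantissa bits TF32 stores."""
    # Reinterpret the float32 bit pattern as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Mask keeps the top 19 bits: 1 sign + 8 exponent + 10 mantissa.
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFE000))[0]

# Values exactly representable in 10 mantissa bits pass through unchanged;
# others lose their low-order precision.
print(truncate_to_tf32(1.0))
print(truncate_to_tf32(1 / 3))
```

The error introduced is at most about one part in 2^10 of the value, which is why TF32 is typically a drop-in replacement for FP32 in deep learning while running far faster on A100 Tensor Cores.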

According to Amazon, the P4d instances are now available in the AWS US East (N. Virginia) and US West (Oregon) Regions. They can be purchased On-Demand, through AWS Savings Plans, as Spot Instances, or as Reserved Instances.


AWS Source: https://aws.amazon.com/blogs/machine-learning/aws-to-offer-nvidia-a100-tensor-core-gpu-based-amazon-ec2-instances/
