New World Record for DPU Performance Set by NVIDIA BlueField

Data centers require extremely fast storage access, and according to NVIDIA’s latest report, its BlueField-2 DPU is the fastest available.

NVIDIA’s recent testing revealed that two BlueField-2 data processing units achieved 41.5 million input/output operations per second (IOPS) – more than four times the IOPS of any other DPU.

Using conventional networking protocols and open-source software, the BlueField-2 DPU achieved record-breaking performance. Over TCP, one of the fundamental internet protocols, it delivered more than 5 million 4KB IOPS and more than 20 million 512B IOPS for NVMe over Fabrics (NVMe-oF), a standard means of accessing storage media across a network.

With the popular RoCE network transport option, BlueField delivers even higher storage performance to boost AI, big data, and high-performance computing applications.

In testing, BlueField delivered high performance both as an initiator and as a target, in setups that simulated real-world storage deployments with different storage software libraries and workloads. BlueField also supports InfiniBand, the recommended networking architecture for many HPC and AI applications.

Methodology of Testing

BlueField’s 41.5 million IOPS is more than four times the previous world record of 10 million IOPS, which was obtained using proprietary storage offerings. The result was achieved by connecting two Hewlett Packard Enterprise ProLiant DL380 Gen10 Plus servers, one serving as the application server (storage initiator) and the other as the storage system (storage target).

Each server was outfitted with two Intel “Ice Lake” Xeon Platinum 8380 CPUs running at 2.3GHz, for 160 hyperthreaded cores per server, along with 512GB of DRAM, 120MB of L3 cache (60MB per socket), and a PCIe Gen4 interface.

To accelerate networking and NVMe-oF, each server was outfitted with two NVIDIA BlueField-2 P-series DPU cards, each with two 100Gb Ethernet network ports, for a total of four network ports and 400Gb/s of wire bandwidth between initiator and target. The two servers were linked using NVIDIA LinkX 100GbE Direct-Attach Copper (DAC) passive cables. Red Hat Enterprise Linux (RHEL) was installed on both systems.
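The port and bandwidth totals above follow from simple multiplication. A minimal sketch of that arithmetic is shown here; the class and field names are illustrative assumptions, and the figures are nominal line rates taken from the description, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerNetworking:
    """Per-server network hardware as described in the article (hypothetical model)."""
    dpu_cards: int = 2           # BlueField-2 P-series cards per server
    ports_per_dpu: int = 2       # 100Gb Ethernet ports per card
    port_speed_gbps: int = 100   # nominal line rate per port

    @property
    def total_ports(self) -> int:
        return self.dpu_cards * self.ports_per_dpu

    @property
    def wire_bandwidth_gbps(self) -> int:
        return self.total_ports * self.port_speed_gbps


if __name__ == "__main__":
    server = ServerNetworking()
    print(server.total_ports)          # 4
    print(server.wire_bandwidth_gbps)  # 400
```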

For storage system software, both the SPDK target and the regular upstream Linux kernel target were evaluated, using the default 4.18 kernel and one of the most recent kernels, 5.15. Three storage initiators were tested: SPDK, the regular kernel storage initiator, and the SPDK FIO plugin. FIO and SPDK were used for workload generation and measurement. Tests used 4KB and 512B I/O sizes, which are common medium and small storage I/O sizes, respectively.

The NVMe-oF storage protocol was tested with both TCP and RoCE at the network transport layer. Each setup was evaluated with 100% read, 100% write, and 50/50 read/write workloads at full bidirectional network utilization.
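The test dimensions described above can be laid out as a configuration matrix. The sketch below enumerates every combination; this is an assumption for illustration, since the report does not state that each combination was actually run. The names (`RunConfig`, `build_matrix`) are hypothetical, and 4KB is taken to mean 4096 bytes.

```python
from itertools import product
from typing import List, NamedTuple


class RunConfig(NamedTuple):
    """One hypothetical test configuration."""
    target: str
    kernel: str
    initiator: str
    transport: str
    io_size_bytes: int
    workload: str


# Dimensions named in the article.
TARGETS = ("SPDK", "Linux kernel")
KERNELS = ("4.18", "5.15")
INITIATORS = ("SPDK", "Linux kernel", "SPDK FIO plugin")
TRANSPORTS = ("TCP", "RoCE")
IO_SIZES_BYTES = (4096, 512)  # assumes 4KB = 4096 bytes
WORKLOADS = ("100% read", "100% write", "50/50 read/write")


def build_matrix() -> List[RunConfig]:
    """Return the full cross product of all test dimensions."""
    return [
        RunConfig(*combo)
        for combo in product(
            TARGETS, KERNELS, INITIATORS, TRANSPORTS, IO_SIZES_BYTES, WORKLOADS
        )
    ]


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix))  # 144
    print(matrix[0])
```

Listing the dimensions this way makes it easy to see how quickly the number of runs grows and to filter for a specific case, for example all RoCE runs with 512B I/O.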

The study also found the following BlueField DPU performance characteristics:

  • Testing with smaller 512B I/O sizes yielded higher IOPS but lower-than-line-rate throughput, whereas testing with 4KB I/O sizes yielded higher throughput but lower IOPS.
  • 100% read and 100% write workloads yielded comparable IOPS and throughput. However, 50/50 mixed read/write workloads delivered higher performance by simultaneously using both directions of the network connection.
  • Using SPDK resulted in faster performance than kernel-space software but at the expense of increased server CPU use, which is expected given that SPDK operates in user space with continual polling.
  • Because the Linux community frequently introduces storage enhancements, the newer Linux 5.15 kernel performed better than the 4.18 kernel.
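The trade-off between IOPS and throughput in the first point can be checked with a short calculation. The sketch below converts the TCP figures quoted earlier (more than 5 million 4KB IOPS and more than 20 million 512B IOPS) into payload throughput. It assumes 4KB means 4096 bytes, uses decimal gigabits, and ignores protocol headers, so the results are rough lower bounds rather than reported measurements. The function name is hypothetical.

```python
def payload_throughput_gbps(iops: float, io_size_bytes: int) -> float:
    """Payload throughput in Gb/s (decimal), ignoring protocol overhead."""
    return iops * io_size_bytes * 8 / 1e9


if __name__ == "__main__":
    # 4KB I/O: fewer operations, but each one moves more data.
    print(f"{payload_throughput_gbps(5_000_000, 4096):.2f}")   # 163.84
    # 512B I/O: four times the IOPS, yet half the payload throughput.
    print(f"{payload_throughput_gbps(20_000_000, 512):.2f}")   # 81.92
```

Each 4KB operation carries eight times the data of a 512B operation, so even at a quarter of the IOPS it moves twice as many bytes.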

Record-Setting DPU Enables Storage Performance with Security

In addition to quick storage access, BlueField enables hardware-accelerated encryption and decryption of both Ethernet storage traffic and storage media, preventing data theft or exfiltration. It offloads IPsec at up to 100Gb/s (data on the wire) and 256-bit AES-XTS at up to 200Gb/s (data at rest), lowering the risk of data theft if an adversary has tapped the storage network or if physical storage devices are stolen, sold, or disposed of incorrectly.

Customers and leading security software vendors are leveraging BlueField’s recently updated NVIDIA DOCA framework to run cybersecurity applications – such as a distributed firewall or security groups with micro-segmentation – on the DPU to improve application and network security for compute servers, lowering the risk of unauthorized access or data modifications on storage attached to those servers.
