InfiniBand is a high-speed interconnect technology that enables fast and efficient communication between servers, storage devices, and other computing systems. Unlike Ethernet, the dominant networking technology for local area networks (LANs), InfiniBand was designed specifically to connect server and storage clusters in high-performance computing (HPC) environments. InfiniBand defines its own layered protocol stack. The physical layer uses high-bandwidth serial links to provide direct point-to-point connectivity between devices; the link layer handles the transmission and reception of data packets along with flow control and quality of service (QoS); and the upper layers provide the features InfiniBand is best known for, including virtualization and remote direct memory access (RDMA), which lets one system read and write another system's memory without involving its CPU. These features make InfiniBand a powerful tool for HPC workloads that require low latency and high bandwidth.
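To make the RDMA layer concrete, here is a minimal sketch in C, assuming the open-source libibverbs library is installed. It enumerates the RDMA-capable adapters on a host and prints basic attributes of each adapter's first port. This only illustrates the verbs API that applications and middleware use to drive InfiniBand hardware; a full RDMA transfer would additionally set up queue pairs and registered memory regions.

```c
/*
 * Sketch: list RDMA-capable devices with libibverbs and query port 1.
 * Build (assuming libibverbs headers and library are present):
 *   gcc ib_list.c -o ib_list -libverbs
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0) {
        fprintf(stderr, "no RDMA-capable devices found\n");
        return 1;
    }

    for (int i = 0; i < num_devices; ++i) {
        struct ibv_context *ctx = ibv_open_device(dev_list[i]);
        if (!ctx)
            continue;

        struct ibv_port_attr port;
        /* Port numbers are 1-based in the verbs API. */
        if (ibv_query_port(ctx, 1, &port) == 0) {
            printf("%s: state=%d active_mtu=%d lid=%u\n",
                   ibv_get_device_name(dev_list[i]),
                   port.state, port.active_mtu, port.lid);
        }
        ibv_close_device(ctx);
    }

    ibv_free_device_list(dev_list);
    return 0;
}
```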
InfiniBand has been widely adopted by the HPC community, as it powers many of the world's fastest supercomputers and AI (artificial intelligence) systems. According to the latest TOP500 list of supercomputers, InfiniBand connects 141 systems, including two of the top five: Summit and Sierra in the US. InfiniBand also supports some of the most demanding AI workloads, such as large language models, deep learning, and computer vision. For example, Microsoft uses InfiniBand to speed up the training and inference of its Turing Natural Language Generation (T-NLG) model, which has 17 billion parameters.
InfiniBand is not only a high-performance interconnect, but also a platform for innovation and advancement. NVIDIA, which acquired Mellanox Technologies in 2020, is the leading provider of InfiniBand solutions, including adapters, switches, routers, gateways, cables, transceivers, and data processing units (DPUs).
NVIDIA has been developing new technologies and capabilities that enhance InfiniBand's performance and functionality. For instance, NVIDIA Quantum-2, the next-generation InfiniBand networking platform, offers 400 Gb/s of bandwidth per port, 64 Tb/s of switch capacity, and advanced In-Network Computing features such as the Scalable Hierarchical Aggregation and Reduction Protocol (SHARP). SHARP offloads collective communication operations, such as reductions and aggregations, to the switch fabric itself, reducing the amount of data traversing the network and increasing data center efficiency. NVIDIA also offers BlueField DPUs, which combine powerful computing, high-speed networking, and extensive programmability to deliver software-defined, hardware-accelerated solutions for the most demanding workloads.
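As a concrete example of the collective traffic SHARP targets, the sketch below is an ordinary MPI allreduce in C. The application code is the same whether or not SHARP is active; offloading the reduction to the switches is a property of the fabric and of the underlying collective library configuration (for example, NVIDIA's HPC-X stack), not of this source code.

```c
/*
 * Sketch: a standard MPI_Allreduce summing one value per rank.
 * With a SHARP-enabled fabric and collective library, the per-rank
 * contributions can be aggregated inside the switch network instead
 * of being reduced on the hosts; this code does not change either way.
 * Build and run (assuming an MPI installation):
 *   mpicc allreduce.c -o allreduce && mpirun -np 4 ./allreduce
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank contributes its rank number; the reduction sums them. */
    double local = (double)rank;
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %.0f\n", size, global);

    MPI_Finalize();
    return 0;
}
```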
InfiniBand is transforming the AI landscape by enabling faster, smarter, and more scalable computing. As AI applications become more complex and data-intensive, InfiniBand provides the extreme performance, broad accessibility, and strong security needed by cloud computing providers and supercomputing centers. InfiniBand also opens new possibilities for technology development, such as quantum computing, converged workflows for HPC and AI, and new interfaces and connectors. InfiniBand is not only a high-speed interconnect, but also a future-proof platform for innovation and discovery.