ARTFEED — Contemporary Art Intelligence

OpenAI releases MRC network protocol to scale AI supercomputers

ai-technology · 2026-05-06

OpenAI has released the Multipath Reliable Connection (MRC) protocol through the Open Compute Project (OCP) to improve the performance and reliability of GPU networking in large training clusters. Developed over two years with AMD, Broadcom, Intel, Microsoft, and NVIDIA, MRC is a new network protocol built into 800Gb/s network interfaces. It spreads data transfers across many paths, routes around failures quickly, and uses static source routing.

MRC now runs on all of OpenAI's largest NVIDIA GB200 supercomputers, including those at an Oracle Cloud Infrastructure site in Abilene, Texas, and Microsoft's Fairwater supercomputers. It has been used to train multiple OpenAI models, including those behind ChatGPT and Codex.

MRC builds on RDMA over Converged Ethernet (RoCE) and incorporates techniques from the Ultra Ethernet Consortium (UEC). It supports multi-plane networks linking over 100,000 GPUs with just two switch tiers, which reduces power consumption and component count. During training, MRC absorbed several link flaps per minute with no measurable impact and allowed switches to be rebooted without interrupting jobs. The specification is now available as an OCP contribution, accompanied by a co-authored paper titled "Resilient AI Supercomputer Networking using MRC and SRv6."
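To make the multipath idea concrete, the following is a minimal toy sketch of spreading one transfer's packets across several paths and skipping a path once it is marked as failed. It is not taken from the MRC specification: the round-robin selection and the names `MultipathSender`, `mark_failed`, and `pick_path` are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultipathSender:
    """Toy model (hypothetical, not MRC itself): spray packets over healthy paths."""

    num_paths: int
    failed: set = field(default_factory=set)
    _next: int = 0  # index of the next path to try

    def mark_failed(self, path: int) -> None:
        """Stop using a path, e.g. after a link flap is detected."""
        self.failed.add(path)

    def mark_recovered(self, path: int) -> None:
        """Resume using a path that has come back."""
        self.failed.discard(path)

    def pick_path(self) -> int:
        """Round-robin over paths, skipping any currently marked as failed."""
        for _ in range(self.num_paths):
            path = self._next
            self._next = (self._next + 1) % self.num_paths
            if path not in self.failed:
                return path
        raise RuntimeError("no healthy paths available")

    def send(self, num_packets: int) -> dict:
        """Return {path: [packet sequence numbers]} for one transfer."""
        assignment: dict = {}
        for seq in range(num_packets):
            assignment.setdefault(self.pick_path(), []).append(seq)
        return assignment


if __name__ == "__main__":
    sender = MultipathSender(num_paths=4)
    print(sender.send(8))   # packets spread evenly over paths 0..3
    sender.mark_failed(2)   # simulate a failed link on path 2
    print(sender.send(6))   # traffic continues on the remaining paths
```

In this toy model the transfer keeps flowing after a failure because the sender simply stops selecting the bad path. A real implementation would also have to handle loss detection and out-of-order arrival, which this sketch leaves out.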

Key facts

  • OpenAI released MRC protocol through OCP on May 4, 2026.
  • MRC developed with AMD, Broadcom, Intel, Microsoft, and NVIDIA over two years.
  • MRC is built into 800Gb/s network interfaces.
  • MRC deployed on all of OpenAI's largest NVIDIA GB200 supercomputers.
  • MRC used to train multiple OpenAI models, including those behind ChatGPT and Codex.
  • MRC enables multi-plane networks connecting over 100,000 GPUs with two switch tiers.
  • MRC handles multiple link flaps per minute without measurable impact on training.
  • MRC specification available as OCP contribution.
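The claim that two switch tiers can reach over 100,000 GPUs can be sanity-checked with standard leaf-spine arithmetic: a non-blocking two-tier fabric built from switches with k ports connects at most k × k/2 endpoints. The sketch below works through that formula. The switch capacity (51.2 Tb/s) and the split of each 800Gb/s interface into eight 100Gb/s links, one per plane, are illustrative assumptions and are not stated in the summary above.

```python
def switch_radix(switch_gbps: int, port_gbps: int) -> int:
    """Number of ports a switch offers when its capacity is split into equal ports."""
    return switch_gbps // port_gbps


def two_tier_endpoints(radix: int) -> int:
    """Max endpoints in a non-blocking two-tier (leaf-spine) fabric.

    Each leaf uses half its ports for endpoints and half for uplinks.
    Each spine has `radix` ports, one per leaf, so there are at most
    `radix` leaves with `radix // 2` endpoints each.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be an even number >= 2")
    return radix * (radix // 2)


if __name__ == "__main__":
    SWITCH_GBPS = 51_200  # assumed switch capacity, not from the article
    NIC_GBPS = 800        # interface speed named in the article
    PLANES = 8            # assumed number of planes, not from the article

    # Single plane: every switch port runs at the full NIC speed.
    single = two_tier_endpoints(switch_radix(SWITCH_GBPS, NIC_GBPS))

    # Multi-plane: each NIC is split into PLANES slower links, one per plane,
    # so every switch exposes PLANES times as many (slower) ports.
    multi = two_tier_endpoints(switch_radix(SWITCH_GBPS, NIC_GBPS // PLANES))

    print(f"single plane: {single} endpoints")
    print(f"{PLANES} planes: {multi} endpoints per plane")
```

Under these assumed numbers, a single-plane design tops out at a few thousand endpoints in two tiers, while splitting each interface across eight planes raises the per-plane limit past 100,000, since each GPU attaches once to every plane.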

Entities

Institutions

  • OpenAI
  • AMD
  • Broadcom
  • Intel
  • Microsoft
  • NVIDIA
  • Open Compute Project (OCP)
  • Oracle Cloud Infrastructure (OCI)
  • Microsoft Azure
  • Arista
  • Ultra Ethernet Consortium (UEC)
  • InfiniBand Trade Association (IBTA)

Locations

  • Abilene
  • Texas
  • United States

Sources