- Broadcom claims that the best way to end Nvidia’s monopoly in GPU processors is to move away from InfiniBand’s proprietary approach and toward the open networking approach of ethernet technology.
- The Tomahawk 5, which is now on the market, replaces the Tomahawk 4, a 25.6 terabit per second chip from Broadcom.
Switch silicon maker Broadcom recently announced the launch of Tomahawk 5, its newest switch chip, which can connect endpoints with a combined bandwidth of 51.2 terabits per second.
The launch comes out of an impasse that has been building for some time. Experts in computer networking have been discussing the need for a second network. The typical network, or LAN, is the one that links client computers to servers. The emergence of Artificial Intelligence (AI) has led to the creation of a network “behind” that network, referred to as a “scale-out” network, to run workloads such as deep learning programs that must be trained on tens of thousands of GPUs.
Nvidia, one of the most prominent vendors of the GPU chips that run deep learning, is on course to become the dominant vendor of the networking technology that interconnects those chips as well. It uses InfiniBand, a technology it gained when it acquired Mellanox in 2020. Some argue that the risk is that everything depends on a single company: there is no diversification, and there is no way to construct a data center where multiple chips can compete.
“What Nvidia is doing is saying, I can sell a GPU for a couple of thousand dollars, or I can sell an equivalent to an integrated system for half a million to a million-plus dollars,” said Ram Velaga, the senior vice president and general manager of the Core Switching Group at networking chip giant Broadcom.
“This is not going well at all with the cloud providers,” said Velaga. Those providers include Amazon, Alphabet’s Google, and Meta, among others. The cloud giants’ economics are based on cutting costs as they scale computing resources, which dictates avoiding single-sourcing.
“And so now there’s this tension in this industry,” he said.
Broadcom claims that the best way to resolve this conflict is to move away from InfiniBand’s proprietary approach and follow the open networking approach of ethernet technology.
“There’s an engagement with us, saying, Hey, look, if the ethernet ecosystem can help address all the benefits that InfiniBand is able to bring to a GPU interconnect, and bring it onto a mainstream technology like ethernet, so it can be pervasively available, and create a very large networking fabric, it’s going to help people win on the merits of the GPU, rather than the merits of a proprietary network,” said Velaga.
Broadcom’s Tomahawk 5, which is now available in the market, replaces the Tomahawk 4, a 25.6 terabit per second chip from Broadcom.
By incorporating features that were previously exclusive to InfiniBand, the Tomahawk 5 seeks to level the playing field. The primary distinction is latency, the average time to send the first data bit from point A to point B. Latency has historically been an advantage for InfiniBand when data moves from the GPU to memory and back again, either to retrieve input data or to retrieve parameter data for massive neural networks in AI.
The latency difference between InfiniBand and Ethernet is reduced via a technology known as RDMA over Converged Ethernet, or RoCE. With RoCE, Broadcom argues, an open standard wins out over the tight coupling of Nvidia GPUs and InfiniBand.
“Once you get RoCE, there’s no longer that InfiniBand advantage,” said Velaga. “The performance of ethernet matches that of InfiniBand.”
“Our thesis is if we can out-execute InfiniBand, chip-to-chip, and you have an entire ecosystem looking for ethernet to be successful, you have a recipe to displace InfiniBand with ethernet and allow a broad ecosystem of GPUs to be successful,” said Velaga.
The “broad ecosystem of GPUs” refers to the numerous rival silicon vendors in the AI business that offer their own cutting-edge chip architectures.
They include many startups such as Cerebras Systems, Graphcore, and SambaNova. They also include the cloud vendors’ silicon, such as Google’s Tensor Processing Unit, TPU, and Amazon’s Trainium chip. If processing resources weren’t reliant on a single Nvidia-sold network, all those efforts might theoretically have a better chance of success.
“The big cloud guys today are saying we want to build our GPUs, but we don’t have an InfiniBand fabric,” observed Velaga. “If you guys can give us an ethernet-equivalent fabric, we can do the rest of this stuff on our own.”
Broadcom believes that as the latency issue goes away, InfiniBand’s weaknesses will become evident, such as the limit on the number of GPUs it can support. “InfiniBand was always a system that had a certain scale limit, maybe a thousand GPUs, because it didn’t have a distributed architecture.”
Additionally, Ethernet switches can connect Intel and AMD CPUs as well as GPUs, so Velaga indicated that consolidating on a single networking technology offers certain financial advantages.
“I expect the fastest adoption of this market will come from GPU interconnect, and over a period of time, I probably would expect the balance will be fifty-fifty,” said Velaga, “because you will have the same technology that can be used for the CPU interconnect and the GPU interconnect, and the fact that there are far more CPUs sold than GPUs, you will have a normalization of the volume.” The GPUs will consume most of the bandwidth, while the CPUs may consume more ports on an ethernet switch.
In line with that vision, Velaga highlighted capabilities aimed at AI processing, such as a total of 256 ethernet ports running at 200 gigabits per second each, the most of any switch chip. According to Broadcom, such a dense 200-gig port design is necessary to support “flat, low latency AI/ML clusters.”
Although Nvidia dominates the data center market, with sales of data center GPUs estimated to reach USD 16 billion this year, the buyers, the cloud businesses, also wield considerable power, giving them the upper hand.
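The port figures line up with the headline bandwidth quoted earlier. As a quick check, here is a minimal Python sketch using only numbers from this article; the function and variable names are illustrative, not Broadcom terminology, and the calculation assumes every port runs at its full 200 gigabits per second.

```python
# Figures quoted in the article; names here are illustrative assumptions.
PORTS = 256                 # 200-gig ethernet ports on the Tomahawk 5
PORT_SPEED_GBPS = 200       # gigabits per second per port
TOMAHAWK4_TBPS = 25.6       # bandwidth of the previous-generation chip


def aggregate_tbps(ports: int, port_speed_gbps: int) -> float:
    """Combined bandwidth in terabits per second (1 Tbps = 1000 Gbps)."""
    return ports * port_speed_gbps / 1000


tomahawk5_tbps = aggregate_tbps(PORTS, PORT_SPEED_GBPS)
generation_ratio = tomahawk5_tbps / TOMAHAWK4_TBPS

print(f"Tomahawk 5 aggregate: {tomahawk5_tbps} Tbps")
print(f"Ratio to Tomahawk 4: {generation_ratio}x")
```

In other words, 256 ports at 200 gigabits per second account for the full 51.2 terabits per second, twice the 25.6 terabits per second of the Tomahawk 4.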
“The big cloud guys want this,” said Velaga of the pivot to ethernet from InfiniBand. “When you have these massive clouds with a lot of buying power, they have shown they are capable of forcing a vendor to disaggregate, and that is the momentum that we are riding,” he said. “All of these clouds really do not want this, and they are insisting that the only way the GPU can be sold into them is with a standard NIC interface that can transmit over an ethernet.”