Nvidia’s top-of-the-line chips are in high demand and short supply.
What’s new: There aren’t enough H100 graphics processing units (GPUs) to meet the crush of demand brought on by the vogue for generative AI, VentureBeat reported.
Bottleneck: Cloud providers began having trouble finding GPUs earlier this year, and the shortfall has since spread to AI companies large and small. SemiAnalysis, a semiconductor market research firm, estimates that the chip will remain sold out into 2024.
- TSMC, which fabricates Nvidia’s designs, can produce only so many H100s. Its high-end chip packaging technology, which is shared among Nvidia, AMD, and other chip designers, currently has limited capacity. The manufacturer expects to double that capacity by the end of 2024.
- Nvidia executive Charlie Boyle downplayed the notion of a shortage, saying that cloud providers had presold much of their H100 capacity. As a result, startups that need access to thousands of H100s to train large models and serve a sudden swell of users have few options.
- An individual H100 with memory and a high-speed interface originally retailed for around $33,000. Second-hand units now cost between $40,000 and $51,000 on eBay.
Who’s buying: Demand for H100s is hard to quantify. Large AI companies and cloud providers may need tens of thousands to hundreds of thousands of them, while AI startups may need hundreds to thousands.
- The blog gpus.llm-utils.org ballparked current demand at around 430,000 H100s, which amounts to roughly $15 billion in sales (the arithmetic is sketched after this list). The author said the tally is a guess based on projected purchases by major AI companies, AI startups, and cloud providers. It omits Chinese companies and may double-count chips purchased by cloud providers and processing purchased by cloud customers.
- Chinese tech giants Alibaba, Baidu, ByteDance, and Tencent ordered $5 billion worth of Nvidia chips, the bulk of them to be delivered next year, the Financial Times reported.
- CoreWeave, a startup cloud computing provider, ordered between 35,000 and 40,000 H100s. It has a close relationship with Nvidia, which invested in its recent funding round, and it secured a $2.3 billion loan, collateralized by H100 chips, to finance construction of data centers outfitted for AI workloads.
- Machine learning startup Inflection AI plans to have 22,000 H100s by December.
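To make the $15 billion figure above concrete, here is a back-of-envelope sketch. The average unit price is our assumption, chosen between the roughly $33,000 list price and current second-hand prices; it is not a number from gpus.llm-utils.org.

```python
# Back-of-envelope check of the ~$15 billion demand estimate.
estimated_units = 430_000        # blog's ballpark of current H100 demand
assumed_price_per_unit = 35_000  # USD, assumed average (not from the blog)

implied_sales = estimated_units * assumed_price_per_unit
print(f"Implied sales: ${implied_sales / 1e9:.2f} billion")  # ~$15 billion
```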
Behind the news: Nvidia announced the H100 early last year and began full production in September. The H100 trains models about 2.3 times faster than its predecessor, the A100, and runs inference about 3.5 times faster.
Why it matters: Developers need these top-of-the-line chips to train high-performance models and deploy them in cutting-edge products. At a time when AI is white-hot, a dearth of chips could slow the pace of innovation.
We’re thinking: Nvidia’s CUDA software, which undergirds many deep learning software packages, gives the company’s chips a significant advantage. However, AMD’s open source ROCm is making great strides, and its MI250 and upcoming MI300-series chips appear to be promising alternatives. An open software infrastructure that made it easy to choose among GPU providers would benefit the AI community.
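On that last point, frameworks already offer some hardware portability. Below is a minimal sketch, assuming PyTorch: ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda interface (via HIP), so vendor-agnostic code like this runs unchanged on Nvidia or AMD accelerators.

```python
import torch

# Select whatever accelerator is present; on ROCm builds of PyTorch,
# torch.cuda.is_available() reports AMD GPUs as well.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
y = model(x)  # runs on whichever device was found, or falls back to CPU
print(device, y.shape)
```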