Elon Musk plans to build a supercomputer using Nvidia's semiconductor chips

The Tesla and SpaceX CEO wants to get it running by the fall of 2025. His artificial intelligence startup xAI could partner with Oracle to develop the computer

Naandika Tripathi
Published: May 27, 2024 04:32:14 PM IST
Updated: Oct 25, 2024 01:19:49 PM IST

Elon Musk. Image: Getty Images

Tesla and SpaceX CEO Elon Musk recently told investors that his artificial intelligence (AI) startup xAI will build a supercomputer to power the next version of its AI chatbot Grok. The startup will need 100,000 semiconductor chips to train and run the model, The Information reported on May 25.

Musk said he wants to get this supercomputer running by the fall of 2025, adding that xAI could partner with Oracle to develop it. The startup has reportedly been talking with Oracle about spending $10 billion to rent cloud servers. xAI is already Oracle's largest H100 customer, using over 15,000 of the AI chips made by Nvidia. Musk's Tesla, too, uses Nvidia-powered supercomputers to produce its electric vehicles.

The xAI supercomputer, which Musk has labelled a “gigafactory of compute”, would string 100,000 chips together into a single, massive machine. When completed, the connected cluster of chips, Nvidia's flagship H100 graphics processing units (GPUs), would be at least four times the size of the biggest GPU clusters that exist today, The Information stated.

Supercomputers operate at extremely high speeds relative to all other computers and have been essential in advancing the boundaries of science over the years. They can be used to create sophisticated AI models that learn from trillions of examples, speak different languages, analyse text, images, and video together, create augmented reality tools, and more. Other applications include genomic sequencing, space exploration, weather forecasting and climate research, and nuclear fusion research.

Elon Musk founded xAI in July 2023 as a challenger to Microsoft-backed OpenAI and Alphabet's Google. On May 26, xAI raised $6 billion in Series B funding, reaching a post-money valuation of $24 billion. The funding round was backed by investors Andreessen Horowitz and Sequoia Capital, among others, the company said in a blog post.

The money will be used to take xAI's first products to market, build advanced infrastructure, and accelerate research and development of future technologies, xAI said. Following this, Musk said on X, “There will be more to announce in the coming weeks.”

Nvidia's H100 GPUs lead the data centre chip market for AI and are difficult to obtain due to high demand. The company already has the world's top tech companies, including Microsoft and Meta, queued up for its new Blackwell chip.

Back home, data centre startup Yotta placed an order for 16,000 H100 chips in September 2023, an order that has since grown to include the newly announced Blackwell AI-training GPU. The first batch of 4,000 chips, Nvidia H100 Tensor Core GPUs, arrived in March. The Mumbai-based venture will offer managed cloud services along with the ability for enterprises to use Yotta's cloud to train large language models (LLMs) and build applications like OpenAI's ChatGPT.

"When all the 16,000 H100 will be connected through the InfiniBand network, it will become the ninth or tenth largest supercomputer in the world," Sunil Gupta, co-founder and CEO of Yotta, tells Forbes India. ChatGPT-4 was trained on 12,000 A100 chips. H100 is four times more powerful than that. "So essentially, a GPT-4, which is the largest LLM to date in the world, was trained on 3,000 equivalent H100s," adds Gupta.

Right now, India lags in the compute available to it, explains Akshara Bassi, senior research analyst at Counterpoint Technology Market Research. “We have about 25 petaflops of supercomputing capacity. This is very low. The government is trying to expand this. Recently, they put aside $1.2 billion to develop computing infrastructure. But that's quite low in comparison to the investments made by other countries.”
