xAI
xAI Colossus supercomputer will soon run 500k GPUs
Elon Musk, founder and CEO of xAI, has announced a massive upgrade to the Colossus supercomputer cluster, which will soon reach 500k installed GPUs, adding more power and performance for the AI chatbot Grok.
Colossus, xAI’s supercomputer cluster in Memphis, Tennessee, came online in October 2024 with 100,000 Nvidia H100 GPUs. It is the world’s largest AI supercomputer; it was built within 122 days and took only 19 days to start training the next-gen Grok model family.
At the Grok 3 model launch in January, xAI confirmed that it had added another 100k GPUs to the cluster, doubling the GPU count within 92 days of completing the first phase.
Source – xAI
This year, the company has added 30k GB200 GPUs, bringing the total to 230k GPUs at Colossus, all fully operational for training Grok. This constitutes Colossus 1; inference is handled by xAI’s cloud providers.
Colossus 2 will add 270k GB200 and GB300 GPUs to the existing cluster, bringing the total GPU count available for training to 500k. The xAI leader confirmed that Colossus 2 will come online within a few weeks.
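The GPU counts reported above can be checked with a quick tally. The following is a minimal Python sketch, not anything published by xAI: the figures are the ones quoted in this article, while the phase labels and the `running_totals` helper are illustrative names chosen here.

```python
# GPU counts as reported in this article, in thousands of GPUs.
# Phase labels are illustrative, not official xAI names.
phases = [
    ("Colossus 1, initial H100 build", 100),
    ("Colossus 1, expansion within 92 days", 100),
    ("Colossus 1, GB200 addition", 30),
    ("Colossus 2, GB200 and GB300 (planned)", 270),
]


def running_totals(phases):
    """Return (label, added, cumulative) tuples, one per phase."""
    total = 0
    rows = []
    for label, added in phases:
        total += added
        rows.append((label, added, total))
    return rows


for label, added, total in running_totals(phases):
    print(f"{label}: +{added}k -> {total}k GPUs")
```

The cumulative figure after the third phase matches the 230k GPUs reported for Colossus 1, and the final figure matches the 500k in the headline.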
The capacity added by Colossus 2 will support training of a next-gen coding model, a video generation model, and upcoming Grok family models.
