Databricks switches to AMD GPUs to boost LLM training


Databricks is switching to AMD GPUs to boost its large language model (LLM) training capabilities.

Last year, Databricks partnered with AMD to use AMD’s 3rd Gen EPYC Instance processors. Databricks’ subsequent acquisition of MosaicML, a company that uses AMD MI250 GPUs for AI model training, reinforced its commitment to AMD hardware.

AMD GPUs have been gaining traction in the AI community, with startups like Lamini and Moreh adopting AMD MI210 and MI250 systems for custom LLMs. Lamini recently disclosed that it runs its LLMs on AMD’s Instinct GPUs. Moreh trained a language model with 221 billion parameters using 1200 AMD MI250 GPUs and has received a $22 million investment.

The move reflects AMD’s growing strength in the GPU space and the potential of its MI250 and MI300X GPUs for accelerating AI workloads. Databricks reports a 1.13x improvement in training performance with ROCm 5.7 and FlashAttention-2 compared to its previous results with ROCm 5.4 and FlashAttention.

Databricks also trained MPT-1B and MPT-3B models from scratch on 64 MI250 GPUs, demonstrating the stability and scalability of AMD’s hardware and software stack.

The sources for this piece include an article in AnalyticsIndiaMag.
