Meta unveils new AI-powered LLaMA models

February 27, 2023


Meta has announced a new large language model (LLM) that can run on a single graphics processing unit (GPU) rather than a cluster of GPUs. According to Meta, LLaMA-13B can outperform OpenAI’s GPT-3 model despite being “10x smaller.”

LLaMA is a collection of language models ranging in size from 7 billion to 65 billion parameters. By comparison, OpenAI’s GPT-3 model, which serves as the foundation for ChatGPT, has 175 billion parameters. LLaMA is not a chatbot in the traditional sense; it is a research tool that, according to Meta, will likely help solve problems with AI language models. It was trained on publicly available datasets such as Common Crawl, Wikipedia, and C4, which means the company could potentially open source the model and its weights.
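To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch. The parameter counts come from this article; the 2 bytes per parameter (16-bit weights) is an assumption made for illustration and is not a figure from Meta. The estimate covers the weights only, not the extra memory needed to actually run a model.

```python
# Parameter counts mentioned in the article, in billions.
LLAMA_SIZES_B = {"LLaMA-7B": 7, "LLaMA-13B": 13, "LLaMA-33B": 33, "LLaMA-65B": 65}
GPT3_PARAMS_B = 175

# Assumption for illustration: 16-bit weights, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2


def size_ratio(params_b, reference_b=GPT3_PARAMS_B):
    """Return how many times smaller a model is than the reference model."""
    return reference_b / params_b


def weight_memory_gb(params_b, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_b * bytes_per_param


for name, params_b in LLAMA_SIZES_B.items():
    print(
        f"{name}: {size_ratio(params_b):.1f}x smaller than GPT-3, "
        f"about {weight_memory_gb(params_b)} GB of weights"
    )
```

By this estimate LLaMA-13B is about 13.5 times smaller than GPT-3, which is consistent with the rounded “10x smaller” claim.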

According to Meta, smaller models trained on more tokens (word fragments) are easier to retrain and fine-tune for specific potential product use cases. Accordingly, LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while LLaMA 7B, the smallest model, was trained on one trillion tokens.
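One way to read “smaller models trained on more tokens” is to divide training tokens by parameter count. The sketch below does that using only the figures quoted in this article; the tokens-per-parameter ratio is an illustrative measure chosen here, not a metric Meta is reported to have published.

```python
# (parameters, training tokens) for the models whose token counts the article gives.
TRAINING_RUNS = {
    "LLaMA-7B": (7e9, 1.0e12),
    "LLaMA-33B": (33e9, 1.4e12),
    "LLaMA-65B": (65e9, 1.4e12),
}


def tokens_per_parameter(params, tokens):
    """Return the number of training tokens seen per model parameter."""
    return tokens / params


for name, (params, tokens) in TRAINING_RUNS.items():
    ratio = tokens_per_parameter(params, tokens)
    print(f"{name}: about {ratio:.0f} training tokens per parameter")
```

On these numbers, the smallest model sees far more training tokens per parameter than the largest one.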

LLaMA competes with similar offerings from rival AI labs DeepMind, Google, and OpenAI. LLaMA-13B is also said to outperform GPT-3 when measured across eight standard “common sense reasoning” benchmarks (BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC Easy, ARC Challenge, and OpenBookQA) while running on a single GPU. In contrast to the data center requirements of GPT-3 derivatives, LLaMA-13B paves the way for ChatGPT-like performance on consumer-level hardware in the near future.

“Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field,” said Meta in its official blog.

Meta refers to its LLaMA models as “foundational models,” implying that the company intends for the models to serve as the foundation for future, more refined AI models built on the technology, similar to how OpenAI built ChatGPT on a foundation of GPT-3. LLaMA, according to the company, will be useful in natural language research and potentially power applications such as “question answering, natural language understanding or reading comprehension, understanding capabilities and limitations of current language models.”

The sources for this piece include an article in Ars Technica.

TND Newsdesk
