Stanford researchers make budget-friendly ChatGPT-like AI


Stanford researchers fine-tuned a seven-billion-parameter variant of Meta’s recently announced LLaMA model using 52,000 instruction-following demonstrations generated by OpenAI’s GPT-3.5 (text-davinci-003).

The Stanford group began with Meta's LLaMA 7B, the smallest and cheapest of the LLaMA models Meta has released to researchers. Although pre-training on a trillion tokens gives the base model some capability, it falls far behind ChatGPT on most tasks: much of the value and competitive advantage of OpenAI's GPT models comes from the extensive time and human effort OpenAI invests in post-training.

According to the researchers, the team used these AI-generated instructions to train Alpaca 7B, a language model that exhibits many behaviours similar to GPT-3.5. In a blind comparison on inputs from the self-instruct evaluation set, the two models performed comparably.

With the LLaMA 7B model in hand, the Stanford team fed text-davinci-003 a seed set of 175 human-written instruction/output pairs and asked it to generate more examples in the same style and format, 20 at a time, automatically through the OpenAI API. This produced 52,000 sample demonstrations for post-training. The team then used that data to fine-tune the LLaMA model, which took about three hours on eight cloud A100 GPUs with 80 GB of memory each and cost under $100.
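For readers who want a sense of how such a pipeline fits together, here is a minimal Python sketch of a self-instruct-style generation loop. It assumes the pre-1.0 openai client; the file names, prompt wording, and parsing helper are illustrative placeholders, not code from the Alpaca repository.

```python
# Minimal sketch of a self-instruct-style generation loop, assuming the
# pre-1.0 "openai" Python client. File names, the prompt wording, and the
# parser below are illustrative; they are not taken from the Alpaca code.
import json
import random

import openai

openai.api_key = "sk-..."  # your OpenAI API key

# Start from the human-written seed instruction/output pairs.
with open("seed_tasks.jsonl") as f:
    seed_tasks = [json.loads(line) for line in f]


def build_prompt(examples):
    """Show a few seed tasks and ask the model for 20 new ones."""
    header = ("Come up with 20 diverse task instructions, inputs, and "
              "outputs in the same style and format as these examples:\n\n")
    body = "\n\n".join(
        f"Instruction: {t['instruction']}\n"
        f"Input: {t.get('input', '')}\n"
        f"Output: {t['output']}"
        for t in examples
    )
    return header + body + "\n\nNew tasks:\n"


def parse_tasks(text):
    """Naive parser: split a completion back into task records."""
    tasks = []
    for block in text.split("\n\n"):
        fields = dict(
            line.split(": ", 1) for line in block.splitlines() if ": " in line
        )
        if "Instruction" in fields and "Output" in fields:
            tasks.append({
                "instruction": fields["Instruction"],
                "input": fields.get("Input", ""),
                "output": fields["Output"],
            })
    return tasks


generated = []
while len(generated) < 52_000:
    prompt = build_prompt(random.sample(seed_tasks, 3))
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=3072,
        temperature=1.0,
    )
    generated.extend(parse_tasks(resp.choices[0].text))

with open("alpaca_data.json", "w") as f:
    json.dump(generated, f)
```

Requesting 20 tasks per completion amortizes the prompt tokens across many examples, which is what keeps the API bill for the data-generation step so low.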

Because it was trained to imitate a model with the same weaknesses, Alpaca suffers from the familiar problems of language models, including hallucination, toxicity, and stereotyping. The researchers note that hallucination in particular appears to be a common failure mode for Alpaca, even compared with text-davinci-003.

The sources for this piece include an article in The Decoder.
