Open source models race to beat GPT-4 on coding tasks

Two open source models, WizardCoder 34B by Wizard LM and CodeLlama-34B by Phind, have been released in the last few days. Both models are based on Code Llama, a large language model (LLM) developed by Meta.

Wizard LM claims that WizardCoder 34B outperformed GPT-4, ChatGPT-3.5, and Claude-2 on HumanEval, a benchmark for evaluating the coding abilities of LLMs. However, Wizard LM appears to have compared WizardCoder 34B’s score with the HumanEval result for GPT-4’s March version rather than the August version, which scored 82%.

Phind also claims that its fine-tuned versions, CodeLlama-34B and CodeLlama-34B-Python, achieved pass rates of 67.6% and 69.5% on HumanEval, respectively. These numbers are roughly on par with the score reported for GPT-4’s March version, though still below the 82% cited for the August version.
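For readers unfamiliar with how a HumanEval pass rate is computed, the following is a minimal sketch, not the code used by Phind or Wizard LM. It uses the unbiased pass@k estimator commonly applied to HumanEval: for each problem, n code samples are generated, c of them pass the unit tests, and the per-problem scores are averaged. The article does not say which k the quoted figures use; the sketch assumes pass@1. The function names and the sample data are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k draws include a pass.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_rate(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over all problems; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: four problems, one sample each, three solved.
illustrative_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
score = benchmark_pass_rate(illustrative_results, k=1)
print(f"pass@1 = {score:.1%}")
```

With a single sample per problem, pass@1 reduces to the fraction of problems solved, which is how a figure such as 67.6% would typically be read.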

The open source community is said to be obsessed with beating GPT-4, which is widely treated as the ultimate benchmark for LLMs. Meta, for its part, is building models for specific tasks and trying to surpass GPT-4 on those tasks.

The HumanEval benchmark may not be a perfect measure of the coding abilities of LLMs, however. It does not capture skills such as code explanation, docstring generation, code infilling, answering Stack Overflow questions, and writing tests.

OpenAI, for its part, has not released any details about the training data or evaluation metrics used for GPT-4. This has led some to speculate that OpenAI is guarding its trade secrets in order to maintain its lead in the LLM market.

The sources for this piece include an article in AnalyticsIndiaMag.
