Hugging Face partners with ServiceNow for StarCoder and StarCoderBase

May 15, 2023

Hugging Face and ServiceNow’s BigCode partnership is making progress on large language models (LLMs) for code with the release of StarCoder and StarCoderBase, which were developed with an emphasis on ethical principles.

StarCoder and StarCoderBase were trained on permissively licensed data from GitHub, which includes over 80 programming languages, Git commits, GitHub issues, and Jupyter notebooks.

StarCoder was trained on 1 trillion tokens and has an 8,192-token context window. It generates realistic code and works with a variety of programming languages. It is distributed under the OpenRAIL-M license, which places legal restrictions on its usage and modification. Like other LLMs, StarCoder can generate inaccurate or biased output, and it is critical to recognize these limitations and work toward overcoming them.

StarCoderBase surpasses other open Code LLMs on numerous prominent programming benchmarks and is on par with, if not better than, closed models such as OpenAI’s code-cushman-001. Its context length, which exceeds 8,000 tokens, enables it to process more input than any other open LLM now available.

The researchers also released the model weights, including intermediate checkpoints, under the OpenRAIL license. Furthermore, all training and preprocessing code is released under the Apache 2.0 license. Additional materials made available include a comprehensive evaluation framework for code LLMs, a new dataset for training and assessing PII-removal methods, and a tool for tracing generated code back to its source in the training dataset.

The sources for this piece include an article in MarkTechPost.


TND Newsdesk
