Internet data unknowingly contributes to the training of chatbots

April 25, 2023

The internet and the enormous volume of data it has generated have strongly influenced the advancement of artificial intelligence (AI). According to a recent Washington Post investigation, the AI industry has trained its neural networks on a publicly available dataset spanning 30 years of web publication.

The investigation found that online contributions such as blogs, web pages, and social media threads were used to train AI chatbots without their authors' knowledge. In effect, people unintentionally created a large archive of human expression, which allows AI models such as ChatGPT to perform astounding sentence-completion tasks.

The investigation lets readers enter any internet domain name and see how much it contributed to a specific AI training dataset. The researchers examined a dataset that contained more than 500,000 personal blogs, which accounted for 3.8 percent of its total “tokens,” the small chunks of text a model processes. However, because some cultures, groups, and subjects may be oversampled while others are neglected, AI training data may carry the biases, limitations, and toxic parts of internet culture.

The advances in AI technology that we see today rest on the immense quantity of information, thoughts, and emotions that people have posted on the internet, a body of material that can be compared to digital stockpiles and landfills.

The sources for this piece include an article in Axios.

Tags
AI

TND Newsdesk
