Internet data unknowingly contributes to the training of chatbots

The internet and the enormous amount of data it has generated have had a tremendous influence on the advancement of artificial intelligence (AI). According to a recent Washington Post investigation, the AI industry trained its neural networks using a publicly available dataset spanning 30 years of web publication.

The investigation found that our online contributions, such as blogs, web pages, and social media threads, helped AI chatbots learn without their authors knowing it. In the process, people unintentionally built a vast archive of human expression, which allows AI models such as ChatGPT to perform astounding sentence-completion tasks.

The investigation lets users enter any internet domain name and see how much it contributed to a specific AI training dataset. The researchers examined a dataset that included more than 500,000 personal blogs, which accounted for 3.8 percent of its total “tokens.” However, because some cultures, groups, and subjects may be oversampled while others are neglected, AI training data may carry the biases, limitations, and toxic parts of internet culture.
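To make the idea of a domain's share of "tokens" concrete, here is a minimal Python sketch of the arithmetic involved. It is not the method used in the investigation: the function name `token_share_by_domain`, the sample URLs, and the whitespace-based token count are all assumptions made for illustration. Real training pipelines count subword tokens with a dedicated tokenizer.

```python
from collections import Counter
from urllib.parse import urlparse


def token_share_by_domain(documents):
    """Return each domain's percentage of all tokens in `documents`.

    `documents` is an iterable of (url, text) pairs. Splitting on
    whitespace is a stand-in for a real tokenizer (an assumption).
    """
    counts = Counter()
    for url, text in documents:
        domain = urlparse(url).netloc.lower()
        counts[domain] += len(text.split())

    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: 100.0 * n / total for domain, n in counts.items()}


# Made-up sample data, not taken from any real dataset.
sample = [
    ("https://blog.example.com/post-1", "one two three four five six"),
    ("https://news.example.org/story", "one two three"),
    ("https://blog.example.com/post-2", "one two three"),
]

shares = token_share_by_domain(sample)
for domain, pct in sorted(shares.items()):
    print(f"{domain}: {pct:.1f}% of tokens")
```

A figure such as the 3.8 percent reported for personal blogs would come from the same kind of ratio, computed over a whole category of domains rather than a single one.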

The developments in AI technology that we witness today rest on the immense quantity of information, thoughts, and emotions that people have put on the internet, a body of material that may be compared to digital stockpiles and landfills.

The sources for this piece include an article in Axios.
