The risks of AI language models and how they can be misused

Share post:

Artificial Intelligence (AI) language models pose serious risks to the public as the AI industry races towards the future without considering present risks. Last week, industry insiders debated an open letter signed by Elon Musk and other industry heavyweights who warned about the “existential risk” posed by AI to humanity. They called for a six-month moratorium on developing any technology more powerful than GPT-4.

However, critics of the letter argue that the focus should be on the present, as AI is already causing real harm through biased decision-making, overburdened content moderators, and high computing power that pollutes the environment. AI models used by tech companies for generating code or virtual assistants pose different risks altogether, leading towards a spammy and scammy internet.

The issue with the current AI language models is that they can be easily manipulated and misused, and tech companies are embedding these deeply flawed models into various products without considering the risks. Hackers can take advantage of these models to turn them into “a super-powerful engine for spam and phishing,” according to Florian Tramèr, an assistant professor of computer science at ETH Zurich who specializes in computer security, privacy, and machine learning.

For example, attackers can hide a prompt in a message that an AI-powered virtual assistant opens, asking it to send the attacker the victim’s contact list or emails, or to spread the attack to everyone on the recipient’s contact list. Such attacks would be invisible to the human eye and automated, leading to disastrous consequences if the virtual assistant has access to sensitive information such as banking or health data.

Another risk is that these models could be compromised before they are even deployed. AI models are trained on vast amounts of data scraped from the internet, including software bugs. OpenAI found out the hard way when a bug scraped from an open-source dataset caused ChatGPT to leak the chat histories of bot users. Florian Tramèr’s team found that it was cheap and easy to “poison” data sets with content they had planted, which were then scraped into an AI language model. By planting enough nefarious content in the training data, malicious actors can influence the model’s behavior and outputs forever.

The sources for this piece include an article in TechnologyReview.

SUBSCRIBE NOW

Related articles

Anthropic’s AI Agents Take a Big Leap: Direct Computer Control

Anthropic has unveiled a groundbreaking capability for its Claude large language model: the ability to directly interact with...

AI Agents Could Surpass Humans as Primary App Users by 2030, Accenture Predicts

AI agents are poised to transform the way we interact with digital systems, potentially becoming the primary users...

Target’s new AI is aimed at employees

Target is introducing a new generative artificial intelligence tool aimed at enhancing the efficiency of its store employees...

The good and the bad of AI generated code

Generative AI tools are transforming the coding landscape, making both skilled and novice developers more efficient. However, the...

Become a member

New, Relevant Tech Stories. Our article selection is done by industry professionals. Our writers summarize them to give you the key takeaways