GPT-4 breaks AI-Guardian defense with natural language prompts

Nicholas Carlini, a Google scientist, has demonstrated how OpenAI’s GPT-4 large language model can be used to circumvent AI-Guardian, a defense against adversarial attacks on machine learning models.

Carlini used GPT-4 to write code capable of identifying the mask that AI-Guardian uses to detect adversarial samples. This enabled Carlini to create adversarial examples that could bypass the defense.

By directing GPT-4 to create an attack method and explain how it works, Carlini showed how the chatbot could compromise AI-Guardian’s detection capabilities. Specifically, GPT-4 produced Python code that manipulates images without triggering AI-Guardian’s detection. The resulting attack reduced AI-Guardian’s robustness from 98 percent to 8 percent.

The study shows that machine learning systems, such as image recognition models, are vulnerable to adversarial examples: inputs crafted to mislead the model’s classification. AI-Guardian’s technique relies on establishing a backdoor to reject adversarial input. By recovering the mask used to identify adversarial samples, Carlini undermined that technique and was able to design effective adversarial attacks.
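To make the idea of an adversarial example concrete, here is a minimal sketch in Python. It is a toy illustration only: the linear classifier, its weights, and the helper names `predict` and `perturb` are hypothetical and have nothing to do with AI-Guardian or with the code GPT-4 produced for Carlini. It simply shows how a small, bounded change to every input feature can flip a model’s decision.

```python
# Toy illustration only: a hand-written linear classifier, not AI-Guardian
# and not the attack from Carlini's work. All names and numbers are hypothetical.

def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def perturb(weights, x, label, epsilon):
    """Move each feature by at most epsilon in the direction that
    pushes the score away from the current label."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]

weights = [0.5, -0.25, 0.75, -0.5]
bias = 0.0
x = [0.6, 0.5, 0.5, 0.6]  # the "clean" input

original_label = predict(weights, bias, x)
x_adv = perturb(weights, x, original_label, epsilon=0.2)
adversarial_label = predict(weights, bias, x_adv)

# Largest change applied to any single feature.
max_change = max(abs(a - b) for a, b in zip(x, x_adv))

print(original_label, adversarial_label, round(max_change, 3))
```

In this sketch no feature moves by more than 0.2, yet the predicted label changes. Attacks on real image classifiers rely on the same principle at much larger scale, and a defense such as AI-Guardian aims to detect or reject inputs of this kind.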

“This work shows that GPT-4 can be used as a powerful tool for attacking machine learning models,” said Carlini. “It also raises concerns about the security of AI-Guardian and other similar defenses.”

The sources for this piece include an article in The Register.
