Earlier this month at the annual Defcon computer hackers conference, Anthropic, OpenAI, and other artificial intelligence developers invited the world’s best cyber-invaders to be their most creative in attacking and penetrating current generative AIs.

For the best hack exposing the bots’ weak points and vulnerabilities, one contestant pocketed $4 million. Others shared another $16 million in prize money.

“You can’t test everything and the only way to assess these models is to try things and see what happens,” Defcon co-organizer Sven Cattell told The Wall Street Journal.

AI’s developers know that for all the dazzle their products deliver, today’s generative AI models are as full of holes as a teenager’s jeans.

Using a beta-test version of ChatGPT, cybersecurity researcher Johann Rehburger coaxed the bot to read a secret email he wrote, summarize it, and broadcast it across the Internet. 

A hacker could have done the same to steal proprietary messages, either for nefarious purposes or as a whistleblower, Rehburger pointed out.

OpenAI, ChatGPT’s developer, thanked Rehburger for spotting the bug and said the company has corrected the hole that allowed the theft to take place.

Rehburger’s breach highlights generative AI’s susceptibility to new kinds of hacks.

One version is “data poisoning,” in which an infiltrator can insert falsehoods or misdirection into the data used to train an AI, ensuring that the bot would deliver incorrect information or conclusions.  

Another category of assault is a “prompt-injection attack.” Using carefully crafted instructions, a hacker can trick a bot into revealing how it works. That can allow the hacker to reprogram it, including leading it to forget the guidelines it was given to keep certain information from being revealed.

In April, Google built AI into its VirusTotal service that analyzes software for malicious bugs and hacks that have been inserted. Within a day, a hacker named Eatscrayon had penetrated the AI and tweaked its code so the AI reported the malicious code was “able to create puppies.”

“With AI, you need to pay attention to more than just security vulnerabilities,” Cattell said. “The harms are far-reaching and harder to diagnose and interpret.”

TRENDPOST: Today’s AIs could be compared to the early automobiles: it was obvious they were going to change the world, but the tires went flat on a regular basis, the engines were fussy, and they were as prone to break down as they were to roll down the road.

Like cars, AIs will quickly become hardier and more sophisticated. That will lead to an entirely new level of hacking, in which attacker AIs probe defender AIs and learn from each other’s weaknesses to improve their game in an endless struggle.

Skip to content