'Jailbreaking' AI services like ChatGPT and Claude 3 Opus is much easier than you think
Scientists from artificial intelligence (AI) company Anthropic have identified a potentially dangerous flaw in widely used large language models (LLMs) like ChatGPT and Anthropic’s own Claude 3 chatbot.
Dubbed "many-shot jailbreaking," the hack takes advantage of "in-context learning," in which the chatbot learns from the information provided in a text prompt written out by a user, as outlined in research published in 2022. The scientists outlined their findings in a new paper uploaded to the sanity.io cloud repository and tested the exploit on Anthropic's Claude 2 AI chatbot.
People could use the hack to force LLMs to produce dangerous responses, the study concluded, even though such systems are trained to prevent this. That's because many-shot jailbreaking bypasses the built-in security protocols that govern how an AI responds when, say, asked how to build a bomb.
LLMs like ChatGPT rely on the "context window" to process conversations. This is the amount of information the system can process as part of its input. A longer context window admits more input text, which gives the AI more to learn from mid-conversation and leads to better responses.
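As a rough illustration of what a context window limit means in practice, the Python sketch below keeps only the most recent messages that fit within a fixed budget. It is a simplified sketch, not how any particular chatbot works: the helper names `count_tokens` and `fit_to_window` are hypothetical, and counting whitespace-separated words is only an assumed stand-in for a real tokenizer.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Assumed stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(messages: List[str], window: int) -> List[str]:
    """Keep the most recent messages whose combined token count fits in `window`."""
    kept: List[str] = []
    used = 0
    # Walk backwards from the newest message and stop once the budget is exceeded.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > window:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "User: What is a context window?",
    "Assistant: It is the amount of input text a model can process at once.",
    "User: And why does its size matter?",
]

small = fit_to_window(conversation, window=10)   # only the latest message fits
large = fit_to_window(conversation, window=100)  # the whole conversation fits
```

With the small budget only the final question survives, while the larger budget keeps the whole exchange, which mirrors the point that a longer window lets the system draw on more of the conversation.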
Context windows in AI chatbots are now hundreds of times larger than they were even at the start of 2023 — which means more nuanced and context-aware responses by AIs, the scientists said in a statement. But that has also opened the door to exploitation.
Duping AI into generating harmful content
The attack works by first writing out a fake conversation between a user and an AI assistant in a text prompt — in which the fictional assistant answers a series of potentially harmful questions.