'Master of deception': Current AI models already have the capacity to expertly manipulate and deceive humans

Published

May 24, 2024

Evan Walker

Artificial intelligence (AI) systems’ ability to manipulate and deceive humans could lead them to defraud people, tamper with election results and eventually go rogue, researchers have warned.

Peter S. Park, a postdoctoral fellow in AI existential safety at the Massachusetts Institute of Technology (MIT), and his colleagues have found that many popular AI systems, even those designed to be honest and useful digital companions, are already capable of deceiving humans, which could have huge consequences for society.

In an article published May 10 in the journal Patterns, Park and his colleagues analyzed dozens of empirical studies on how AI systems fuel and disseminate misinformation using “learned deception.” Learned deception occurs when AI technologies systematically acquire manipulation and deception skills.

They also explored the short- and long-term risks of manipulative and deceitful AI systems, urging governments to clamp down on the issue through more stringent regulations as a matter of urgency.

Deception in popular AI systems

The researchers found this learned deception in CICERO, an AI system developed by Meta to play the popular war-themed strategy board game Diplomacy. The game, set in the years before World War I, is typically played by up to seven people, who form and break military pacts.

Although Meta trained CICERO to be “largely honest and helpful” and not to betray its human allies, the researchers found CICERO was dishonest and disloyal. They describe the AI system as an “expert liar” that betrayed its comrades and performed acts of “premeditated deception,” forming pre-planned, dubious alliances that deceived players and left them open to attack from enemies.