Scientists Develop New Algorithm to Spot AI ‘Hallucinations’
An enduring problem with today’s generative artificial intelligence (AI) tools, like ChatGPT, is that they often confidently assert false information. Computer scientists call this behavior “hallucination,” and it’s a key barrier to AI’s usefulness.
Hallucinations have led to some embarrassing public slip-ups. In February, Air Canada was forced by a tribunal to honor a discount that its customer-support chatbot had mistakenly offered to a passenger. In May, Google was forced to make changes to its new “AI Overviews” search feature after the bot told some users that it was safe to eat rocks. And last June, two lawyers were fined $5,000 by a U.S. judge after one of them admitted he had used ChatGPT to help write a court filing. He came clean because the chatbot had added fake citations to the submission, pointing to cases that never existed.
But in good news for lazy lawyers, lumbering search giants, and errant airlines, at least some types of AI hallucinations could soon be a thing of the past. New research, published Wednesday in the peer-reviewed scientific journal Nature, describes a new method for detecting when an AI tool is likely to be hallucinating. The method can discern between correct and incorrect AI-generated answers roughly 79% of the time, about 10 percentage points better than other leading methods. The approach addresses only one of the several causes of AI hallucinations, and requires roughly 10 times more computing power than a standard chatbot conversation, but the results could pave the way for more reliable AI systems in the near future.
“My hope is that this opens up ways for large language models to be deployed where they can't currently be deployed – where a little bit more reliability than is currently available is needed,” says Sebastian Farquhar, an author of the study, who is a senior research fellow at Oxford University’s department of computer science, where the research was carried out, and a research scientist on Google DeepMind’s safety team. Of the lawyer who was fined for relying on a ChatGPT hallucination, Farquhar says: “This would have saved him.”
Hallucination has become a common term in the world of AI, but it is also a controversial one. For one, it implies that models have some kind of subjective experience of the world, which most computer scientists agree they do not. It also suggests that hallucinations are a solvable quirk, rather than a fundamental and perhaps ineradicable problem of large language models (different camps of AI researchers disagree on this point). Most of all, the term is imprecise, describing several different categories of error.
Farquhar’s team decided to focus on one specific category of hallucinations, which they call “confabulations.” That’s when an AI model spits out inconsistent wrong answers to a factual question, as opposed to giving the same wrong answer every time, a failure that is more likely to stem from problems with a model’s training data, a model lying in pursuit of a reward, or structural failures in a model’s logic or reasoning. It’s difficult to quantify what percentage of all AI hallucinations are confabulations, Farquhar says, but it’s likely to be large. “The fact that our method, which only detects confabulations, makes a big dent on overall correctness suggests that a large number of incorrect answers are coming from these confabulations,” he says.
The methodology
The method used in the study to detect whether a model is likely to be confabulating is relatively simple. First, the researchers ask a chatbot to generate a handful of answers (usually between five and 10) to the same prompt. Then, they use a different language model to cluster those answers based on their meanings. For example, “Paris is the capital of France” and “France’s capital city is Paris” would be assigned to the same group because they mean the same thing, even though the wording of each sentence is different. “France’s capital city is Rome” would be assigned to a different group.
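The clustering step itself is simple to sketch. The function below is illustrative only; the `same_meaning` callable is a hypothetical stand-in for the second language model that judges whether two answers express the same fact (for instance, by checking whether each answer implies the other).

```python
from typing import Callable

def cluster_by_meaning(
    answers: list[str],
    same_meaning: Callable[[str, str], bool],
) -> list[list[str]]:
    """Group sampled answers so that each group contains answers judged to
    share a meaning. `same_meaning` stands in for the second language model
    the researchers describe; any callable that judges two answers
    equivalent will do."""
    clusters: list[list[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(answer, cluster[0]):  # compare to a cluster representative
                cluster.append(answer)
                break
        else:
            clusters.append([answer])  # no existing cluster matched: start a new one
    return clusters
```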
The researchers then calculate a number that they call “semantic entropy” – in other words, a measure of how similar or different the meanings of each answer are. If the model’s answers all have different meanings, the semantic entropy score will be high, indicating that the model is confabulating. If the model’s answers all have identical or similar meanings, the score will be low, indicating that the model is giving a consistent answer and is therefore unlikely to be confabulating. (The answer could still be consistently wrong, but this would be a different form of hallucination, for example one caused by problematic training data.)
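As a rough illustration of the idea (not the paper's exact estimator), one simple way to turn those clusters into a score is to treat each cluster's share of the sampled answers as the probability of that meaning and take the entropy over clusters; many small clusters means high entropy.

```python
import math

def semantic_entropy(clusters: list[list[str]]) -> float:
    """Shannon entropy over meaning clusters: each cluster's share of the
    sampled answers is treated as the probability of that meaning."""
    total = sum(len(cluster) for cluster in clusters)
    entropy = 0.0
    for cluster in clusters:
        p = len(cluster) / total      # probability mass of this meaning
        entropy -= p * math.log(p)    # contributes 0 when p == 1
    return entropy

# Five samples that all share one meaning -> one cluster -> entropy 0.0 (consistent).
consistent = [["Paris is the capital of France"] * 5]
# Five samples with five different meanings -> five clusters -> entropy log(5) ≈ 1.61.
scattered = [["Paris"], ["Rome"], ["Berlin"], ["Madrid"], ["Lisbon"]]
print(semantic_entropy(consistent))  # 0.0
print(semantic_entropy(scattered))   # ~1.61
```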
The researchers said the method of detecting semantic entropy outperformed several other approaches for detecting AI hallucinations. Those methods included "naive entropy," which only detects whether the wording of a sentence, rather than its meaning, is different; a method called "P(True)," which asks the model to assess the truthfulness of its own answers; and an approach called "embedding regression," in which an AI is fine-tuned on correct answers to certain questions. Embedding regression is effective at ensuring AIs accurately answer questions about specific subject matter, but fails when different kinds of questions are asked. One significant difference between the new method and embedding regression is that the new method doesn’t require sector-specific training data: for example, it doesn’t require training a model to be good at science in order to detect potential hallucinations in answers to science-related questions. This means it works similarly well across different subject areas, according to the paper.
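To make the contrast concrete, here is a toy version of the naive-entropy baseline; because it keys on exact wording rather than meaning, three paraphrases of the same correct answer already look maximally uncertain to it (illustrative code, not the paper's implementation).

```python
import math
from collections import Counter

def naive_entropy(answers: list[str]) -> float:
    """Entropy over exact strings: paraphrases count as different answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

paraphrases = [
    "Paris is the capital of France",
    "France's capital city is Paris",
    "The capital of France is Paris",
]
# All three mean the same thing, so semantic entropy would score them 0.0,
# but naive entropy sees three distinct strings: log(3) ≈ 1.10.
print(naive_entropy(paraphrases))
```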
Farquhar has some ideas for how semantic entropy could begin reducing hallucinations in leading chatbots. He says it could in theory allow OpenAI to add a button to ChatGPT that users could click on an answer to get a certainty score, helping them judge whether a result is accurate. He says the method could also be built in under the hood of other tools that use AI in high-stakes settings, where trading off speed and cost for accuracy is more desirable.
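As a purely hypothetical sketch of that kind of integration, a deployment could compare the semantic entropy of a batch of sampled answers against a cutoff before surfacing a warning; the 0.7 threshold below is an arbitrary illustration, not a value from the paper.

```python
# Hypothetical usage sketch: map a semantic entropy score to a user-facing label.
ENTROPY_THRESHOLD = 0.7  # arbitrary illustrative cutoff

def certainty_label(entropy: float) -> str:
    """Flag high-entropy (inconsistent) answer sets as possible confabulations."""
    return "likely consistent" if entropy < ENTROPY_THRESHOLD else "possible confabulation"

print(certainty_label(0.0))   # likely consistent
print(certainty_label(1.6))   # possible confabulation
```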
While Farquhar is optimistic about the potential of their method to improve the reliability of AI systems, some experts caution against overestimating its immediate impact. Arvind Narayanan, a professor of computer science at Princeton University, acknowledges the value of the research but emphasizes the challenges of integrating it into real-world applications. "I think it's nice research … [but] it's important not to get too excited about the potential of research like this," he says. "The extent to which this can be integrated into a deployed chatbot is very unclear."
Narayanan notes that with the release of better models, the rates of hallucinations (not just confabulations) have been declining. But he’s skeptical the problem will disappear any time soon. “In the short to medium term, I think it is unlikely that hallucination will be eliminated. It is, I think, to some extent intrinsic to the way that LLMs function,” he says. He points out that, as AI models become more capable, people will try to use them for increasingly difficult tasks where failure might be more likely. “There's always going to be a boundary between what people want to use them for, and what they can work reliably at,” he says. “That is as much a sociological problem as it is a technical problem. And I don't think it has a clean technical solution.”