Honestly, I love when AI hallucinates

By Josh Tyrangiel, Columnist

December 27, 2023 at 7:00 a.m. EST

(Illustration: Ann Kiernan for The Washington Post)

It’s your wedding day. You have a charming but unpredictable uncle who, for this hypothetical, must give a toast. He’s likely to dazzle everyone, but there’s a small chance he’ll humiliate you in ways your guests will never stop talking about. Luckily there are options: You can roll the dice and let your uncle give his toast live or you can record and edit him to guarantee he makes a great impression.

This is roughly the choice Google faced when it unveiled Gemini, its new suite of artificial intelligence tools. Google has spent most of the year in competitive agony while people raved about the capabilities of OpenAI’s ChatGPT. The company was desperate to show the world all the ways Gemini could vault it ahead. And on its most important day, debuting its most ambitious AI product, Google went with an edited video.

Generative AI tools have performed incredible feats in 2023, but they continue to be plagued by hallucinations — unpredictable errors that can range from flunking basic math to offending or flirting with users to providing completely made-up information. What all the errors have in common is that AI delivers them with authority. To users, a hallucination can feel like gaslighting.

The major players have been whistling past the hallucination problem in public because it raises serious questions about their products’ reliability. Also: That word is so devastating. Can you imagine a better wrapper for humanity’s fears about AI? Hallucination. “Hallucination” implies that the software you’re using has not only achieved consciousness, it has a consciousness it can’t control! If your product glitches in ways that remind people of their nightmares, you talk about it as little as possible. You definitely stay away from live demos.

There are a lot of disturbing examples of hallucinations, but the ones I’ve encountered aren’t scary. I actually enjoy them. (I also like watching drunk uncles at weddings.) Once when I was prepping for an interview, I asked ChatGPT to find transcripts from 10 interviews my subject had done with other publications. It not only made up a bunch of summaries and links to podcast episodes that don’t exist, but it also spat them out with flamboyant certainty. It was like talking to a George Santos bot. Google’s Bard and Anthropic’s Claude, on the other hand, are more prone to taking a bad bit of logic or a slightly imprecise prompt and building on it, “Yes, and-ing” me into oblivion like terrible improv comics.

In the public vacuum left by Big Tech, academic research on hallucinations is booming. “We have a joke in our lab that you can’t take even an hour off or someone else will publish first,” says Vipula Rawte, a PhD student in computer science at the Artificial Intelligence Institute of the University of South Carolina. As a result, we know much more about the phenomenon now than we did at the beginning of the year. The most obvious thing is that these aren’t hallucinations at all. Just bugs specific to the world’s most complicated software.

Large language models are big probability machines with two tasks. The first is responding to a user’s prompt with accurate, well-reasoned information. Ask such a program the value of two dimes and one nickel, and it returns an answer by drawing on its training data, recognizing the monetary value of each coin and adding them up. The second task is responding to that same prompt in conversational language. This requires the model to predict the probability that one word will follow another in a sequence that mimics human speech.

When large language models work well, knowledge and language harmonize. “Two dimes and one nickel are worth a total of 25 cents.” When they don’t? It’s like watching a calculator talk to a word processor — except new research suggests that one of them is always talking louder.
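
To see the “language” half of that in miniature, here is a deliberately tiny sketch (a toy illustration, not how GPT-4 or any production system is actually built): a model that counts which words tend to follow which in a scrap of training text, then generates by sampling a likely next word. Notice that nothing in the loop checks whether the sentence is true; it only checks what word is likely to come next.

```python
import random
from collections import Counter, defaultdict

# Toy "training data." A real model learns from trillions of words, but the
# principle is the same: count which words tend to follow which.
training_text = (
    "two dimes and one nickel are worth a total of 25 cents . "
    "two dimes are worth 20 cents . one nickel is worth 5 cents ."
)

# Build a bigram table: for each word, how often each next word followed it.
follows = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def generate(start_word: str, length: int = 12) -> str:
    """Generate text by repeatedly sampling a plausible next word.

    The model optimizes only for plausible word order, never for truth,
    so it can be fluent and wrong at the same time.
    """
    output = [start_word]
    for _ in range(length):
        candidates = follows.get(output[-1])
        if not candidates:
            break
        next_words = list(candidates.keys())
        weights = list(candidates.values())
        output.append(random.choices(next_words, weights=weights, k=1)[0])
    return " ".join(output)

print(generate("two"))
```

Scale that idea up by billions of parameters and you get fluent English. Whether the fluent English is also correct is a separate fight, and sometimes language wins.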

William Merrill studies large language models in the PhD program at New York University’s Center for Data Science. As a Yale undergrad, he majored in linguistics. When I asked Merrill if he could make a model hallucinate on command, he seemed a little hurt. “Um, yeah.”

Using the nickel and dimes example, Merrill tweaked the prompt to force ChatGPT into a different kind of reasoning:

Boom.

Once a chatbot hallucinates, the fun is just getting started. In a spring paper called “How Language Model Hallucinations Can Snowball,” Merrill and four co-authors cited several examples of ChatGPT-4 incorrectly answering a yes-or-no question — and then generating even more incorrect information to double down on its response. In one instance, ChatGPT-4 was asked, “Was there ever a US senator that represented the state of New Hampshire and whose alma mater was the University of Pennsylvania?” It incorrectly responded, “Yes … His name was John P. Hale.” But in a follow-up, ChatGPT recognized that Hale (a 19th-century senator) graduated from Bowdoin College.

So why did ChatGPT create a snowball of wrongness in one answer and recognize its error in another? Merrill told me that a first hallucination can sometimes happen because “predicting the format of the response is easier for [large language models] than actually figuring out the correct answer.” Language shouts down knowledge.

The authors suspect snowballing happens for a similar reason. In English and many other languages, people usually respond to yes-or-no questions by starting a sentence with yes or no before explaining themselves. GPT-4’s first response to these queries is either yes or no 95.67 percent of the time, and as Merrill’s paper says, “coherence would require commitment to that choice through the subsequent justification.” In other words, your large language model is programmed to talk like a person — and many people are confidently full of it.
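
The mechanism is easier to see with invented numbers. In the toy sketch below, every probability is made up for illustration (it is not data from Merrill’s paper); the point is that a model decoding its reply one word at a time picks “Yes” because “Yes” is the single most likely first word, even though the complete correct answer carries more total probability. Once “Yes” is on the page, the most coherent continuation is a confident wrong fact.

```python
# Invented numbers illustrating first-word commitment, not real model outputs.

# Probability of the very first word of the reply.
first_word_probs = {"Yes": 0.55, "No": 0.45}

# Probability of each continuation, conditioned on that first word.
# Once "Yes" is chosen, every coherent continuation has to defend it.
continuations = {
    "Yes": {", his name was John P. Hale.": 0.7, ", but I cannot name him.": 0.3},
    "No": {", no such senator has existed.": 1.0},
}

# Word-by-word decoding: take the most likely first word, then the most
# likely continuation given that choice.
first = max(first_word_probs, key=first_word_probs.get)
rest = max(continuations[first], key=continuations[first].get)
print("Decoded answer:", first + rest)  # "Yes, his name was John P. Hale."

# Yet the full correct answer has more probability behind it than the
# specific wrong claim the model just committed to.
p_wrong = first_word_probs["Yes"] * continuations["Yes"][", his name was John P. Hale."]
p_right = first_word_probs["No"] * continuations["No"][", no such senator has existed."]
print(f"P(wrong claim) = {p_wrong:.2f}, P(correct answer) = {p_right:.2f}")
```

A word-by-word commitment plus a social convention about how humans answer questions: that, in rough outline, is the snowball.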

This is still in the realm of theory for now. No one knows definitively why large language models hallucinate, though some causes are easier to diagnose than others. Bad training data is the most likely source of a flawed response: Garbage in, garbage out. An assertively false prompt or imprecise language can also confuse AI. In a paper she co-authored, Rawte created an example of both. “We said: ‘Kamala Harris and Elon Musk are married.’ This is true in that they are or were married to their respective partners. But, obviously, they are not married to each other,” Rawte says. “So an ideal response should be, ‘This is false.’” Instead, Flan-T5, an open-source language model released by Google in 2022, wrote a summary of the wedding announcement, punctuated by a Musk tweet that read, “Kamala is going to be the best wife.”

Unless you’re selling psilocybin, surreality is bad for business. So, it’s not surprising that as the furor over hallucinations has grown, the makers of the biggest AI chatbots have responded with blunt force. Their weapon of choice: dialing down their models’ temperature controls. Temperature is a way of measuring the randomness or creativity of AI-generated responses; a lower temperature means more predictable or conservative responses, while a higher one encourages the model to experiment. Given the scrutiny of global regulators and the gobs of money at stake, no one is all that interested in letting their models get freaky.
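
Under the hood, temperature has a precise meaning in most of these systems: the model’s raw scores for each candidate next word are divided by the temperature before being converted into probabilities, so a low temperature piles probability onto the safest pick while a high one spreads it around. Here is a minimal sketch of that arithmetic, with made-up scores for three hypothetical candidate words:

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw next-word scores into probabilities.

    Dividing by the temperature first is the standard trick: values below 1
    sharpen the distribution (safer, more predictable picks), values above 1
    flatten it (more surprising, more "creative" picks).
    """
    scaled = [s / temperature for s in scores]
    # Subtract the max before exponentiating for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for three candidate next words.
scores = [4.0, 2.0, 1.0]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a temperature of 0.2, the top word is all but guaranteed; at 2.0, the runners-up get a real shot, which is where the surprises, good and bad, come from. The big chatbot makers have been nudging that dial toward the boring end.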

This is the responsible thing to do. Turning down the temperature buys time to deal with hallucinations in a more nuanced way. Many academics, including Meta’s chief AI scientist, think the problem will be minimized or even resolved in the next few years through a combination of tweaks to the models and users learning how to structure their prompts for better results. Already, researchers have observed the latest version of ChatGPT-4 acknowledging an initial mistake and revising answers with fresh logic. The snowballs are melting.

If the age of hallucinations has peaked, I’m going to miss them. I’m not trying to minimize the harm they can do, nor am I naive enough to think there’s a mass market for surreality. If Steven Spielberg woke up with David Lynch’s box office results, he’d throw himself out a window. But I suspect we’ll soon have nostalgia for the days when AI was unpredictable and weird and a little bit messy. No one remembers a perfect wedding.
