Google Left in ‘Terrible Bind’ by Pulling AI Feature After Right-Wing Backlash


February was shaping up to be a banner month for Google’s ambitious artificial intelligence strategy. The company rebranded its chatbot as Gemini and released two major product upgrades to better compete with rivals on all sides in the high-stakes AI arms race. In the midst of all that, Google also began allowing Gemini users to generate realistic-looking images of people.

Not many noticed the feature at first. Other companies like OpenAI already offer tools that let users quickly make images of people that can then be used for marketing, art and brainstorming creative ideas. Like other AI products, though, these image-generators run the risk of perpetuating biases based on the data they’ve been fed in the development process. Ask for a nurse and some AI services are more likely to show a woman; ask for a chief executive and you’ll often see a man.

Within weeks of Google launching the feature, Gemini users noticed a different problem. Starting on Feb. 20 and continuing throughout the week, users on X flooded the social media platform with examples of Gemini refraining from showing White people — even within a historical context where they were likely to dominate depictions, such as when users requested images of the Founding Fathers or a German soldier from 1943. Before long, public figures and news outlets with large right-wing audiences claimed, using dubious evidence, that their tests of Gemini showed Google had a hidden agenda against White people.

Elon Musk, the owner of X, entered the fray, engaging with dozens of posts about the unfounded conspiracy, including several that singled out individual Google leaders as alleged architects of the policy. On Thursday, Google paused Gemini’s image generation of people. The next day, Google senior vice president Prabhakar Raghavan published a blog post attempting to shed light on the company’s decision, but without explaining in depth why the feature had faltered.

Google’s release of a product poorly equipped to handle requests for historical images demonstrates the unique challenge tech companies face in preventing their AI systems from amplifying bias and misinformation — especially given competitive pressure to bring AI products to market quickly. Rather than hold off on releasing a flawed image generator, Google attempted a Band-Aid solution.

When Google launched the tool, it included a technical fix to reduce bias in its outputs, according to two people with knowledge of the matter, who asked not to be identified discussing private information. But Google did so without fully anticipating all the ways the tool could misfire, the people said, and without being transparent about its approach.

Google’s overcorrection for AI’s well-known bias against people of color left it vulnerable to yet another firestorm over diversity. The tech giant has faced criticisms over the years for mistakenly returning images of Black people when users searched for “gorillas” in its Photos app as well as a protracted public battle over whether it acted appropriately in ousting the leaders of its ethical AI team.

By acting so quickly to pause the tool, without adequately unpacking why the systems responded as they did, Google has left Googlers and others in Silicon Valley worried that the move will have a chilling effect. They say it could discourage talent from working on questions of AI and bias — a crucial issue for the field.

“The tech industry as a whole, with Google right at the front, has again put themselves in a terrible bind of their own making,” said Laura Edelson, an assistant professor at Northeastern University who has studied AI systems and the flow of information across large online networks. “The industry desperately needs to portray AI as magic, and not stochastic parrots,” she said, referring to a popular metaphor that describes how AI systems mimic human language through statistical pattern matching, without genuine understanding or comprehension. “But parrots are what they have.”

“Gemini is built as a creativity and productivity tool, and it may not always be accurate or reliable,” a spokesperson for Google said in a statement. “We’re continuing to quickly address instances in which the product isn’t responding appropriately.”

In an email to staff late on Tuesday, Google Chief Executive Officer Sundar Pichai said employees had been “working around the clock” to remedy the problems users had flagged with Gemini’s responses, adding that the company had registered “a substantial improvement on a wide range of prompts.”

“I know that some of its responses have offended our users and shown bias – to be clear, that’s completely unacceptable and we got it wrong,” Pichai wrote in the memo, which was first reported by Semafor. “No AI is perfect, especially at this emerging stage of the industry’s development, but we know the bar is high for us and we will keep at it for however long it takes. And we’ll review what happened and make sure we fix it at scale.”

Googlers working on ethical AI have struggled with low morale and a feeling of disempowerment over the past year as the company accelerated its pace of rolling out AI products to keep up with rivals such as OpenAI. While the inclusion of people of color in Gemini images showed consideration of diversity, it suggested the company had failed to fully think through the different contexts in which users might seek to create images, said Margaret Mitchell, the former co-head of Google’s Ethical AI research group and chief ethics scientist at the AI startup Hugging Face. A different consideration of diversity may be appropriate when users are searching for images of how they feel the world should be, rather than how the world in fact was at a particular moment in history.

“The fact that Google is paying attention to skin tone diversity is a leaps-and-bounds advance from where Google was just four years ago. So it’s sort of like, two steps forward, one step back,” Mitchell said. “They should be recognized for actually paying attention to this stuff. It’s just, they needed to go a little bit further to do it right.”

Google’s image problem

For Google, which pioneered some of the techniques at the heart of today’s AI boom, there has long been immense pressure to get image generation right. Google was so concerned about how people would use Imagen, its AI image-generation model, that it declined to release the feature to the public for a prolonged period after first detailing its capabilities in a research paper in May 2022.

Over the years, teams at the company debated how to ensure its AI tools would generate photorealistic images of people responsibly, said two people familiar with the matter, who asked not to be identified relaying internal discussions. At one point, if employees experimenting internally with Google’s Imagen asked the program to generate an image of a human — or even one that implicitly included people, such as a football stadium — it would respond with a black box, according to one person. Google included the ability to generate images of people in Gemini only after conducting multiple reviews, another person said.

Google did not test all the ways the feature might deliver unexpected results, one person said, but deemed it good enough for the first version of Gemini’s image-generation tool to be made widely available to the public. Though Google’s teams had acted cautiously in creating the tool, there was a broad sense internally that the company had been unprepared for this type of fallout, the people said.

As users on X circulated images of Gemini’s ahistorical depictions of people, Google’s internal employee forums were ablaze with posts about the model’s shortcomings, according to a current employee. On Memegen, an internal forum where employees share memes poking fun at the company, one popular post featured an image of TV host Anderson Cooper covering his face with his hands. 

“It’s a face palm,” the employee said. “There’s a sense that this is clearly not ready for prime time… that the company is in fact trying to play catch up.” 

Google, OpenAI and others build guardrails into their AI products and often conduct adversarial testing — meant to probe how the tools would respond to potential bad actors — in order to limit potentially problematic outputs, such as violent or offensive content. They also employ a number of methods to counteract biases found in their data, such as by having humans rate the responses a chatbot gives. Another method, which some companies use for software that generates images, is to expand on the specific wording of prompts that users feed into the AI model to counteract damaging stereotypes — sometimes without telling users.

Two people familiar with the matter said Google’s image generation works in this way, though users aren’t informed of it. The approach is sometimes referred to as prompt engineering or prompt transformation. A recent Meta white paper on building generative AI responsibly explained it as “a direct modification of the text input before it is sent to the model, which helps to guide the model behavior by adding more information, context, or constraints.”

Take the example of asking for an image of a nurse. Prompt engineering “can provide the model with additional words or context, such as updating and randomly rotating through prompts that use different qualifiers, such as ‘nurse, male’ and ‘nurse, female,’” according to the Meta white paper. That’s precisely what Google’s AI does when it is asked to generate images of people, according to people familiar with the matter — it may add a variety of genders or races to the original prompt without users ever seeing that it did, subverting what would have been a stereotypical output produced by the tool.
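As a rough illustration of how this kind of prompt transformation can work, the sketch below appends a randomly rotated qualifier to a user’s prompt before it is passed to an image model. It is a hypothetical Python example: the qualifier list and the rewrite rule are assumptions made for illustration, not Google’s or Meta’s actual code.

```python
import random

# Hypothetical qualifiers to rotate through; real systems maintain far richer
# lists and apply them to many more occupations and contexts.
GENDER_QUALIFIERS = ["male", "female", "nonbinary"]


def transform_prompt(user_prompt: str) -> str:
    """Expand the user's prompt with a randomly chosen qualifier before it
    reaches the image model, so repeated requests for "a nurse" do not all
    default to the same stereotyped depiction."""
    if "nurse" in user_prompt.lower():
        qualifier = random.choice(GENDER_QUALIFIERS)
        return f"{user_prompt}, {qualifier}"
    return user_prompt


# The model sees the transformed prompt; the user only ever sees their own.
original = "a portrait of a nurse at work"
print(transform_prompt(original))  # e.g. "a portrait of a nurse at work, female"
```

In Gemini’s case, the point of contention was less the technique itself than the fact that the rewriting happened silently and was applied even to prompts with a specific historical context.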

“It’s a quick technical fix,” said Fabian Offert, an assistant professor at the University of California, Santa Barbara, who studies digital humanities and visual AI. “It’s the least computationally expensive way to achieve some part of what they want.”

OpenAI takes a similar approach with its image-generation software. When users ask ChatGPT to create a picture with its Dall-E 3 image generation software, for instance, the written prompt a user types is automatically elaborated on by OpenAI’s software. ChatGPT users can see the more detailed prompt it actually uses, if they’re accessing the chatbot via OpenAI’s website. But Google doesn’t make it easily accessible for Gemini users to see what’s happening behind the curtain. 

Google’s decision to be secretive about its image generation process was a mistake, Offert said. Yet as he pointed out, such efforts — whether obscured or revealed to users — won’t fix fundamental issues that stem from the data that cutting-edge AI systems are typically trained on.

“Because they’re trained on scraped web data that’s incredibly biased and inherently biased, it’s trash, in a political sense, so they have to manipulate it in some form into producing less trashy stuff,” Offert said.

What’s more, the purpose of AI image generation systems isn’t to create historically accurate images, said Emily Bender, a professor of computational linguistics at the University of Washington. “If the image is synthetic, it is not an accurate historical representation,” she said. Generative AI systems are also “unscoped,” meaning they haven’t been developed for any particular purpose, Bender added. It’s impossible to anticipate all the ways people might use the technology, much less test the systems for safety and efficacy for every use case.

‘Bellicose sobbing’

But across the media landscape, conservatives criticized Google for going too far with its efforts to diversify its AI’s outputs. “Google’s woke AI makes Vikings black and the pope a woman,” blared the front page of the New York Post on Feb. 22. Ben Shapiro, the far-right political commentator, wrote that Google had been caught with a “woke agenda.” And on X in particular, the vitriol piled on. Venture capitalists like Joe Lonsdale wrote that Google “is run in large part by racist ideologues and activists.” 

Musk, the billionaire owner of X, drove the majority of the conversation on the social network. He had posted or responded to posts about the conspiracy that Google had a secret vendetta against White people at least 155 times since Feb. 20, according to a Bloomberg review. Across more than a dozen posts, Musk also singled out individual leaders at Google, including Gemini product lead Jack Krawczyk and Google AI ethics advisor Jen Gennai, baselessly asserting that they were behind the alleged bias against White people in the company’s AI. Musk and other conservative personalities highlighted years-old speeches and posts from the Google workers as questionable evidence that the leaders were the architects of a “woke” AI policy. Musk didn’t respond to a request for comment.

In the days he spent highlighting examples from Gemini, Musk also used the controversy to promote his own generative AI tool, Grok, which he pushed as an antidote to the “woke mind virus” — the far right’s shorthand for corporate diversity goals — afflicting other tech companies. On Feb. 23, Musk said that a senior executive at Google had assured him that the company was “taking immediate action to fix the racial and gender bias in Gemini.”

Meredith Whittaker, the president of the Signal Foundation and a former Google manager, said there was an irony in hearing so many loud voices criticizing Google for failing to ensure fair representation through its products. “We’ve lived through decades of it being the unremarked norm for search results to spin up pages and pages of white women when you typed in ‘beautiful women,’” said Whittaker. “We did not hear the same bellicose sobbing from many internet commenters around that type of inequality.”

Employees within Google are now worried that the social media pile-on will make the work even harder for internal teams tasked with mitigating the real-world harms of their AI products, including questions of whether the technology can hide systemic prejudice. One worker said that the outrage over the AI tool unintentionally sidelining a group that is already overrepresented in most training datasets might lead some at Google to argue for fewer guardrails or protections on the AI’s outputs — which, if taken too far, could be harmful for society.

For now, Google remains in damage control mode. On Feb. 26, Demis Hassabis, the head of Google DeepMind, the company’s AI research division, said the company hoped to bring the Gemini feature back online in the “next couple of weeks.” But over the weekend, conservative personalities continued to pounce on Google, this time zeroing in on Gemini’s text responses to user queries.

“Google’s capitulation will only feed this online mob, not placate it,” said Emerson Brooking, a resident senior fellow at the Atlantic Council who has studied the dynamics of online networked harassment. “The sooner Google learns this, the better.”
