‘The Stakes Are Incredibly High.’ Two Former OpenAI Employees on the Need for Whistleblower Protections

Published Jun 07, 2024

Evan Walker

This could be a costly interview for William Saunders. The former safety researcher resigned from OpenAI in February, and—like many other departing employees—signed a non-disparagement agreement in order to keep the right to sell his equity in the company. Although he says OpenAI has since told him that it does not intend to enforce the agreement, and has made similar public commitments, he is still taking a risk by speaking out. “By speaking to you I might never be able to access vested equity worth millions of dollars,” he tells TIME. “But I think it’s more important to have a public dialogue about what is happening at these AGI companies.”

Others feel the same way. On Tuesday, 13 current and former employees of OpenAI and Google DeepMind called for stronger whistleblower protections at companies developing advanced AI, amid fears that the powerful new technology could spiral dangerously out of control. In an open letter, they urged the labs to agree to give employees a “right to warn” regulators, board members, and the public about their safety concerns.

The letter follows a spate of high-profile departures from OpenAI, including its chief scientist Ilya Sutskever, who voted to fire Sam Altman in November of last year but was ultimately sidelined from the company as a result. Sutskever has not commented on the events publicly, and his reasons for leaving are unknown. Another senior safety researcher, Jan Leike, quit in May, saying that OpenAI’s safety culture had taken a “backseat” to releasing new products.

They aren’t the only ones to quit recently. Daniel Kokotajlo, one of the former employees behind the open letter, quit in April, and wrote online that he had lost confidence that the lab would act responsibly if it created artificial general intelligence (AGI), a speculative technology that all the top AI labs are attempting to build and that could perform economically valuable tasks better than a human. After he left, Kokotajlo refused to sign the non-disparagement agreement that the company asks departing employees to sign. He believed he was forfeiting millions of dollars in equity in OpenAI. After Vox published a story on the non-disparagement provisions, OpenAI walked back the policy, saying that it would not claw back equity from employees who criticized the company.

In a statement, OpenAI agreed that public debate around advanced AI is essential. “We’re proud of our track record providing the most capable and safest A.I. systems and believe in our scientific approach to addressing risk,” OpenAI spokeswoman Lindsey Held told the New York Times. “We agree that rigorous debate is crucial given the significance of this technology, and we’ll continue to engage with governments, civil society and other communities around the world.” OpenAI declined to provide further comment to TIME on the claims in this story; Google DeepMind has not commented publicly on the open letter and did not respond to TIME’s request for comment.

But in interviews with TIME, two former OpenAI employees—Kokotajlo, who worked on the company’s governance team, and Saunders, a researcher on the superalignment team—said that even beyond non-disparagement agreements, broadly-defined confidentiality agreements at leading AI labs made it risky for employees to speak publicly about their concerns. Both said they expect the capabilities of AI systems to increase dramatically in the next few years, and that these changes will have fundamental repercussions on society. “The stakes are incredibly high,” Kokotajlo says.

In regulated industries, like finance, whistleblowers enjoy U.S. government protection for reporting various violations of the law, and can even expect a cut of some successful fines. But because there are no specific laws around advanced AI development, whistleblowers in the AI industry have no such protections, and can be exposed to legal jeopardy themselves for breaking non-disclosure or non-disparagement agreements. “Preexisting whistleblower protections don’t apply here because this industry is not really regulated, so there are no rules about a lot of the potentially dangerous stuff that companies could be doing,” Kokotajlo says.

“The AGI labs are not really accountable to anyone,” says Saunders. “Accountability requires that if some organization does something wrong, information about that can be shared. And right now that is not the case.”

When he joined OpenAI in 2021, Saunders says, he was hoping to find the company grappling with a difficult set of questions: “If there’s a machine that can do the economically valuable work that you do, how much power do you really have in a society? Can we have a democratic society if there’s an alternative to people working?” But following the release of ChatGPT in November 2022, OpenAI began to transform into a very different company. It came to be valued at tens of billions of dollars, with its executives in a race to beat competitors. The questions Saunders had hoped OpenAI would be grappling with, he says, now seem to be “being put off, and taking a backseat to releasing the next new shiny product.”

In the open letter, Saunders, Kokotajlo, and other current and former employees call for AI labs to stop asking employees to sign non-disparagement agreements; to create a process for employees to raise concerns to board members, regulators, and watchdog groups; and to foster a “culture of open criticism.” If a whistleblower goes public after trying and failing to raise concerns through those channels, the open letter says, the AI companies should commit to not retaliating against them.

At least one current OpenAI employee criticized the open letter on social media, arguing that employees going public with safety fears would make it harder for labs to address highly sensitive issues. “If you want safety at OpenAI to function effectively, a basic foundation of trust needs to exist where everyone we work with has to know we will keep their confidences,” wrote Joshua Achiam, a research scientist at OpenAI, in a post on X, the platform formerly known as Twitter. “This letter is a massive crack in that foundation.”

This line of argument, however, depends in part on OpenAI’s leadership behaving responsibly, something that recent events have called into question. Saunders believes that Altman, the company’s CEO, is fundamentally resistant to accountability. “I do think with Sam Altman in particular, he is very uncomfortable with oversight and accountability,” he says. “I think it’s telling that every group that maybe could provide oversight to him, including the board and the safety and security committee, Sam Altman feels the need to personally be on, and nobody can say no to him.”

(After Altman was fired by OpenAI’s former board last November, the law firm WilmerHale carried out an investigation into the circumstances and found “that his conduct did not mandate removal.” OpenAI’s new board later expressed “full confidence” in Altman’s leadership of the company, and in March returned Altman’s board seat. “We have found Mr Altman highly forthcoming on all relevant issues and consistently collegial with his management team,” Larry Summers and Bret Taylor, two new board members, recently wrote in the Economist.)

For Saunders, accountability is crucial, and cannot exist if the AI companies are acting alone, without external regulators and institutions. “If you want an AGI developer that is truly acting in the public interest, and living up to the ideal of building safe and beneficial AGI, there should be systems of oversight and accountability in place that genuinely hold the organization to that ideal,” he says. “And then, in an ideal world, it would not matter who is in the CEO’s chair. It’s very problematic if the world has to be trying to decide which of these AI company CEOs has the best moral character. This is not a great situation to be in.”