
Right to Warn & Daniel Kokotajlo

Following news about OpenAI’s restrictive non-disparagement agreements, 13 current and former employees of OpenAI and Google DeepMind signed an open letter entitled A Right to Warn about Advanced Artificial Intelligence. The group warned about serious AI risks and their companies’ cultures, and called for stronger whistleblower protections. The coalition emerged after Daniel Kokotajlo shared his departure and growing concerns on LessWrong, where he revealed that he had declined to sign a non-disparagement agreement with OpenAI, a decision that risked the forfeiture of nearly $2 million in vested equity. His public statement sparked the conversations that brought the group together to call on advanced AI companies to commit to principles ensuring employees’ right to warn the public about AI risks.

Company: OpenAI, Google DeepMind

Jurisdiction: US

Year: 2024

Issues: Suppression of Knowledge on AI Risks; Existential Risks and Stronger Whistleblower Protections

Channels: Public and Regulatory

Why This Case Matters

Daniel Kokotajlo is now the Executive Director of the AI Futures Project, having left OpenAI due to concerns about unchecked AI development. Neel Nanda, another signatory of the Right to Warn letter, still works at Google DeepMind and runs the mechanistic interpretability team. Jacob Hilton, one of several former OpenAI signatories, is Executive Director at the Alignment Research Center. William Saunders and Daniel Ziegler, both former OpenAI employees and signatories of the letter, now work at Anthropic.

Following public attention on restrictive non-disparagement agreements in May 2024, OpenAI removed these clauses from future employment contracts and released former employees from existing obligations (except where mutual). Anthropic also removed non-disparagement terms from its standard agreements and stated it would not enforce existing ones against employees raising safety concerns. Earlier, in April 2024, OpenAI published its Raising Concerns Policy, which outlines ways employees may raise issues and prohibits retaliation against those who do so. AIWI has reviewed this policy, details of which can be found here.

In July 2024, OpenAI whistleblowers filed a complaint with the US Securities and Exchange Commission (SEC) alleging the company’s NDAs violated federal whistleblower protections. While it hasn’t been publicly disclosed whether any direct action was taken in relation to this complaint, five US senators signed a letter to Sam Altman in light of the complaint and news about the company’s restrictive NDAs. The case has influenced several pieces of legislation: Senator Chuck Grassley introduced the bipartisan AI Whistleblower Protection Act; California’s SB-53 (building on earlier efforts including SB-1047, which already included whistleblower provisions) and Michigan’s AI Safety and Security Transparency Act now both include whistleblower protection provisions; and SB-53 (the Transparency in Frontier Artificial Intelligence Act ‘TFAIA’) was signed into law by Governor Newsom in September 2025.


Timeline

Daniel Kokotajlo: The Catalyst

From April to June 2024, a sequence of events transformed public discourse on AI insiders’ right to warn the public about risks. At the center of this transformation was Daniel Kokotajlo, whose decision to speak publicly about his concerns at OpenAI helped set in motion the collective movement that resulted in the Right to Warn letter—and ultimately changed the conversation about insiders’ right to speak up in the public interest.

April 2024

Public Statement on LessWrong

Daniel Kokotajlo spent two years at OpenAI working on the company’s governance and safety teams, with a specific focus on forecasting. In April 2024, he announced his resignation on LessWrong, where he shared details about his departure and his growing concerns about the company’s direction.

In his LessWrong bio and in a separate post on the platform, Daniel explained that he had left after growing concerns that safety was being deprioritized in favour of hastening model development towards AGI. Business Insider first reported his departure following his public statement on LessWrong, noting that it occurred amid a broader wave of departures by safety-focused and senior employees from the company.

What Daniel’s April posts did not address was the departure agreement he had refused to sign upon leaving—or the equity he had forfeited by doing so. Those details would emerge the following month.

May 2024

The Cost of Speaking Up Freely

On 11 May, in a comment on LessWrong, Daniel publicly confirmed that he had chosen not to sign OpenAI’s non-disparagement agreement — and that the equity he forfeited represented approximately 85% of his family’s net worth. News outlets subsequently estimated the amount as approximately $1.7 to $2 million in vested equity.

 

Even while going public, Daniel noted he remained bound by confidentiality obligations from when he joined the company:

 

“To clarify: I did sign something when I joined the company, so I’m still not completely free to speak (still under confidentiality obligations). But I didn’t take on any additional obligations when I left.”

 

At the same time, he was clear that his motivations were about retaining the right to speak up:

“…Basically I wanted to retain my ability to criticize the company in the future. I’m not sure what I’d want to say yet though & I’m a bit scared of media attention.”

 

Daniel’s public statement on LessWrong nonetheless quickly gained traction, going viral on X and drawing media attention.

On 17 May, Vox’s Kelsey Piper published an investigation into OpenAI’s exit agreements, explicitly referencing Daniel’s disclosure. The reporting revealed that OpenAI had been pressuring departing employees into signing exit agreements containing extremely broad non-disparagement and non-disclosure provisions, which were framed as legally binding although their validity was open to question. By signing, employees were permanently restricted from mentioning the existence or terms of the NDA, publicly criticizing OpenAI, or doing anything which might cause the company financial or reputational damage [4]. If they did not agree to sign, or violated the contract’s terms, the employees risked losing vested equity worth potentially millions of dollars [5]. This appears to have been a common consequence under OpenAI’s departure agreements since 2017 for employees who did not sign [6]. Because the exit agreements also prohibited employees from acknowledging their existence [7], the scope of these restrictions was not publicly known before Daniel’s disclosure. As Zvi Mowshowitz noted:

“No one knew about this until recently, because until Daniel Kokotajlo everyone signed, and then they could not talk about it. Then Daniel refused to sign, Kelsey Piper started reporting, and a lot came out.”

In a later interview, Piper confirmed that within three hours of the article’s publication, OpenAI contacted Daniel directly. The following day, Sam Altman posted on X stating that OpenAI would not cancel any former employee’s vested equity. OpenAI subsequently sent an internal memo to former employees confirming that it “has not canceled, and will not cancel, any Vested Units” regardless of whether they had signed the agreement [8].

On 24 May, one LessWrong user reflected on the sequence of events:

“So first the 85% net worth thing went quite viral several times and made Daniel Kokotajlo a bit of a heroic figure on Twitter.

Then Kelsey Piper’s reporting pushed OpenAI to give back Daniel’s vested units. I think it’s likely that Kelsey used elements from this discussion as initial hints for her reporting and plausible that the discussion sparked her reporting, I’d love to have her confirmation or denial on that.”

However, Piper has not publicly confirmed this account of the causal chain.

The Start of Coalition Building: From Individual to Movement

In the weeks that followed, a broader effort began to take shape, though little is publicly known about how the coalition formed.

An Associated Press report published on the day of the letter’s release described both Daniel Kokotajlo and fellow former OpenAI engineer Daniel Ziegler as co-organizers. Harvard Law Professor Lawrence Lessig represented the group pro bono.

Daniel Ziegler had worked at OpenAI from 2018 to 2021 and helped develop techniques that would later underpin ChatGPT. As he told the AP, the development of more powerful AI systems was “moving fast and there are a lot of strong incentives to barrel ahead without adequate caution.”

The coalition they helped assemble brought together former OpenAI researchers including Jacob Hilton, William Saunders, and Carroll Wainwright; former Google DeepMind employee Ramana Kumar; current DeepMind researcher Neel Nanda; and six anonymous signatories, four of whom were current OpenAI employees and two of whom were former employees. The result was a letter representing voices from across the frontier AI landscape.

4 June 2024

The Right to Warn Letter

On 4 June 2024, the coalition went public. On the same day, Daniel posted on X that he had decided to leave OpenAI “after losing confidence that the company would behave responsibly in its attempt to build artificial general intelligence”, and the New York Times published a piece entitled OpenAI Insiders Warn of a ‘Reckless’ Race for Dominance.

The Right to Warn letter warned of AI risks ranging from “further entrenchment of existing inequalities, to manipulation and misinformation, to the loss of control of autonomous AI systems potentially resulting in human extinction” [9]. The signatories also argued that they were being prevented from publicly disclosing these risks due to extreme confidentiality agreements with their employers.

The group called upon advanced AI companies to commit to four principles:

  • Stop enforcing non-disparagement agreements that prevent criticism
  • Create anonymous reporting processes
  • Support a culture of open criticism
  • Refrain from retaliating against employees who share risk-related confidential information [10].


Three prominent AI researchers, Geoffrey Hinton, Yoshua Bengio, and Stuart Russell, endorsed the letter. Stuart Russell, Professor of Computer Science at UC Berkeley and Director of the Center for Human-Compatible AI, who is also a campaign member of AIWI’s Publish Your Policies programme (as is Daniel Kokotajlo), told The Drum: “I think [the open letter is] quite brave and effective, particularly as it’s coming from the inside. OpenAI cannot continue telling governments, ‘Trust us, we’re the experts and we know what we’re doing’.” [11]

Company Response

Restrictive NDAs

In May 2024, prior to the Right to Warn letter, OpenAI’s CEO Sam Altman posted an apology on X about the company’s NDAs, promising not to enforce the most restrictive provisions and stating that he had not known about equity clawback provisions being present in the agreements [12]. However, Vox reported evidence to the contrary: that OpenAI’s senior leadership, including Altman, had been aware of, and signed off on, the clawback provisions [13].

 

OpenAI shared an internal memo (first reported by Bloomberg) with past and current employees stating that “Regardless of whether you executed the [nondisparagement] agreement, we write to notify you that OpenAI has not cancelled, and will not cancel, any vested units” [14]. The company also promised to release former employees from existing non-disparagement obligations (unless the non-disparagement provision was mutual) [15].

To date, five OpenAI employees have disclosed that the company has released them from the restrictive agreements [16]. However, according to a LessWrong post co-authored by Adam Scholl (a researcher at Missing Measures), over 500 people may have signed these agreements [17].

 

In the same memo, OpenAI promised to remove non-disparagement clauses from the company’s standard exit agreements [18]. While this represented meaningful progress, as Lawfare observed, these promises “did not change the underlying legal reality that allowed OpenAI to propose the NDAs in the first place, and that would allow any other frontier AI company to propose similarly broad contractual restrictions in the future.” [19]

 

Response to Right to Warn Letter

OpenAI officially responded to the Right to Warn Letter via spokesperson Lindsey Held, who stated:

“We’re proud of our track record providing the most capable and safest A.I. systems and believe in our scientific approach to addressing risk. We agree that rigorous debate is crucial given the significance of this technology, and we’ll continue to engage with governments, civil society and other communities around the world.” [20]

Google didn’t respond to media requests for comment, despite two of the letter’s signatories being currently or formerly employed at Google DeepMind.

Held also referred to a 24/7 Integrity Line allowing employees to anonymously raise their concerns [21]. The hotline had been available since April 2024, about two months before the open letter was issued [22]. She also cited OpenAI’s Safety and Security Committee (set up in late May 2024) as evidence of the company’s commitment to addressing concerns.

Whistleblowers’ Counter Response

The whistleblowers were represented pro bono by Harvard Law Professor Lawrence Lessig, who initially approached Daniel Kokotajlo after the Vox news story about OpenAI’s restrictive NDAs [23]. Lessig argued for establishing “Right to Warn” protections, likening them to fire alarms in schools [24]. He stated that, to protect the former employees, all legal options would be examined, including suing OpenAI if necessary [25].

In response to OpenAI’s official statement (see above), Daniel Ziegler, a former OpenAI employee and signatory of the Right to Warn letter, told CNN that it’s important to remain skeptical of the company’s commitment to transparency:

 

“It’s really hard to tell from the outside how seriously they’re taking their commitments for safety evaluations and figuring out societal harms, especially as there is such strong commercial pressures to move very quickly… It’s really important to have the right culture and processes so that employees can speak out in targeted ways when they have concerns.” [26]

 

His statement was referenced by several other media outlets thereafter.

In July 2024, just under a month after the issuing of the Right to Warn letter, whistleblowers from OpenAI filed a legal complaint about the same concerns with the US Securities and Exchange Commission (SEC). The complaint alleged that OpenAI’s NDAs were a violation of federal whistleblower laws, and thereby illegally prohibited employees from warning regulators about serious AI risks [27].

Outcomes

For The (Publicly Known) Right to Warn Signatories

Daniel Kokotajlo (Former OpenAI Governance Researcher)

William Saunders (Former OpenAI Research Engineer)

  • Left OpenAI in February 2024 after three years on the Superalignment team. He became a public advocate for AI safety by speaking to major media outlets, giving interviews, supporting new regulations, and appearing on podcasts.
  • Public advocacy: Testified before the U.S. Senate in September 2024 that AI companies are racing ahead without knowing how to make advanced AI safe, and that stronger oversight and whistleblower protections are urgently needed [28].

Jacob Hilton (Former OpenAI Researcher)

Neel Nanda (Current Google DeepMind Research Engineer)

  • Current role: Leads the mechanistic interpretability team at Google DeepMind, focused on reverse engineering neural networks to understand what AI models learn internally.
  • Recognition: Named on MIT Technology Review’s Innovators Under 35 list, having by age 26 advanced to lead an AI safety team, published dozens of influential papers, and mentored 50 junior researchers [29].

Daniel Ziegler

  • Current role: Member of Technical Staff at Anthropic, in Alignment Stress-Testing.
  • Career development: After leaving OpenAI in 2021, worked at prominent AI safety organisations, as lead of the adversarial training team at Redwood Research and as a Member of Technical Staff at METR (formerly ARC Evals).

Carroll Wainwright (Former OpenAI Researcher)

  • Background: Co-founder of Metaculus, one of the most prominent forecasting platforms used by researchers, policymakers, and forecasting communities worldwide. Served as CTO from 2014-2021, building a platform that has facilitated thousands of successful predictions on scientific, technological, and geopolitical questions.
  • Former role: Member of Technical Staff at OpenAI (September 2021 – June 2024), working on the alignment team. Co-authored the influential InstructGPT paper on training language models with human feedback—a foundational technique that underpins ChatGPT.
  • Education: PhD in Physics from UC Santa Cruz.
  • Left OpenAI in June 2024, publicly stating his faith in the non-profit structure had “significantly waned” and expressing concerns about the board’s ability to effectively control the for-profit subsidiary and prioritize the mission over profits.
  • Current role: Member of Technical Staff at an undisclosed organization (August 2024–present), according to his LinkedIn profile.

Ramana Kumar (Former Google DeepMind Research Engineer)

  • Former role: Research engineer at Google DeepMind, where he focused on AI safety, formal verification, and interactive theorem proving.
  • Research contributions: Published influential work on AI safety frameworks, including studies on reward tampering, agent incentives, and corrigibility. Collaborated with prominent DeepMind safety researchers including Victoria Krakovna and Tom Everitt on foundational problems in AI alignment.
  • Background: Received Future of Life Institute funding for research on applying formal verification to reflective reasoning in AI systems.
  • Current status: Current position unknown; described as “formerly of Google DeepMind” as of mid-2024. The Verifereum GitHub repository shows that he is actively leading that open-source project, with 842 commits and recent updates. Verifereum focuses on formal verification of Ethereum smart contracts using the HOL4 theorem prover. The project has achieved notable progress, including hosting its first community event (Higher Order Log Cabin) in February 2025.

For the Case

The Right to Warn letter’s publication, combined with revelations about OpenAI’s restrictive non-disparagement agreements, shifted public discourse around the company. Following these disclosures, which occurred soon after the departures of key OpenAI safety leaders including Chief Scientist Ilya Sutskever and Superalignment co-lead Jan Leike, the company began to face serious questions about its true priorities and commitment to safety practices [30].

The threat of equity revocation particularly damaged the company’s reputation, reinforcing perceptions that commercial priorities were eclipsing safety commitments — a view widely discussed across AI safety communities on platforms like LessWrong [31].

Regulatory and Legislative Response

While it hasn’t been publicly disclosed whether any direct action was taken in relation to the SEC complaint, five US senators signed a letter (July 2024) to OpenAI’s Sam Altman in light of the complaint and news in the media about the company’s restrictive NDAs [32]. They requested confirmation that any provisions used to penalize employees who raise concerns be removed from the agreements. They also wanted to know what the company was doing to meet its safety commitments, and how progress on this was internally evaluated.

Industry-Wide Effects

Public discussion followed in July 2024 around the inclusion of similar non-disparagement terms in Anthropic’s standard severance agreements [33]. This led co-founder Sam McCandlish to post on LessWrong that the company had been removing such terms since June 1st, that anyone who had signed such an agreement was free to state this fact, and that the company would not enforce the agreements.

The broader impact on industry norms has since been tracked by external assessors. The FLI AI Safety Index, for example, now regularly evaluates frontier AI companies on whistleblowing policy transparency and the use of non-disparagement agreements.

As of the Index’s most recent edition (January 2026), OpenAI remains the only major AI company to have published its full whistleblowing policy; Anthropic has shared details of its policy and committed to publishing it (we shared our preliminary thoughts on this). However, the FLI AI Safety Index reveals that most major AI companies have not clearly addressed whether their non-disparagement agreements restrict safety-related whistleblowing. Only Anthropic (score: 10) received full marks for having an explicit statement that NDAs and non-disparagement agreements cannot prevent safety-related whistleblowing. OpenAI and Google each scored 7 out of 10 in this category, while Meta, xAI, DeepSeek, Z.ai, and Alibaba (all score: 0) have not publicly disclosed any such protections.

OpenAI Policy Changes

In October 2024, OpenAI published its Raising Concerns Policy on its official website, while noting it “has long had” the policy prior to publication [34]. The policy, which was updated and expanded in January 2026, outlines ways employees may raise issues and prohibits retaliation against those who do so. It also explicitly states that employees have the right to make reports to government agencies such as the SEC. Based on AIWI’s preliminary evaluation, significant issues with the existing policy include: anti-retaliation protections may be voluntary rather than legally binding; no assurance is given that attorney-client privilege won’t shield investigations from discovery; and the structural separation from Legal, HR, and executives is unclear.

Federal Legislative Impact

Chuck Grassley, then Ranking Member of the Senate Budget Committee, also sent a letter to Altman in August 2024 requesting details of potential SEC investigations, any changes to NDAs, and information about the number of employee requests to disclose information to federal authorities.

In May 2025, as Senate Judiciary Committee Chair, he introduced the bipartisan AI Whistleblower Protection Act [35].

The proposed federal legislation is designed to provide a robust legal framework banning retaliation against employees who disclose AI risks:

  • Guaranteed voice: Renders restrictive NDAs unenforceable and protects “good faith” reports of (among others) security vulnerabilities
  • Procedural protection: Strips away forced arbitration (a common tool for keeping corporate misconduct private) and allows whistleblowers to take their claims to open federal court
  • Economic restitution: Provides “make-whole” remedies like double back pay and legal fee coverage, helping to remove the financial burden on those considering speaking out

While the Bill is still moving through Congress, the Act has been co-sponsored by several other senators and endorsed by major organisations including the National Whistleblower Center, Government Accountability Project, Center for AI Policy, and Americans for Responsible Innovation [36]. Companion legislation was introduced in the House of Representatives.

State-Level Impact

The case elevated whistleblower protection and transparency as a core AI policy concern at the state level.

California’s SB-53 (the Transparency in Frontier Artificial Intelligence Act ‘TFAIA’) was signed into law by Governor Gavin Newsom in September 2025. SB-53’s key strengths in terms of whistleblower protections include its reasonable-belief standard, shifted burden of proof, and access to injunctive relief, which together rebalance power between employees and employers and deter retaliation.

The new law builds on policy recommendations from a report written by a working group of leading academics and experts convened by Newsom, which makes reference to the calls to action of the Right to Warn group [37]. In public statements supporting the bill, its sponsor, Senator Scott Wiener, emphasized the need for “commonsense guardrails to understand and reduce [AI] risk”, while advocacy groups backing SB-53 highlighted the importance of transparency and the ability of workers to raise safety concerns without fear of retaliation [38].

While still under consideration, Michigan’s AI Safety and Security Transparency Act (introduced in June 2025) would similarly require major AI developers to implement and publish safety and security protocols and submit to independent audits, and it includes anti-retaliation protections for employees involved in reporting risks.

Rob Eleveld, CEO of Transparency Coalition, a nonprofit focused on AI guardrails, testified in support of the bill. Referring to the importance of whistleblowers’ protections in understanding the latest developments in AI, he stated:

“Waiting for the federal government to do something is just not going to happen…if Michigan passed HB 4668, it puts more pressure, along with Colorado, along with New York, along with California, along with other state…for the federal government to act.” [39].
