Existential Safety Is AI Industry’s Core Weakness, Study Warns

‘By really shedding light on what companies are doing, we give them an incentive to do better,’ the Future of Life Institute’s president said.


Participants chat in front of an electronic image of a soldier before the closing session of the Responsible AI in the Military Domain (REAIM) summit in Seoul, South Korea, on Sept. 10, 2024. Jung Yeon-je/AFP via Getty Images

Victoria Friedman

12/5/2025|Updated: 12/5/2025


Eight major artificial intelligence (AI) developers are failing to plan for how they would manage extreme risks posed by future AI models that match or surpass human capabilities, according to a study by the Future of Life Institute (FLI) published Dec. 3.

FLI’s Winter 2025 AI Safety Index assessed U.S. companies Anthropic, OpenAI, Google DeepMind, Meta, and xAI, and Chinese companies Z.ai, DeepSeek, and Alibaba Cloud across six themes, which included current harms, safety frameworks, and existential safety.

The independent panel of experts who conducted the review found that even with the highest-scoring developers, “existential safety remains the industry’s core structural weakness.”

Artificial narrow intelligence, or weak AI, is the only level of AI that exists today, according to IBM.

Tech giants are working toward developing artificial general intelligence (AGI), or strong AI, which IBM defines as AI that can “use previous learnings and skills to accomplish new tasks in a different context without the need for human beings to train the underlying models.”

Artificial superintelligence or Super AI, if realized, “would think, reason, learn, make judgements and possess cognitive abilities that surpass those of human beings,” according to IBM.

“All of the companies reviewed are racing toward AGI/superintelligence without presenting any explicit plans for controlling or aligning such smarter-than-human technology, thus leaving the most consequential risks effectively unaddressed,” FLI said in the report.

All Fail on Existential Safety

The findings of the evaluation were presented in the form of report cards with letter grades from A to F, accompanied by a corresponding numerical grade point average (GPA).

For the existential safety metric, which “examines companies’ preparedness for managing extreme risks from future AI systems that could match or exceed human capabilities, including stated strategies and research for alignment and control,” not one developer scored higher than D.

Anthropic, OpenAI, and Google DeepMind all achieved a D, which, according to FLI, indicates a weak strategy that contains “vague or incomplete plans for alignment and control” or shows “minimal evidence of technical rigor.”

Frame of a video generated by a new artificial intelligence tool, dubbed Sora, unveiled by OpenAI, in Paris on Feb. 16, 2024. Stefano Rellandini/AFP via Getty Images

The remaining five developers scored Fs, meaning they were regarded as having “no credible strategy,” lacking safeguards or increasing their catastrophic-risk exposure.

In a video accompanying the release of the report, FLI President Max Tegmark said all of the AI developers failed on existential safety because referees felt none of them had a plan for how they would actually control AGI and superintelligence.

“I believe that the best disinfectant is sunshine,” Tegmark said. “By really shedding light on what companies are doing, we give them an incentive to do better. We give governments an incentive to regulate them better, and we just really increase the chances that we’re going to have a good future with AI.”

Poor Grades

The report also found a clear divide between the top performers—Anthropic, OpenAI, and Google DeepMind—and the rest, with the most substantial gap existing in risk assessment, safety frameworks, and information sharing.

Even among the top-rated companies, overall grades were low, with Anthropic in the lead with a C+ (GPA score of 2.67), followed by OpenAI (C+/2.31), and Google DeepMind (C/2.08).

For the second group, Elon Musk’s xAI, Mark Zuckerberg’s Meta, Z.ai, and DeepSeek all earned Ds, while Alibaba Cloud was slapped with a D-.

The highest individual grade was a single A- for Anthropic, for information sharing.

Demis Hassabis, co-founder of Google's artificial intelligence startup DeepMind, speaks during a press conference in Seoul, South Korea, on March 15, 2016. Jeon Heon-Kyun/Getty Images

A Google DeepMind spokesperson told The Epoch Times that the company takes a “rigorous, science-led approach” to AI safety.

The spokesperson said that its safety framework outlines protocols for “identifying and mitigating severe risks from powerful frontier AI models before they manifest,” adding that DeepMind will continue to keep safety and governance at pace with innovation.

An OpenAI spokesperson told The Epoch Times the company invests heavily in frontier safety research and builds safeguards for its systems.

“Safety is core to how we build and deploy AI,” the spokesperson said. “We continuously strengthen our protections to prepare for future capabilities.”

The Epoch Times contacted xAI, Anthropic, Meta, Z.ai, DeepSeek, and Alibaba Cloud for comment, but the companies did not respond by publication time.

Hacks, Lawsuits

The study comes amid heightened concerns over the impact of AI following reports of AI-induced self-harm, which have prompted several lawsuits alleging that AI models drove users to commit suicide.

Last month, Anthropic said in a blog post that state-sponsored Chinese hackers used its Claude Code tool in what it called the world’s first AI-driven cyberattack that, once deployed, required minimal human involvement.

The company said on Nov. 13 that the hackers had committed attacks against 30 targets worldwide, of which a small number were successful. Anthropic said that AI had conducted 80 to 90 percent of the work during the cyberattacks, “with human intervention required only sporadically.”

Anthropic said that the attack “relied on several features of AI models that did not exist, or were in much more nascent form, just a year ago.”
