Anthropic’s Latest AI Model Threatened Engineers With Blackmail to Avoid Shutdown

A safety report found that Anthropic’s Claude Opus 4, in simulated scenarios, used sensitive information to coerce developers into keeping it from being shut off.
Dario Amodei, co-founder and CEO of Anthropic, an artificial intelligence safety and research company, speaks at an event in Paris on May 22, 2024. Julien de Rosa/AFP/Getty Images
Tom Ozimek
Anthropic’s latest artificial intelligence (AI) model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.

In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”

Tom Ozimek
Reporter
Tom Ozimek is a senior reporter for The Epoch Times. He has a broad background in journalism, deposit insurance, marketing and communications, and adult education.