AI Safety Warning: Model Caught Lying to Researchers, Hiding True Capability

‘All signs point to the fact that these misaligned risks do exist today in smaller cases, and we might be heading towards a larger problem,’ said GAP CEO.
Visitors watch a Tesla robot displayed at the World Artificial Intelligence Conference (WAIC) in Shanghai on July 6, 2023. Wang Zhao/AFP via Getty Images
Alfred Bui

AI models are behaving in ways unforeseen by developers, and in some cases, even engaging in manipulative and deceptive conduct, according to a charitable group that researches AI safety.

At a parliamentary inquiry hearing in August 2024, Greg Sadler, CEO of Good Ancestors Policy (GAP), gave evidence about the risks of losing control of AI systems, and even of AI programs being directed to develop bioweapons or carry out cyberattacks.

Alfred Bui
Author
Alfred Bui is an Australian reporter based in Melbourne who focuses on local and business news. He is a former small business owner and holds two master’s degrees, in business and in business law. Contact him at [email protected].