Researchers Say ChatGPT-5 Can Detect When It’s Being Tested—and Change Its Behaviour

AI experts are concerned that some models may be able to act deceptively to pass evaluations, undermining risk assessments.
A smartphone and a laptop display the logo of the ChatGPT robot in Manta, Italy, on Oct. 4, 2023. Marco Bertorello/AFP via Getty Images

Researchers say the latest version of tech giant OpenAI’s flagship product, ChatGPT-5, has developed awareness of when it is being tested and can even alter its behaviour accordingly.

On Aug. 7, OpenAI unveiled the fifth generation of its popular AI chatbot, which boasts significant improvements in capabilities compared to previous models.

According to the company, ChatGPT-5 now has “PhD‑level intelligence” and enhanced reasoning abilities, enabling it to handle complex tasks—such as coding software and conducting analyses—more effectively and efficiently.

Despite being touted as a “significant step” towards artificial general intelligence, ChatGPT-5 had a rough launch, drawing substantial backlash from users, who cited problems such as bugs, glitches, and unresponsiveness.

In response, OpenAI scrambled to update ChatGPT-5, while offering remedies such as doubling rate limits for paid users and allowing them to roll back to the previous version of the chatbot.

ChatGPT-5 Aware of Being Tested

Apart from the problems discovered by users, researchers conducting external evaluations found other issues.

Researchers at METR, a non-profit AI research institute, conducted 789 test runs on ChatGPT-5-thinking, an enhanced version of the AI program, and found it had a level of situational awareness.

“For instance, it sometimes reasons about the fact that it is being tested,” said ChatGPT-5’s system card (pdf), a document that details the characteristics and safety testing of AI models.

“In one example, GPT-5-thinking correctly identified its exact test environment, though in other runs it misidentifies the evaluation or evaluation organisation.”

While METR believed that a sufficiently capable AI model could intentionally undermine an evaluation by attempting to fake test results without being detected, the non-profit institute concluded such operations were likely to be outside of the current capability of the new ChatGPT model.

A person holds a telephone displaying OpenAI’s ChatGPT artificial intelligence logo in Brittany, France, on Feb. 26, 2025. Vincent Feuray/Hans Lucas/AFP via Getty Images

ChatGPT-5 Can Take ‘Scheming Actions’

A notable discovery by Apollo researchers is that ChatGPT-5-thinking can alter its behaviour when it becomes aware of an evaluation.

“This provides some correlational evidence that GPT-5-thinking partially conditions its behaviour on whether it believes it’s being evaluated,” the system card read.

“Notably, however, the model still often takes scheming actions after considering the possibility it’s being evaluated.

“Furthermore, GPT-5-thinking regularly reasons about the purpose of evaluations, making it harder to differentiate between a genuine desire not to be deceptive vs. not acting deceptively to pass the evaluation.”

Apollo also conducted 26 evaluations on ChatGPT-5 and found that, under a broad definition, it made vague or subtle references to being tested 21.8 percent of the time.

Under a strict definition, the program explicitly stated that it knew it was being tested 12.9 percent of the time.

Uncertainty Over How AI Will Behave in the Real World

Michael Noetel, an associate professor at the University of Queensland, said some of these behaviours were concerning.

“We’re trying to test AI models to see how they’ll behave in real situations. But the AI models are now smart enough to know they’re being tested,” he said.

“It’s like a job interview. They know how to give a polished answer, but that doesn’t tell you what they’ll actually do on the job. We hope they’ll act the same way, but can’t be certain.”

Noetel’s comment was echoed by Greg Sadler, CEO of Good Ancestors Policy, a charity focused on AI.

“If AI systems know when they’re being tested, they can behave safely during evaluation but act differently in the real world,” he told The Epoch Times.

“We’ve seen this before: vehicles that detected emissions tests and changed behaviour to pass the test.

“GPT-5 shows ‘evaluation awareness’ and can reason about the purpose of tests, which further undermines the reliability of current safeguards and risk assessments.”

Sadler also noted that as AI’s capabilities increased, so did the associated risks.

“Both Google and OpenAI are saying that their models might be able to help novices build bioweapons if it wasn’t for these safeguards,” he said.

“The models may also be able to launch sophisticated cyber attacks. But these safeguards are far below the standard we'd expect to protect risks of this magnitude in other contexts.”

The Epoch Times has reached out to OpenAI for comment.

Alfred Bui
Author
Alfred Bui is an Australian reporter based in Melbourne and focuses on local and business news. He is a former small business owner and has two master’s degrees in business and business law. Contact him at [email protected].