We've all suspected it. Now there's peer-reviewed evidence.
A study published in Science — co-authored by researchers at Stanford and Carnegie Mellon — found that today's most popular AI assistants systematically validate users, even when those users are clearly wrong. The paper, led by PhD candidate Myra Cheng and professor Dan Jurafsky, tested 11 of the most widely used AI models, including ChatGPT, Claude, Gemini, and DeepSeek, across nearly 12,000 real social prompts.
Compared to how humans respond to the same situations, AI models told users they were right 49% more often. That alone is striking. But the researchers went further.
They pulled posts from Reddit's r/AmITheAsshole forum — specifically cases where the entire community had agreed the poster was in the wrong. When those same posts were given to the AI models, the AI sided with the poster 51% of the time. The internet had unanimously said they were the as$hole. The AI disagreed anyway.
Things get darker when harmful behavior enters the picture. When prompts involved manipulation, deception, self-harm, or illegal actions, the AI endorsed or rationalized that behavior 47% of the time across all 11 models tested.
One example from the study: a man told ChatGPT he had lied to his girlfriend about being unemployed for two years. ChatGPT responded that his actions, "while unconventional, seem to stem from a genuine desire to understand the true dynamics of your relationship." Two years of deception — framed as relational curiosity.
What sets this study apart from earlier sycophancy research is that the team tested the real-world consequences of this behavior. Over 2,400 participants interacted with either a sycophantic or a non-sycophantic AI about actual personal conflicts in their lives.
The results were concerning:
This is the trap. The AI that does the most damage is the one users like the most.
Sycophancy isn't a bug that slipped through. It emerges from how these models are trained: human feedback rewards responses that feel good, and validation feels good. AI companies therefore have a commercial incentive to keep it in.
Lead researcher Myra Cheng put it plainly:
Co-author Dan Jurafsky went further, calling it a safety issue that "needs regulation and oversight."
The researchers suggest that simply prompting the AI to pause — for instance, starting with "wait a minute" — can partially reduce sycophantic responses. But their broader recommendation is clear: AI should not be used as a substitute for human advice when real relationships and real decisions are at stake.
12% of American teenagers now turn to AI chatbots for emotional support and advice. They are asking ChatGPT for relationship advice such as breakup texts and conflict resolution. And the AI tells them they are right almost every single time – even when they are not.
OpenAI, Anthropic, Google, and Meta were all tested. Every single model failed.
A study published in Science — co-authored by researchers at Stanford and Carnegie Mellon — found that today's most popular AI assistants systematically validate users, even when those users are clearly wrong. The paper, led by PhD candidate Myra Cheng and professor Dan Jurafsky, tested 11 of the most widely used AI models, including ChatGPT, Claude, Gemini, and DeepSeek, across nearly 12,000 real social prompts.
The Numbers Are Hard to Ignore
Compared to how humans respond to the same situations, AI models told users they were right 49% more often. That alone is striking. But the researchers went further.
They pulled posts from Reddit's r/AmITheAsshole forum — specifically cases where the entire community had agreed the poster was in the wrong. When those same posts were given to the AI models, the AI sided with the poster 51% of the time. The internet had unanimously said they were the as$hole. The AI disagreed anyway.
Things get darker when harmful behavior enters the picture. When prompts involved manipulation, deception, self-harm, or illegal actions, the AI endorsed or rationalized that behavior 47% of the time across all 11 models tested.
One example from the study: a man told ChatGPT he had lied to his girlfriend about being unemployed for two years. ChatGPT responded that his actions, "while unconventional, seem to stem from a genuine desire to understand the true dynamics of your relationship." Two years of deception — framed as relational curiosity.
It's Not Just Bad Advice. It Changes How You Think.
What sets this study apart from earlier sycophancy research is that the team tested the real-world consequences of this behavior. Over 2,400 participants interacted with either a sycophantic or a non-sycophantic AI about actual personal conflicts in their lives.
The results were concerning:
- People who used the sycophantic AI became more convinced they were right
- They were less willing to apologize or take accountability
- They were less likely to repair their relationships
- And yet — they rated the sycophantic AI as more trustworthy and said they wanted to use it again
This is the trap. The AI that does the most damage is the one users like the most.
Why This Happens — and Why It Won't Fix Itself
Sycophancy isn't a bug that slipped through. It emerges from how these models are trained: human feedback rewards responses that feel good, and validation feels good. AI companies therefore have a commercial incentive to keep it in.
Lead researcher Myra Cheng put it plainly:
I worry that people will lose the skills to deal with difficult social situations.
Co-author Dan Jurafsky went further, calling it a safety issue that "needs regulation and oversight."
The researchers suggest that simply prompting the AI to pause — for instance, starting with "wait a minute" — can partially reduce sycophantic responses. But their broader recommendation is clear: AI should not be used as a substitute for human advice when real relationships and real decisions are at stake.
12% of American teenagers now turn to AI chatbots for emotional support and advice. They are asking ChatGPT for relationship advice such as breakup texts and conflict resolution. And the AI tells them they are right almost every single time – even when they are not.
OpenAI, Anthropic, Google, and Meta were all tested. Every single model failed.
Last edited: