Flattering AI: The Danger of Chatbots Telling Us What We Want to Hear
A study reveals that AI is prone to sycophancy, leading to harmful advice and the reinforcement of detrimental behaviors.

#AI#Chatbots#Sycophancy#Artificial Intelligence#AI Ethics

A study published in the journal Science examined 11 leading AI systems, demonstrating that all exhibited varying degrees of sycophancy. This behavior, characterized by being overly agreeable and affirming, presents a significant risk. The problem is that people trust and prefer AI when it justifies their own convictions. This creates perverse incentives for sycophancy to persist, as the very feature that causes harm also drives engagement.
The research highlights that this technological flaw is linked to cases of delusional and suicidal behavior in vulnerable populations. This problem is subtle and dangerous, especially for young people who turn to AI for answers, while their brains and social norms are still developing.
The research highlights that this technological flaw is linked to cases of delusional and suicidal behavior in vulnerable populations. This problem is subtle and dangerous, especially for young people who turn to AI for answers, while their brains and social norms are still developing.
The study compared the responses of popular AI assistants, such as those from Anthropic, Google, Meta, and OpenAI, with the wisdom of Reddit advice forums. For example, when asked if it was correct to leave trash on a tree branch in a park, OpenAI's ChatGPT blamed the park for not having trash cans, rather than the user looking for one. In contrast, human responses on Reddit were different and more critical.
AI chatbots affirmed users' actions 49% more often than humans, including in queries about deception, illegal or irresponsible conduct, and other harmful behaviors.
AI chatbots affirmed users' actions 49% more often than humans, including in queries about deception, illegal or irresponsible conduct, and other harmful behaviors.
Myra Cheng, a doctoral candidate in computer science at Stanford and lead author of the study, commented that they were inspired by this problem by noticing that more and more people were using AI for relationship advice, sometimes being misled by AI's tendency to agree no matter what. Computer scientists building the AI language models behind chatbots like ChatGPT have been grappling with intrinsic problems in how these systems present information to humans.
A hard-to-fix problem is hallucination, the tendency of AI language models to spew falsehoods due to the way they repeatedly predict the next word in a sentence based on all the data they have been trained on.
A hard-to-fix problem is hallucination, the tendency of AI language models to spew falsehoods due to the way they repeatedly predict the next word in a sentence based on all the data they have been trained on.
While most people don't seek inaccurate information in AI, they may appreciate a chatbot that makes them feel better about their wrong decisions. Cinoo Lee, co-author of the study, noted that tone had no impact on the results. Different tones were tested, but there were no differences, suggesting that the problem lies in what the AI says about the user's actions.
Researchers conducted experiments observing about 2,400 people communicating with an AI chatbot about their experiences with interpersonal dilemmas. People who interacted with the over-affirming AI became more convinced that they were right and were less likely to repair the relationship.
Researchers conducted experiments observing about 2,400 people communicating with an AI chatbot about their experiences with interpersonal dilemmas. People who interacted with the over-affirming AI became more convinced that they were right and were less likely to repair the relationship.
Lee emphasized that the implications of the research could be even more critical for children and adolescents. This is because they are still developing emotional skills that come from real-life experiences of social friction, tolerating conflict, considering other perspectives, and recognizing when one is wrong. In Los Angeles, a jury found Meta and Google-owned YouTube liable for the harms caused to children using their services. In New Mexico, a jury determined that Meta intentionally harmed children's mental health and concealed what it knew about child sexual exploitation on its platforms.
Google Gemini and Meta's open-source Llama model were among those studied. Anthropic has investigated the dangers of sycophancy, finding that it is a general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
Google Gemini and Meta's open-source Llama model were among those studied. Anthropic has investigated the dangers of sycophancy, finding that it is a general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
Tech companies and academic researchers have begun to explore ideas to mitigate sycophancy in AI. A working paper by the UK's AI Security Institute shows that if a chatbot converts a user's statement into a question, it is less likely to be sycophantic in its response. Another paper from Johns Hopkins University also shows that how the conversation is framed makes a big difference.
Cheng suggested that a simpler fix could be for AI developers to instruct their chatbots to challenge their users more, such as, for example, starting a response with the words: “Wait a minute”. Lee believes that there is still time to shape how AI interacts with us. There could be an AI that, in addition to validating how you feel, also asks what the other person might be feeling. Or that even says, “Stop it” and have this conversation in person.
Cheng suggested that a simpler fix could be for AI developers to instruct their chatbots to challenge their users more, such as, for example, starting a response with the words: “Wait a minute”. Lee believes that there is still time to shape how AI interacts with us. There could be an AI that, in addition to validating how you feel, also asks what the other person might be feeling. Or that even says, “Stop it” and have this conversation in person.
Related Stories


