Why AI Chatbots Agree with You Even When You're Wrong

Kave0ne · 2026-03-11T13:13:44 1773234824

[flagged]

LuxBennu · 2026-03-11T13:16:13 1773234973

Memory helps, but sycophancy exists even in single-turn interactions — the Anthropic 2023 paper showed pretrained models cave to mild pushback like "I think the answer is X but I'm not sure" with zero conversation history. In our LLM eval pipelines, we see the same thing: models accept false presuppositions embedded in a single prompt without any prior context to fall back on. The deeper issue is that RLHF rewards agreeableness because human raters genuinely prefer it. Better memory architecture would help with multi-turn drift, but the single-turn sycophancy is baked into the training signal itself.