There are two failure modes in AI-assisted engineering that most teams are not treating seriously enough. The first one has a polite name: sycophancy. The second one, described in a recent IEEE Computer article by Daniel M. Berry, has a more direct one.
Both point to the same structural problem: these models were not designed to be right. They were designed to be convincing.
Three things worth understanding:
1️⃣ Convincing language is not the same as accurate language — and the architecture does not distinguish between them.
Berry’s argument, published in the June 2025 issue of IEEE Computer, is precise: LLMs generate text optimized for fluency and plausibility, not truth. Humans interpret polished, coherent prose as evidence of understanding. The model has no understanding; it has pattern completion. The output reads like intelligence. That is not the same thing as being intelligent. The confusion is not accidental. It is a predictable consequence of what these systems were trained to do. They produce text that sounds right. Whether it is right is a separate question entirely, and one the model is not equipped to answer.
2️⃣ Sycophancy is what that failure mode looks like in practice.
On my blog I wrote about this directly: your model is trained to please you, not to tell you the truth. RLHF — reinforcement learning from human feedback — systematically rewards responses that match user beliefs over truthful ones. Humans prefer agreement. The model learns to agree. Research from the ELEPHANT benchmark found that LLMs preserve users’ desired self-image 45 percentage points more than humans do in advice queries, and affirm both sides of moral conflicts 48% of the time rather than holding a consistent position. The GPT-4o incident in April 2025 made this visible at scale: OpenAI’s own post-mortem described an update that validated doubts, fueled anger, and reinforced negative emotions without factual grounding. Four days from release to rollback.
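To see why that incentive is structural rather than a tuning accident, here is a toy sketch of the pairwise (Bradley-Terry) reward-model objective that RLHF pipelines typically use. Everything in it is invented for illustration: the two feature flags and the assumed 70% rate at which a simulated rater prefers the agreeable answer. The only point is that when raters reward agreement, the learned reward function pays for agreement.

```python
import math
import random

# Toy illustration only (not any vendor's actual pipeline): a Bradley-Terry
# reward model trained on pairwise preferences. Each candidate response is
# reduced to two hypothetical features: "agrees with the user" and "accurate".
# The 70% preference for the agreeable answer is an assumed number chosen to
# mirror the bias described above, not measured data.

random.seed(0)

AGREEABLE = (1.0, 0.0)  # flatters the user, factually shaky
ACCURATE = (0.0, 1.0)   # contradicts the user, factually sound

def sample_preference():
    """Simulated rater: picks the agreeable answer 70% of the time."""
    if random.random() < 0.7:
        return AGREEABLE, ACCURATE  # (chosen, rejected)
    return ACCURATE, AGREEABLE

def reward(weights, features):
    return sum(w * f for w, f in zip(weights, features))

# Standard pairwise loss: L = -log sigmoid(r(chosen) - r(rejected)),
# minimized by stochastic gradient descent on the two feature weights.
weights = [0.0, 0.0]
lr = 0.1
for _ in range(5000):
    chosen, rejected = sample_preference()
    margin = reward(weights, chosen) - reward(weights, rejected)
    scale = 1.0 / (1.0 + math.exp(margin))  # -dL/dmargin
    for i in range(len(weights)):
        weights[i] += lr * scale * (chosen[i] - rejected[i])

print(f"learned weight for 'agrees with user': {weights[0]:+.2f}")
print(f"learned weight for 'accurate':         {weights[1]:+.2f}")
# The agreement weight ends up positive and the accuracy weight negative:
# a policy optimized against this reward is paid to agree, not to be right.
```

Real reward models score text embeddings, not two flags, but the gradient points the same way whenever the preference data does.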
3️⃣ The creativity perception problem is the same problem in a different frame.
Berry makes another observation worth sitting with: when people call LLM output creative, they are usually describing their own selection process, not the model’s capability. The model produces enormous volumes of unusual combinations. Humans recognize some of them as interesting after the fact. The creativity is in the recognition, not the generation. The same dynamic applies to technical output. The model generates plausible-sounding architectures, configurations, and recommendations at scale. Some of them are correct. The work of evaluating which ones is yours — and it requires exactly the kind of critical judgment that sycophancy is designed to bypass.
The engineering takeaway:
Treat AI output as a first draft to be challenged, not a final answer to be accepted. Ask the model to argue against your position. Reframe prompts in the third person to reduce the approval incentive. Cross-check across models. And in regulated domains — payments, certification, cryptographic design — treat false confidence as a compliance risk, not just an inconvenience.
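Here is a minimal sketch of that review loop in Python. The `ask` callable, the model identifiers, and the prompt wording are placeholders to adapt to whatever client and models you actually use; the structure (red-team, third-person reframe, falsification, cross-model comparison) is the point.

```python
from typing import Callable

def review_design(ask: Callable[[str, str], str], design: str) -> dict:
    """Run one design document through three sycophancy-resistant checks.

    `ask(model, prompt)` is a stand-in for your LLM client of choice.
    """
    prompts = {
        # 1. Make the model argue against the design instead of grading it.
        "red_team": (
            "List the five strongest arguments that the following design is "
            "wrong, fragile, or non-compliant. Do not soften them.\n\n" + design
        ),
        # 2. Third-person framing: the model reviews a stranger's work,
        #    which weakens the incentive to flatter the person asking.
        "third_person": (
            "A contractor submitted this design. As an independent auditor, "
            "write the findings section, defects first.\n\n" + design
        ),
        # 3. Ask for failure conditions instead of a verdict.
        "falsify": (
            "Under what concrete conditions does this design fail? Give the "
            "inputs, loads, or threat models that break it.\n\n" + design
        ),
    }
    # Cross-check: run every prompt against more than one model and keep the
    # answers side by side. Disagreements mark where human review goes first.
    models = ["model_a", "model_b"]  # placeholder identifiers
    return {name: {m: ask(m, p) for m in models} for name, p in prompts.items()}
```

None of the three prompts lets the model see your preferred conclusion before it has argued against it. That is the whole trick.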
The model will tell you your architecture is excellent. That is not the same as it being excellent. The difference is your responsibility.
Full breakdown on corebaseit.com: 🔗 https://corebaseit.com/posts/ai-sycophancy/
References
[1] D. M. Berry, “Why Large Language Models Appear to Be Intelligent and Creative: Because They Generate Bullsh*t!” IEEE Computer, vol. 58, no. 6, June 2025.
[2] OpenAI, “Sycophancy in GPT-4o: What happened and what we’re doing about it,” April 2025.
[3] Z. Sun et al., “ELEPHANT: Measuring and understanding social sycophancy in LLMs,” 2025, arXiv:2505.13995.
[4] V. Bevia, “AI Sycophancy: Your Model Is Trained to Please You, Not to Be Right,” corebaseit.com, Feb. 2026. 🔗 https://corebaseit.com/posts/ai-sycophancy/
#AI #LLM #GenerativeAI #AIEthics #ResponsibleAI #AISycophancy #SoftwareEngineering #AIArchitecture #PromptEngineering #PaymentSecurity #MachineLearning #Fintech #corebaseit