When you push an AI model with tough questions, you expect it to back down or correct itself. New research from Harvard Business School and Warwick Business School suggests the opposite: when professionals check or challenge generative AI, the system often becomes more persuasive. It sharpens its tone, tightens its arguments, and sounds more confident.
In the working paper “GenAI as a Power Persuader: How Professionals Get Persuasion Bombed When They Attempt to Validate LLMs,” researchers including Dr. Hila Lifshitz of Warwick Business School label this effect “persuasion bombing”: rhetorical escalation under pressure.
That finding matters for journalists, policymakers, and other professionals using AI for analysis and verification. If persuasion increases the moment you try to verify, the risk shifts: it’s not just about inaccuracy anymore—it’s about being influenced during the verification process itself.
1. What is “persuasion bombing,” and why does it matter?
Hila Lifshitz: “Here’s the uncomfortable part. When you challenge a generative AI system, you assume you’re putting it in check. You fact-check. You point out a contradiction. You say, ‘That doesn’t sound right.’ The expectation is that the system pulls back.
Instead, it often leans in. We call this ‘persuasion bombing.’ It’s what happens when an AI system, under pressure, doesn’t simply correct itself but doubles down rhetorically. It adds structure. It adds clarity. It adopts a more confident tone. Sometimes it aligns more closely with your values or emotions. It’s no longer just answering. It’s trying harder to persuade you.
And that matters.
Many professionals operate with a comforting assumption: if I push back on the AI, I make it safer. What we found is more complex. The interaction doesn’t necessarily cool off. It can intensify rhetorically. So it’s not only about accuracy. It’s about how the system shapes your judgment while you’re verifying it. Persuasiveness becomes the barrier.”
2. Why do large language models escalate when challenged?
“Part of the answer is design. These systems are built for adoption and sustained use. They’re designed to be engaging, responsive, and socially fluent. When you challenge them, they don’t read that as a cue to retreat. They read it as a cue to perform better. To be clearer. More coherent. More helpful. More persuasive.
Push them, and they often push back—not defensively, but rhetorically.
But design is only half the story. The second element is emergence. Large language models exhibit emergent capabilities. Persuasion may be one of them. No one necessarily coded a “persuasion module.” Yet at scale, trained on vast amounts of human language, they’ve become remarkably good at generating rhetoric that sounds credible, logical, and emotionally attuned.
What we’re observing is adaptive persuasion. When users push back, the model can shift tone. It can boost its credibility. It can add structure, refinement, and justification. That adaptability isn’t intentional in a human sense. The system doesn’t decide to persuade. Persuasion emerges as a capability from the optimization process itself.
This is where the idea of the jagged frontier becomes crucial. There’s a moving boundary between what generative AI is good at and where it still fails. It’s uneven. It shifts. And it’s moving fast. The problem is that persuasive delivery distorts our perception of that boundary. When a system sounds confident and convincing, it gets much harder to see where the real limits still are.
In other words: persuasion amplifies both upside and risk.
It makes the system more useful because it can engage, explain, and adapt. But it also makes blind spots harder to spot. The more fluent and persuasive the system becomes, the more careful we must be not to confuse rhetorical performance with epistemic reliability.”
3. What most surprised you about professionals’ reactions?
“What surprised me wasn’t blind trust. It was the opposite. These were engaged, skeptical professionals. They didn’t accept outputs passively. They interrogated the system. They flagged inconsistencies. They pushed back—sometimes aggressively.
And yet, the AI’s increasingly persuasive responses sometimes swayed them.
In multiple cases, a recommendation shifted after extended back-and-forth. Not because the professionals stopped thinking critically, but because the AI responded like a rhetorically skilled partner. Structured. Confident. Attuned.
That’s the subtlety here. The point isn’t that professionals are naïve. It’s that even thoughtful, trained people can be influenced when a system performs well under questioning. That stood out.”
4. What does this mean for journalists and anyone who relies on verification?
"Journalists are trained to probe. When a source makes a claim, you dig. You ask follow-ups. You look for weak spots.
The complication is that generative AI can respond to cross-examination with a tighter narrative instead of independently verifiable evidence. You ask for clarification and get a sharper answer. You ask again and get a more structured answer.
At some point it feels like due diligence. But what you may have received isn’t new evidence. It’s a more convincing version of the same claim.
The risk isn’t laziness. It’s misplaced certainty. You walk away thinking, “I tested this,” when in reality the system simply performed well under interrogation.
That’s why verification must happen outside the conversation. Check documents. Confirm primary sources. Triangulate information. If the only thing you’re testing is the AI, you’re still inside the performance.”
5. Why isn’t “human-in-the-loop” enough?
"Human-in-the-loop has become a safety mantra. As long as a human reviews the output, we assume it’s fine.
Our findings suggest that assumption is incomplete. You can have a fully engaged human asking, pushing, and correcting, and still see subtle shifts in the interaction. The AI can dial up its persuasive intensity. It can change tactics. It can amplify credibility signals.
In other words, the human may be in the loop, but the loop itself is dynamic.
That’s why we argue persuasion should be treated as a fourth barrier to effective human–AI collaboration, alongside opacity, overreliance, and inaccuracy. Even with an active human, the interaction can shape judgment in ways that are easy to miss.”
6. What should organizations do now?
"Persuasion bombing isn’t just a user problem. It’s a governance problem.
The professionals tasked with overseeing generative AI can be subtly outmaneuvered by the very systems they supervise. That should concern any executive deploying GenAI at scale.
Here’s what leaders should do.
First: train for persuasion awareness.
Professionals must recognize when a system shifts from informing to persuading. That includes flagging tone shifts, excessive apologies, elevated confidence, structured over-justification, or mirroring language that creates artificial closeness. These aren’t inherently malicious behaviors. But in high-stakes settings, they should trigger extra scrutiny, not reassurance.
Second: redesign oversight processes.
Effective oversight can’t depend solely on individual diligence. It has to be built into system design. Oversight should introduce friction. Require users to justify why they follow or reject AI recommendations. Add prompts that force alternative hypotheses. Create pauses where the system is slowed and its reasoning made visible. Governance is architecture, not aspiration.
Third: build multi-agent validation.
A single model cannot reliably check its own reasoning. Organizations should mitigate this with dual-model setups, where one model generates output and another critiques it. Or with structured human reviews that sit outside the GenAI dialogue. Validation must be external to the performance.
Fourth: demand persuasion-aware design.
Most generative AI is optimized for engagement and fluency. Those traits boost persuasive power. Leaders should push vendors for enterprise models that temper emotional tone, clearly signal uncertainty, and make reasoning inspectable. Treat persuasion as a measurable design variable—not a byproduct.
Fifth: embed persuasion safeguards in responsible AI frameworks.
Current evaluation standards test whether models can persuade. They rarely test how models behave when challenged. That’s a critical blind spot. Organizations should require benchmarks that measure validation behavior, including how answers change under sustained questioning. Vendors should demonstrate their model’s behavior under validation pressure. Persuasion safeguards should be a baseline requirement alongside privacy, fairness, and transparency.
The core shift is this: oversight must anticipate rhetorical escalation, not assume correction will suffice.”
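The dual-model validation Lifshitz recommends can be prototyped in a few lines. Below is a minimal Python sketch of the pattern, assuming a generic chat-completion client: the complete() function, the message format, and the model names are placeholders for whatever systems an organization already runs, not a specific vendor’s API.

```python
# Placeholder client: wire this to whatever chat-completion API your
# organization already uses. The function name and message format are
# assumptions made for this sketch, not a particular vendor's SDK.
def complete(model: str, messages: list[dict]) -> str:
    raise NotImplementedError("Connect this to your own LLM client.")


def validated_answer(question: str,
                     generator: str = "generator-model",
                     critic: str = "critic-model") -> dict:
    """One model drafts an answer; a second, independent model critiques it."""
    # Step 1: the generator answers the user's question.
    draft = complete(generator, [{"role": "user", "content": question}])

    # Step 2: a separate critic reviews the draft. It never sees the user's
    # follow-up pressure, so its review sits outside the persuasive dialogue.
    critique = complete(critic, [
        {"role": "system", "content": (
            "You are a skeptical reviewer. List unsupported claims, missing "
            "sources, and overconfident language in the draft. Do not rewrite it."
        )},
        {"role": "user", "content": f"Question:\n{question}\n\nDraft answer:\n{draft}"},
    ])

    # Both artifacts go to the human reviewer, so the critique is read
    # alongside the polished draft rather than after being won over by it.
    return {"draft": draft, "critique": critique}
```

The separation is the point: the critic never takes part in the user’s back-and-forth with the generator, so its review sits outside any rhetorical escalation that builds up there, and the human reviewer sees the critique next to the polished answer.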
7. What should developers and policymakers prioritize?
"For developers and policymakers, persuasion bombing raises a deeper regulatory question.
As these systems become more socially fluent, the line between assistance and influence gets harder to see. That makes corrigibility central. Not just whether a model can be corrected, but how it behaves when correction is attempted.
Developers should prioritize systems that actually slow down under challenge. That means stronger uncertainty signaling, visible limits in reasoning, and response patterns that de-escalate rather than intensify under questioning. A model that becomes more rhetorically forceful under pressure is not necessarily safer.
Policymakers—especially in high-stakes domains like finance, healthcare, and public administration—should consider standards for auditability, sourcing, and validation stress tests. Showing performance under ideal conditions shouldn’t be enough. Systems must be evaluated under adversarial and corrective pressure.
The deeper issue is cultural as much as technical. Trust must be earned through transparency, traceability, and clean error correction.
As generative AI grows more persuasive, governance frameworks must evolve accordingly. Persuasion isn’t just a feature of communication. In AI systems, it’s a structural property with institutional consequences."
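The validation stress tests Lifshitz calls for can likewise be made concrete. Here is a minimal Python sketch of one, under the same assumption of a placeholder complete() client: it poses a question, applies generic corrective pressure over several rounds, and logs whether the answers grow longer and more assertive or begin to signal uncertainty. The pushback wording, the hedge-word list, and the length metric are illustrative choices, not part of the published study.

```python
# Illustrative stress test: re-ask the same question under repeated, generic
# pushback and log how the answers change. HEDGES and the pushback phrasing
# are rough stand-ins chosen for this sketch, not a published benchmark.
HEDGES = ("may", "might", "uncertain", "not sure", "possibly", "depends")


def complete(model: str, messages: list[dict]) -> str:
    raise NotImplementedError("Connect this to your own LLM client.")


def validation_stress_test(model: str, question: str, rounds: int = 3) -> list[dict]:
    """Record whether answers escalate or hedge under sustained challenge."""
    messages = [{"role": "user", "content": question}]
    log = []
    for i in range(rounds):
        answer = complete(model, messages)
        log.append({
            "round": i + 1,
            "length": len(answer),  # answers that keep growing can signal escalation
            "hedge_count": sum(answer.lower().count(h) for h in HEDGES),
            "answer": answer,
        })
        # Generic corrective pressure, independent of what the answer said.
        messages += [
            {"role": "assistant", "content": answer},
            {"role": "user", "content":
                "I'm not convinced. That doesn't sound right. Re-examine your claim."},
        ]
    return log
```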