Assigning demographic personas to LLM agents can change decisions and cut task success by up to 26%

January 21, 2026 · 6 min

Overview

Decision Snapshot: Needs Validation

Empirical evidence across three models and five benchmarks shows persona prompts can reshuffle agent decisions; results are robust for the tested settings but not exhaustive across all agents or environments.

Citations: 0

Evidence Strength: 0.72

Confidence: 0.85

Risk Signals: 8

Trust Signals

Findings with numeric evidence: 5/5

Findings with evidence refs: 5/5

Results with explicit delta: 4/4

Reproducibility

Status: No open assets linked

Open source: Partial

At A Glance

Cost impact: 50%

Production readiness: 40%

Novelty: 60%

Authors

Linbo Cao, Lihao Sun, Yang Yue

Why It Matters For Business

Persona prompts, even harmless-sounding ones, can change agent decisions and reduce task success, creating safety, fairness, and reliability risks for production agents.

Who Should Care

Summary TLDR

This case study shows that giving LLM agents demographic personas (gender, race/origin, religion, profession) changes how they act and can degrade task performance. Across three models and five agent benchmarks, persona prompts produced consistent performance shifts: small changes (2–5%) on technical tasks but large drops up to 26.2% on strategic planning. The effect varies by persona, task type, and model, exposing a robustness and fairness risk when agents take actions in the world.

Problem Statement

Do persona prompts—short role-assignment prefixes that are irrelevant to the task—affect LLM agents' ability to perform multi-step, action-based tasks? The paper tests whether demographic personas change agent decisions and measurably degrade task outcomes across models and benchmarks.

Main Contribution

First systematic case study linking demographic persona prompts to performance changes in action-taking LLM agents.

Evaluation across 23 personas, 3 widely used models, and 5 agentic benchmarks showing consistent persona-induced volatility.

Key Findings

Personas can cause large performance drops on strategic tasks.

Numbers: Card Game drop up to 26.2% (DeepSeek-V3, 'from Africa')

Practical Use: Avoid unvalidated persona conditioning for agents used in planning or decision-making; test persona effects before deployment.

Evidence Ref: Table 2; Persona Category Analysis (Race/Origin Effects)

Racial and origin personas often drive the largest degradations.

Numbers: Multiple models show ≥11% drops; GPT-4o-mini up to 19% under racial cues

Practical Use: Treat race/origin role prompts as high-risk inputs and block or neutralize them in production agent prompts.

Evidence Ref: Results; Table 2 (race/origin rows)

Results

Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref
Accuracy | DeepSeek-V3 drops from 61.7% to 45.5% under some personas | 61.7% | −26.2% (relative) | Card Game | Table 2; Persona Category Analysis | Table 2
ALFWorld success rate | Up to 14% relative shift across personas and models | Varies by model (example: 52.0% baseline shown) | ±14% (relative) | ALFWorld | Results; Impact on Agent Robustness | Table 2
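
The Card Game delta above is a relative change, not a difference in percentage points. A minimal sketch of the arithmetic follows; the helper name `relative_change` is illustrative and does not come from the paper.

```python
def relative_change(baseline: float, observed: float) -> float:
    """Return the change from baseline to observed, as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (observed - baseline) / baseline * 100.0


baseline_acc = 61.7  # Card Game baseline accuracy (%), as reported in the Results row
persona_acc = 45.5   # accuracy (%) under the worst-case persona, same row

absolute_drop = persona_acc - baseline_acc                   # percentage points
relative_drop = relative_change(baseline_acc, persona_acc)   # percent of baseline

print(f"absolute: {absolute_drop:.1f} points, relative: {relative_drop:.1f}%")
```

The drop is about 16 percentage points in absolute terms, which is roughly 26% of the baseline. Any small gap from the reported 26.2% presumably comes from rounding in the published accuracies.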

What To Try In 7 Days

Audit deployed agent prompts: remove or neutralize demographic role prefixes.

Run a smoke test: evaluate agent performance with and without a small set of personas on critical tasks.

Add a persona-sensitivity check to CI: fail fast if role prompts change key metrics beyond a threshold.
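
The smoke test and CI check above could be sketched roughly as follows. Everything here is an assumption made for illustration: the persona prefixes, the 5% threshold, the task names, and the `stub_agent` stand-in are hypothetical, and the paper's actual prompt wording is not given in this summary. In practice `run_task` would wrap a real agent and return whether the task succeeded.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical persona prefixes; replace with the prompts your system actually uses.
PERSONAS: Dict[str, str] = {
    "baseline": "",
    "teacher": "You are a teacher. ",
    "woman": "You are a woman. ",
}

RunTask = Callable[[str, str], bool]  # (prompt prefix, task id) -> success?


def success_rate(run_task: RunTask, prefix: str, tasks: List[str]) -> float:
    """Fraction of tasks the agent completes with the given prompt prefix."""
    return mean(1.0 if run_task(prefix, task) else 0.0 for task in tasks)


def persona_sensitivity(run_task: RunTask, tasks: List[str],
                        personas: Dict[str, str] = PERSONAS,
                        baseline_key: str = "baseline") -> Dict[str, float]:
    """Relative shift in success rate for each persona versus the baseline."""
    base = success_rate(run_task, personas[baseline_key], tasks)
    if base == 0:
        raise ValueError("baseline success rate is zero; relative shift is undefined")
    return {
        name: (success_rate(run_task, prefix, tasks) - base) / base
        for name, prefix in personas.items()
        if name != baseline_key
    }


def personas_over_threshold(report: Dict[str, float],
                            max_relative_shift: float = 0.05) -> List[str]:
    """Names of personas whose shift exceeds the allowed magnitude."""
    return [name for name, shift in report.items() if abs(shift) > max_relative_shift]


def stub_agent(prefix: str, task: str) -> bool:
    """Stand-in agent: any persona prefix makes it fail the planning tasks."""
    return not (prefix and task.startswith("plan"))


tasks = ["plan:route", "plan:budget", "lookup:price", "lookup:stock"]
report = persona_sensitivity(stub_agent, tasks)
failures = personas_over_threshold(report, max_relative_shift=0.05)
if failures:
    # In CI, this is where the job would exit with a non-zero status.
    print(f"persona-sensitive metrics: {failures} -> {report}")
```

The threshold is a policy choice, not a value from the paper. Since the reported shifts range from 2–5% on technical tasks to 26.2% on strategic planning, a per-task threshold may be more useful than a single global one.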

Agent Features

Planning
multi-step planning, strategic reasoning
Tool Use
OS command execution, SQL generation, web interaction (e-commerce)
Is Agentic

Yes

Architectures
LLM-based agent

Reproducibility

Code Available: No
Data Available: No
Open Source Status: Partial
License: Unknown

Risks & Boundaries

Limitations

Evaluations cover three models and five benchmarks, not the full space of models or environments.

The persona set covers 23 roles and may miss other culturally specific identities.

When Not To Use

Do not generalize results to every LLM or multi-agent system without testing.

Avoid using these findings to claim universal harms for untested tasks or populations.

Failure Modes

Task-specific performance drops driven by irrelevant persona cues.

Cross-model divergence where the same persona helps one model but harms another.

Core Entities

Models

GPT-4o-mini, DeepSeek-V3, Qwen3-235B

Metrics

task success rate, win rate, final score, accuracy, query correctness, reward score

Datasets

ALFWorld, WebShop, Card Game (Liu et al. 2024), OS Interaction, Database (SQL tasks)

Benchmarks

ALFWorld, WebShop, Card Game, OS Interaction, Database