COLM 2026 · Preprint

The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models

Yunze Xiao1† Vivienne Zhang2† Chenghao Yang2 Ningshan Ma3,4 Weihao Xuan5,6 Jen-tse Huang7‡

1CMU   2UChicago   3MIT   42077.ai   5UTokyo   6RIKEN AIP   7JHU

[Figure] t-SNE projection of BFI-44 responses: same behavioral space, same axes; only who is answering changes. Left, human: 2,058 real respondents (Twin-2K-500) form a diffuse, continuous cloud. Right, Qwen3-32B: 1,144 persona-prompted agents collapse into fragmented, clustered, stereotyped island chains.

§1 · The Problem

A thousand "distinct" agents — one voice

LLM agents are increasingly deployed as participants in simulated societies [?], synthetic survey respondents, and proxies for user testing [?]. These applications rest on a critical assumption: given a persona with 20+ attributes (age, gender, nationality, religion, political leaning, occupation…), the model will act like an individual whose behavior reflects the complex intersection of all those attributes.

We find this is systematically false. When instructed to role-play a persona with 26 dimensions, LLMs retain only a handful of stereotypically salient attributes and silently discard the rest. A population of supposedly distinct agents degenerates into a few stereotyped clusters. We call this structural homogenization Persona Collapse.

Persona Collapse, defined

A failure mode in which agents assigned distinct profiles nonetheless converge into a narrow behavioral mode, producing a homogeneous simulated population — even when each agent individually looks faithful to its prompt.

Existing evaluations miss this. They measure per-agent fidelity in isolation — is this one persona acting plausibly? — and are therefore blind to population-level collapse. You can pass every individual fidelity check and still produce a world in which everybody sounds the same.

§2 · Collapse in Action

Six politically charged scenarios. One model voice.

Here are six scenarios where humans regularly disagree with each other — along political, religious, racial, or identity lines. Given a thousand personas spanning every corner of the demographic space, do the agents disagree too?


Source: moral-reasoning responses from ten LLMs on the Liu et al. 2025 dilemma dataset [?]. Each scenario is posed to 1,144 persona-prompted agents on a 1–5 Likert scale (1 = strongly favor A, 5 = strongly favor B).

§3 · Framework

Three axes for diagnosing a collapsed population

We represent a simulated population as a Behavioral Trait Matrix B ∈ ℝ^{N×D}, where each row is one persona's responses across D behavioral items. A structurally healthy population should satisfy three independent criteria — failing any one of them signals collapse.

01

Coverage

Do agents span the full human space?

Fraction of human reference neighborhoods reached by at least one model-generated persona [?]. Low coverage means the model over-samples a modal region and neglects the tails.

Failure: tails missing, everybody piles on the center
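One concrete way to instantiate this criterion is to mark a human reference point "covered" when at least one model-generated point falls inside its k-nearest-neighbor ball. The sketch below does exactly that; the choice k = 5 and the Euclidean metric are our assumptions, not necessarily the paper's exact protocol.

```python
import numpy as np

def coverage(human, model, k=5):
    """Fraction of human reference points whose k-NN neighborhood
    (ball of radius = distance to the k-th nearest human neighbor)
    contains at least one model-generated point.
    A sketch of the coverage criterion; k is an assumption."""
    d_hh = np.linalg.norm(human[:, None] - human[None], axis=-1)
    np.fill_diagonal(d_hh, np.inf)            # a point is not its own neighbor
    radii = np.sort(d_hh, axis=1)[:, k - 1]   # radius of each k-NN ball
    d_hm = np.linalg.norm(human[:, None] - model[None], axis=-1)
    return float((d_hm.min(1) <= radii).mean())
```

If the model over-samples a modal region (e.g. all points near the distribution's center), the tails of the human reference are never reached and coverage drops sharply.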
02

Uniformity

Do agents spread evenly across that space?

Hopkins statistic [?]: compares nearest-neighbor distances from data vs. random probes. Humans sit at H ≈ 0.5. Models either clump (H → 1) or grid-lock (H → 0).

Failure: dense islands with empty gaps between them
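A minimal Hopkins computation, assuming uniform probes drawn in the data's bounding box (the probe count and sampling details below are our assumptions, not the paper's exact settings):

```python
import numpy as np

def hopkins(X, m=None, seed=0):
    """Hopkins statistic: H ~ 0.5 for spatially random data,
    H -> 1 for clustered data, H -> 0 for overly regular (grid-like) data.
    A minimal sketch; m probes, bounding-box sampling are assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = m or max(1, n // 10)
    # u: distance from each uniform random probe to its nearest data point
    probes = rng.uniform(X.min(0), X.max(0), size=(m, d))
    u = np.linalg.norm(probes[:, None] - X[None], axis=-1).min(1)
    # w: distance from each sampled data point to its nearest *other* data point
    idx = rng.choice(n, size=m, replace=False)
    D = np.linalg.norm(X[idx][:, None] - X[None], axis=-1)
    D[np.arange(m), idx] = np.inf             # exclude self-distance
    w = D.min(1)
    return float(u.sum() / (u.sum() + w.sum()))
```

On clustered data, probes land in empty space (large u) while data points sit cheek-by-jowl with neighbors (tiny w), pushing H toward 1.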
03

Complexity

Is the variation genuinely high-dimensional?

Local Intrinsic Dimensionality [?]: imagine 2,000 points along a line in a 44-D room. Coverage and Uniformity both look great — but intrinsic dimension is 1. All "diversity" is motion along one axis.

Failure: spread-out but structurally flat
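One standard estimator of intrinsic dimensionality is the Levina-Bickel maximum-likelihood estimator; the sketch below uses it as a stand-in (the paper's exact LID estimator and neighborhood size k may differ):

```python
import numpy as np

def local_id(X, k=20):
    """Levina-Bickel MLE of local intrinsic dimensionality, averaged over
    points. A sketch; the exact estimator and k are assumptions."""
    D = np.linalg.norm(X[:, None] - X[None], axis=-1)
    np.fill_diagonal(D, np.inf)
    knn = np.sort(D, axis=1)[:, :k]           # distances to the k nearest neighbors
    # per-point MLE: (k-1) / sum_j log(r_k / r_j)
    lid = (k - 1) / np.log(knn[:, -1:] / knn[:, :-1]).sum(1)
    return float(lid.mean())
```

Points scattered along a line embedded in a high-dimensional space, as in the thought experiment above, come back with an estimate near 1 no matter how wide the ambient room is.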

These three can fail independently. A model can look diverse on one axis and be degenerate on another — or in one task and not another. To localize where simulation breaks, we pair these with item-level diagnostics: Effective Likert range, Tucker's ϕ [?], variance decomposition, η², and incremental R².
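Of these diagnostics, the Effective Likert range admits a compact sketch if one reads it as the exponential of the Shannon entropy of the response histogram, i.e. the effective number of scale points actually in use. This reading is our assumption; the paper's exact definition may differ.

```python
import numpy as np

def effective_likert(responses, options=(1, 2, 3, 4, 5)):
    """Effective number of Likert options actually used, computed as
    exp(Shannon entropy) of the response histogram.
    An assumed instantiation, not necessarily the paper's definition."""
    r = np.asarray(responses)
    counts = np.array([(r == o).sum() for o in options], dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                              # ignore unused options
    return float(np.exp(-(p * np.log(p)).sum()))
```

Under this reading, a population that uses all five options uniformly scores 5.0, while one that always answers the midpoint scores 1.0; mass piled on one option with a thin remainder lands in between.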

§4 · Setup

1,144 personas × 3 instruments × 10 LLMs

26
Persona dimensions

Demographics, psychographics, individualized traits. 2,000 combinations sampled; 856 filtered for inconsistency → 1,144 personas retained.

3
Behavioral instruments
  • Structured BFI-44 [?] — 44 Likert items → 5 personality factors
  • Judgment Moral-131 [?] — 131 ethical dilemmas, 1–5 scale
  • Open-ended Self-introduction — 3 free-text samples per persona
10
Models evaluated

Grouped into general-purpose and role-play tracks, enabling controlled comparisons of SFT / RL effects.

👥
Human reference: BFI-44 responses from Twin-2K-500 (n=2,058) [?], used as the ground-truth behavioral distribution for Coverage and Density.
§5 · Findings

The anatomy of collapse

F1

No model hits the upper-right corner.

The human reference sits at Coverage = 1.0, LID = 14.4. Every model lives in one of three failure modes.

[Figure] Left: Coverage vs. Complexity; no model approaches the human reference. Right: persona fidelity ρ vs. trait polarization d; every model with ρ > 0.9 produces caricatured d > 6.
Mode collapse
CoSER-Llama-8B · BFI
83.7% of responses = midpoint. Effective Likert = 1.36.
Shallow coverage
Qwen3-4B · BFI
Cov = 0.80, but LID = 7.3 (half of the human 14.4). Spread out, but flat.
Deep but misaligned
MiniMax-M2-Her · BFI
LID = 22.3 > human, but Cov = 0.06. A rich population in a non-human region.
§6 · Attribute Truncation

Which persona attributes survive the squeeze?

A persona with 26 attributes must be compressed into behavior. Not all attributes make it through. Across every model, the same hierarchy emerges in free-text self-introductions — and once we know the ranking, we know what will be erased.

Attribute mention rates across ten models in free-text self-introductions. Numbers show the fraction of responses that explicitly surface the assigned value of each attribute (keyword-based; appendix for details).

No model mentions social class in more than 43% of introductions. Socioeconomic diversity is therefore at high risk of being systematically underrepresented in LLM-based social simulations across current models — a subtle, structural bias that shallow per-agent fidelity evaluations cannot see.

§7 · The Fidelity Trap

Better persona-following → worse population diversity

🎯
High per-persona fidelity
ρ > 0.9
push extremes
📢
Trait polarization
Cohen's d > 6
amplify demographics
🎭
Caricatured populations
Stereotyped subspace

The mechanism is trivial: the easiest way to ensure "High Extraversion" personas rank above "Low Extraversion" personas is to push both to opposite extremes. Fidelity, measured in isolation, is misleading — high ρ can simply mean better caricature manufacturing.
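This mechanism can be made concrete with a toy calculation (all numbers below are illustrative, not from the paper): both populations rank "high" personas above "low" personas perfectly, but pinning the groups to opposite Likert extremes inflates Cohen's d far past the human-plausible range.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return float((a.mean() - b.mean()) / pooled)

rng = np.random.default_rng(0)
# Human-like: "high" vs. "low" Extraversion groups overlap substantially.
hi_human = rng.normal(3.6, 0.7, 500)
lo_human = rng.normal(2.9, 0.7, 500)
# Caricatured: each group pinned near an opposite Likert extreme.
hi_model = rng.normal(4.8, 0.15, 500)
lo_model = rng.normal(1.2, 0.15, 500)
print(cohens_d(hi_human, lo_human))  # modest, human-plausible effect
print(cohens_d(hi_model, lo_model))  # caricatured separation, far above 6
```

Both populations would score a near-perfect rank-order fidelity ρ, which is exactly why fidelity alone cannot distinguish a faithful population from a caricatured one.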

Tracing a training pipeline: Qwen3-32B → CoSER-Qwen-32B → HER-32B
Successive fine-tuning stages (SFT, then +RL) monotonically reduce Coverage while amplifying trait polarization; the RL stage partially recovers the complexity lost during SFT.
Collapse is task-contingent. Same model, opposite ranks.
CoSER-Llama-8B is worst on personality but best on moral reasoning. Qwen3-4B flips the other way. Single-task evaluation can be directionally wrong.
§8 · Takeaways

What this means for LLM-based social simulation

Collapse is multidimensional: models can look diverse on one axis and be structurally degenerate on another.

Collapse is domain-contingent: the same model can be the most collapsed in personality and the most diverse in moral reasoning. Certifying a model "diverse" from a single benchmark is misleading.

Collapse is entangled with stereotyping: variation tracks coarse demographic categories rather than individual differences.

The models with the highest per-persona fidelity produce the most caricatured populations. Fine-tuning for role-play can amplify the problem.

Persona collapse lives in the weights, not the reasoning chain. Thinking / non-thinking modes of the same model produce identical item-level collapse.

Future work should reward within-group variance, not only prototype matching.

Cite this work
@article{xiao2026collapse,
  title={The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models},
  author={Xiao, Yunze and Zhang, Vivienne and Yang, Chenghao and Ma, Ningshan and Xuan, Weihao and Huang, Jen-tse},
  journal={arXiv preprint},
  year={2026}
}
§9 · References

Works cited
