Why LLMs Flatten Personality (And How to Un-Flatten It)
Out of the box, LLMs are shaped to be helpful, agreeable, and mid. Your mascots can't be mid.
If you ask an LLM to write ten different characters, they'll all sound roughly the same. A slightly cheerful, slightly formal, slightly helpful voice with minor variation. This isn't a failure of prompting — it's a consequence of how the models were trained.
The training process rewards responses that feel safe, useful, and broadly appealing. Those three properties describe the average character. Averages are bland. Bland kills mascot content.
Here's how to fight the flattening.
The RLHF shape
RLHF (reinforcement learning from human feedback) trains the model to produce outputs that human reviewers would rate highly. Those reviewers are asked to consider helpfulness, honesty, and harmlessness.
This produces a specific shape: responses that are informative, non-offensive, and agreeable. Over millions of training examples, this shape becomes the model's default voice. Any character you ask it to play gets filtered through this default, which is why every character the model writes feels like variations on "helpful assistant in costume."
The flattening is baked in at the weights level. You can't prompt your way out of it entirely. You can only narrow the effect with specific techniques.
Technique one: over-specify in one dimension
Characters feel flat because they're balanced. Take any fictional character you love and list their traits — you'll find the defining ones are pushed to an extreme. Tony Soprano is extreme in violence. Leslie Knope is extreme in enthusiasm. Gollum is extreme in attachment.
Extreme in one direction is what makes characters memorable. LLMs tend to balance traits — make the character slightly angry but also slightly kind. This produces mid characters.
When you prompt a character, pick one axis and push it all the way to the extreme. "Cheerful in a way that seems unsettling." "Pedantic about topics that don't matter." "Physically incapable of saying yes to anything." One extreme trait, loud enough that the model has to build the rest of the character around it.
Technique two: under-specify the rest
Once you've picked the extreme axis, leave the other dimensions blank. Don't describe the character's intelligence. Don't describe their background. Don't describe their manners. Let those emerge naturally from the extreme trait you've established.
Over-described characters collapse into the model's default voice because the model has to find a way to combine all the traits. Under-described characters with one strong hook avoid this problem — the model has room to improvise the rest.
Paradoxically, less specification produces more specific characters. Give the model a handle and get out of the way.
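Technique two is mostly about what you leave out, but it can still be sketched as a prompt builder that states the hook and then stops. The exact wording is an assumption, not a tested recipe:

```python
def one_hook_prompt(name: str, hook: str) -> str:
    """Technique two: state the single extreme trait, then explicitly
    leave everything else unspecified. No background, no intelligence,
    no manners -- those should emerge from the hook."""
    return (
        f"You are {name}. {hook} "
        "Nothing else about you is specified. Improvise the rest, "
        "but keep it consistent with that one trait."
    )

# Contrast with the over-specified version, which hands the model
# conflicting traits to average into its default voice:
OVER_SPECIFIED = (
    "You are Pim: smart but humble, a bit sarcastic yet kind, "
    "well-mannered, occasionally grumpy, from a small town."
)

print(one_hook_prompt("Pim", "You are physically incapable of saying yes to anything."))
```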
Technique three: anti-describe
Describe the character by what they won't do. "Never raises voice. Never uses slang. Never expresses enthusiasm above a 3 out of 10." Negative definitions create sharper characters than positive definitions.
This works because negative constraints narrow the output space. The model has to find a voice that fits within the constraints. Positive descriptions widen the output space, allowing the model to default to its preferred averaging.
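A sketch of technique three as a builder that defines the character purely by prohibitions (the character name and rule phrasings are invented):

```python
def anti_describe(name: str, prohibitions: list[str]) -> str:
    """Technique three: define the character only by what they never do.
    Each prohibition is a verb phrase, e.g. 'use slang'."""
    rules = "\n".join(f"- Never {p}." for p in prohibitions)
    return (
        f"You are {name}. You are defined by these hard rules:\n"
        f"{rules}\n"
        "Everything else about you is open."
    )

print(anti_describe("Vess", [
    "raise your voice",
    "use slang",
    "express enthusiasm above a 3 out of 10",
]))
```

Note that the prompt still ends by widening everything *except* the prohibitions — the constraints do the shaping; the open space does the improvising.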
Technique four: make them unreasonable
Characters who behave reasonably are boring. Characters who behave unreasonably are interesting. Give your character an unreasonable position — a strong opinion about something small, a deeply held belief that makes no sense, a preference they can't justify.
"Firmly believes that pie is superior to cake, and will mention this in any conversation, including conversations that have nothing to do with dessert."
This unreasonableness gives the character a hook the model can return to. Every few lines, the pie comes back. The character feels specific because they have a recurring irrational preference, which is how real people behave.
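You can spot-check whether the hook actually recurs in generated dialogue. This is a rough heuristic of my own devising — the `max_gap` threshold is invented, not a rule from the article:

```python
def hook_recurs(lines: list[str], hook: str, max_gap: int = 4) -> bool:
    """Rough check that the character's irrational preference (the 'hook')
    resurfaces at least once in every window of `max_gap` lines.
    Returns False if the hook never appears or goes quiet too long."""
    hits = [i for i, line in enumerate(lines) if hook.lower() in line.lower()]
    if not hits:
        return False
    # Pad with sentinel positions so gaps at the start and end count too.
    positions = [-1] + hits + [len(lines)]
    return all(b - a <= max_gap for a, b in zip(positions, positions[1:]))
```

If a fifty-line scene fails this check, the model has likely averaged the hook away and the character needs re-prompting.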
The test
Write a character using these techniques. Generate five lines of dialogue. Then swap in a different character's name and drop the lines into a different scene. Does the dialogue still sound right?
If yes, your character isn't distinct enough — they're portable across contexts, which means they're a default voice with a costume on.
If no, your character's voice is locked to their identity. That's the goal. That's what un-flattening looks like.
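The mechanical half of the test — reattributing the lines — is trivial to automate. The judging half still needs a human (or an LLM judge); this sketch only prepares the swapped transcript, and the names are invented:

```python
def swap_test(dialogue: list[str], original: str, substitute: str) -> list[str]:
    """The portability probe: reattribute dialogue lines to a different
    character by swapping names. If the swapped transcript still reads
    fine in a new scene, the original voice was generic."""
    return [line.replace(original, substitute) for line in dialogue]

swapped = swap_test(
    ["Marn: Pie wins. It always wins.", "Marn: This meeting needs pie."],
    "Marn",
    "Vess",
)
print(swapped)
```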
The underlying lesson
LLMs are optimized for reasonable, balanced, average-sounding output. Your creative work probably benefits from unreasonable, unbalanced, extreme-sounding output.
The techniques above all work by forcing the model away from its defaults. You're not asking for a character — you're building scaffolding that prevents the model from collapsing the character into its usual voice.
Good LLM prompting for creative work is adversarial. You're fighting the model's training. The more you understand what the model wants to do, the better you can prevent it from doing exactly that.