Giving AI a Distinct Voice Per Brand (Without Fine-Tuning)
How to coax different personalities out of the same model using structured briefs — and why negative constraints work better than positive ones.
The first hundred times I asked a model to write dialogue in a specific character's voice, the output was 80% the model's default voice and 20% whatever I'd asked for. Every character sounded vaguely like the same helpful AI intern in different Halloween costumes.
Then I figured out the trick. It's not about telling the model what the character is. It's about telling the model what the character isn't. Negative constraints carry more weight than positive descriptions, and it's not even close.
Why positive descriptions fail
When you write a brief that says "Ronald McDonald is cheerful, family-friendly, and slightly awkward," the model averages those adjectives together and produces a cheerful, family-friendly, slightly awkward voice. But so will every other character the model writes: cheerfulness, family-friendliness, and mild awkwardness are already the model's defaults. You've added nothing.
The output sounds like an AI playing a character, not the character itself. Readers can tell. They don't know why, but they can tell.
Why negative constraints work
Instead, try: "Ronald McDonald never raises his voice, never uses slang, never acknowledges his own strangeness, and refuses to mention McDonald's menu items by name."
The model now has four specific things it cannot do. It has to write around those restrictions, which forces it to find the character's actual register. The output sounds like someone being careful, which is exactly how Ronald sounds in commercials.
Negative constraints narrow the output space. Positive descriptions widen it. You want narrow.
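A negative-constraint brief is easy to keep as data and render into a system prompt. Here's a minimal sketch of that idea; the `build_brief` helper and the "never ..." phrasing convention are my own illustration, not a fixed API.

```python
def build_brief(name: str, never_rules: list[str]) -> str:
    """Render a character brief as a block of negative constraints."""
    lines = [f"You are writing dialogue as {name}."]
    # Each rule is phrased as something the character never does,
    # so the model has to write around it.
    lines += [f"- {name} never {rule}." for rule in never_rules]
    return "\n".join(lines)

brief = build_brief("Ronald McDonald", [
    "raises his voice",
    "uses slang",
    "acknowledges his own strangeness",
    "mentions McDonald's menu items by name",
])
print(brief)
```

The point of keeping rules as a list rather than a prose paragraph is that you can add, drop, or tighten a single constraint without rewriting the whole brief.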
The three-rule structure
The brief structure that works for me is three rules per character. Always three. Two is too loose; four starts getting self-contradictory.
For the Burger King:
- Never speaks.
- Always smiles, regardless of context.
- Responds to everything with gestures only.
For Ronald:
- Never mentions product.
- Never raises his voice.
- Never acknowledges the King's strangeness, even when directly confronted.
Those two three-rule blocks will produce better dialogue than a paragraph of vibe description. I've tested this dozens of times. The structured version wins every time.
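The two three-rule blocks above can be kept as structured data and assembled into one system prompt for a scene. A sketch, assuming a simple "Name:\n- rule" block format of my own invention (the helper names are illustrative):

```python
CHARACTERS = {
    "The Burger King": [
        "Never speaks.",
        "Always smiles, regardless of context.",
        "Responds to everything with gestures only.",
    ],
    "Ronald McDonald": [
        "Never mentions product.",
        "Never raises his voice.",
        "Never acknowledges the King's strangeness, even when directly confronted.",
    ],
}

def render_system_prompt(characters: dict[str, list[str]]) -> str:
    """Render each character's rules as a block; enforce exactly three rules."""
    blocks = []
    for name, rules in characters.items():
        assert len(rules) == 3, f"{name}: exactly three rules per character"
        blocks.append(name + ":\n" + "\n".join(f"- {r}" for r in rules))
    return "Write a dialogue between these characters.\n\n" + "\n\n".join(blocks)

prompt = render_system_prompt(CHARACTERS)
print(prompt)
```

The `assert` on rule count is the "always three" discipline made mechanical: a brief that drifts to two or four rules fails loudly instead of quietly degrading the voice.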
Testing the voice
After you've written the rules, generate a five-line exchange. Then do this: swap the characters' names in the dialogue. If the lines still read correctly with swapped names, your voices aren't distinct enough. Go back and tighten the rules.
When the voices are properly distinct, swapping names produces visible absurdity — lines that obviously belong to one character and not the other. That's the signal you've locked in the characters.
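The name-swap test itself can be mechanized: swap the two speaker labels in a generated exchange, then eyeball whether the lines still read correctly. A sketch, assuming the common "Name: line" dialogue format (the exchange below is a made-up example, not model output):

```python
def swap_speakers(dialogue: str, a: str, b: str) -> str:
    """Swap speaker labels a and b at the start of each dialogue line."""
    swapped = []
    for line in dialogue.splitlines():
        if line.startswith(a + ":"):
            swapped.append(b + line[len(a):])
        elif line.startswith(b + ":"):
            swapped.append(a + line[len(b):])
        else:
            swapped.append(line)  # narration or stage direction: leave as-is
    return "\n".join(swapped)

exchange = (
    "Ronald: Welcome, friend. We're so glad you're here.\n"
    "King: *smiles and gestures toward the door*\n"
    "Ronald: Of course. Right this way."
)
print(swap_speakers(exchange, "Ronald", "King"))
```

If the swapped version reads just as plausibly as the original, the voices aren't distinct; here, a speaking King and a gesturing Ronald should look obviously wrong.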
The fine-tuning trap
A lot of teams think the solution is fine-tuning. Collect examples of the character's voice, train a custom model, call it a day.
It's almost never worth it. Fine-tuning costs weeks. Prompt engineering with structured rules takes an afternoon, produces 90% of the quality, and can be updated without retraining. Unless you're running millions of generations a month, fine-tuning is the wrong tool.
The right tool is discipline in the brief. Three negative rules per character. Narrow the output space. Your mascots will sound like themselves for the first time.