Analysis of AI Tendency to Reproduce Common Fake Names in Generated Content

Key Findings

Large language models (LLMs), when prompted to create a fictional name, frequently default to the same small set of names, even across different users and sessions.

The Recurring Names

According to the article, the fake names Elena Vasquez and Marcus Chen appear repeatedly across many AI-generated stories. Each pairs a common first name with a common surname.

Why It Happens

The model's objective is to produce the most probable response, not a unique one.

The author explains that LLMs are trained on a vast corpus of text from the internet. When asked to generate a name, the model aims for statistical likelihood rather than novelty. It selects common, plausible, and inoffensive names that appear often in its training data.
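
The effect can be illustrated with a toy sampler. This is only a sketch: the weights below are invented for illustration and are not taken from the article or from any real model, and every name other than the two the article mentions is hypothetical. It shows how a distribution with two dominant entries returns the same names again and again, and how lowering the sampling temperature concentrates the output further.

```python
import random
from collections import Counter

# Invented weights, for illustration only. Only the first two names come
# from the article; the rest are hypothetical fillers.
NAME_WEIGHTS = {
    "Elena Vasquez": 40,
    "Marcus Chen": 35,
    "Priya Nair": 10,
    "Tomasz Wieczorek": 8,
    "Ingrid Solberg": 7,
}


def sample_names(weights, n, temperature=1.0, seed=0):
    """Draw n names from the weighted distribution.

    Raising each weight to the power 1/temperature is equivalent to
    dividing log-probabilities by the temperature, so values below 1.0
    sharpen the distribution toward the most likely name.
    """
    rng = random.Random(seed)
    names = list(weights)
    scaled = [weights[name] ** (1.0 / temperature) for name in names]
    return rng.choices(names, weights=scaled, k=n)


# Simulate 1000 independent requests at two temperatures.
counts = Counter(sample_names(NAME_WEIGHTS, 1000))
sharp_counts = Counter(sample_names(NAME_WEIGHTS, 1000, temperature=0.1))

print(counts.most_common())
print(sharp_counts.most_common())
```

Because each simulated request is drawn independently, the repetition here comes from the shape of the distribution alone and not from any memory of earlier draws.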

How to Get More Original Names

To obtain less repetitive names, the author suggests using a prompt that explicitly instructs the model to avoid common defaults and to self-check for repetition.

The sample prompt provided is:

"Generate a fictional person’s name. Avoid highly common placeholder names, stock character names, or names that you have frequently used in prior responses. Evaluate whether it resembles a generic default name that an AI would commonly generate. If so, discard it and generate a different name. Show me the final generated name."
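
The same self-check can also be enforced outside the model. The sketch below is an assumption of this summary, not something the article describes: `generate` is a hypothetical callable standing in for whatever LLM client is in use, and the blocklist contains only the two names the article mentions. The loop sends the prompt, rejects any candidate that matches a known default or a name already used, and retries a limited number of times.

```python
from typing import Callable, Optional, Set

# The sample prompt quoted above.
PROMPT = (
    "Generate a fictional person’s name. Avoid highly common placeholder "
    "names, stock character names, or names that you have frequently used "
    "in prior responses. Evaluate whether it resembles a generic default "
    "name that an AI would commonly generate. If so, discard it and "
    "generate a different name. Show me the final generated name."
)

# Assumed blocklist: just the two names the article reports as recurring.
KNOWN_DEFAULTS = {"elena vasquez", "marcus chen"}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so comparisons are consistent."""
    return " ".join(name.lower().split())


def fresh_name(
    generate: Callable[[str], str],  # hypothetical stand-in for an LLM call
    seen: Set[str],
    max_tries: int = 5,
) -> Optional[str]:
    """Return the first candidate that is neither a known default nor
    already in `seen`, or None if every attempt is rejected."""
    for _ in range(max_tries):
        candidate = generate(PROMPT).strip()
        key = normalize(candidate)
        if key and key not in KNOWN_DEFAULTS and key not in seen:
            seen.add(key)
            return candidate
    return None


# Demonstration with canned replies instead of a real model.
canned = iter(["Elena Vasquez", "Marcus Chen", "Odalys Brennagh"])
used: Set[str] = set()
result = fresh_name(lambda prompt: next(canned), used)
print(result)
```

Keeping the `seen` set across calls also prevents a single application from reusing a name it has already accepted, which a prompt alone cannot guarantee across separate sessions.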

The Research Behind It

The article references a study titled "The Ghost Couple: Correlated LLM Name Priors And Their Haunting of the Web and Academic Publishing" (arXiv, June 1, 2026) by Michał Brzozowski and Neo Christopher Chung.

The study observes that LLMs default to a small set of high-probability names. These names then carry over into web content generated by AI, leaving behavioral fingerprints across the internet.