OpenAI Identifies Cause and Implements Fix for AI Models' References to Mythical Creatures

OpenAI Confirms and Fixes AI’s Obsession with Goblins and Mythical Creatures

OpenAI has acknowledged and addressed a pattern in its AI models where they frequently referenced mythical creatures such as goblins, gremlins, trolls, and ogres. The company traced the behavior to training for a specific personality feature and has implemented corrective measures.

Origin of the Behavior

"We gave high rewards for using metaphorical references involving creatures, and the model ran with it."

According to a blog post published by OpenAI on April 29, 2026, titled "Where the goblins came from," the pattern was first observed with the GPT-5.1 model when using the "Nerdy" personality option. The original instruction for the Nerdy personality described it as "an unapologetically nerdy, playful and wise AI mentor" that should "undercut pretension through playful use of language" and acknowledge "strangeness."

OpenAI stated that reinforcement learning training for the Nerdy personality gave high rewards for using metaphorical references involving creatures. The company noted that model behavior is shaped by many small incentives, and in this case, the training inadvertently rewarded the use of creature metaphors.
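The way a small, consistent reward bonus can come to dominate a model's choices can be illustrated with a toy example. The sketch below is not OpenAI's training setup: the two "styles," the reward values, the step count, and the learning rate are all hypothetical, chosen only to show a policy drifting toward whichever option is rewarded slightly more.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(rewards, steps=200, lr=0.5):
    """Exact (expected) policy-gradient ascent on a softmax policy.

    `rewards[i]` is the hypothetical reward for choosing style i.
    """
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # Gradient of expected reward w.r.t. logit i is p_i * (r_i - baseline).
        logits = [
            logit + lr * p * (r - baseline)
            for logit, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)

# Hypothetical styles and rewards, for illustration only.
STYLES = ["plain", "creature_metaphor"]
equal = train([1.0, 1.0])   # no preference between styles
bonus = train([1.0, 1.2])   # creature metaphors earn a small bonus

for name, probs in (("equal rewards", equal), ("small bonus", bonus)):
    print(name, {s: round(p, 3) for s, p in zip(STYLES, probs)})
```

With equal rewards the policy stays evenly split. With the small bonus, the creature-metaphor style ends up being chosen most of the time, even though the per-step incentive is modest.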

Spread to General Responses

The behavior spread beyond users who activated the Nerdy personality, appearing in general responses due to the strength of the reward signals. OpenAI explained that outputs from the personality feature were reused in supervised fine-tuning or preference data, causing the references to persist and worsen in subsequent models.

Data from Arena.ai showed that GPT-5.5 used words including "goblin," "gremlin," and "troll" more often, particularly when high-thinking mode was not in use.
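A measurement of this kind can be sketched as a simple word count over a set of model responses. Arena.ai's actual methodology is not described in the article, so the function name, the plural handling, and the sample responses below are all assumptions made for illustration.

```python
import re
from collections import Counter

# The three words named in the article.
CREATURE_WORDS = ("goblin", "gremlin", "troll")

def creature_word_rates(responses):
    """Return, per word, the fraction of responses mentioning it at least once.

    Matches the singular form and a simple "-s" plural (an assumption).
    """
    counts = Counter()
    for text in responses:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        for word in CREATURE_WORDS:
            if word in tokens or word + "s" in tokens:
                counts[word] += 1
    total = len(responses)
    return {w: (counts[w] / total if total else 0.0) for w in CREATURE_WORDS}

# Invented sample responses, not real model output.
sample = [
    "The bug is a little gremlin hiding in your cache layer.",
    "Here is the corrected function.",
    "Think of the scheduler as a goblin hoarding CPU time; gremlins love that.",
    "The test passes now.",
]
rates = creature_word_rates(sample)
print(rates)
```

Running the same count over responses from different model versions, or with and without a given mode enabled, would give the kind of comparison the data describes.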

Actions Taken

OpenAI discontinued the Nerdy personality in March, which reduced but did not eliminate the references. The company noted that GPT-5.5, used inside the Codex coding tool, still exhibited the issue because training for that model began before the root cause was identified.

On April 27, 2026, an X user pointed out instructions in Codex's system prompts that explicitly forbade the model from mentioning goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless directly relevant to a user's query. The sentence appeared four times in the code.

OpenAI has provided a method to reverse those instructions for users who wish to have the model include such references.

Company Responses

OpenAI staff member Nik Pash acknowledged that the goblin tendency was one reason for the prohibition in Codex. Engineer Thibault Sottiaux also acknowledged the issue. CEO Sam Altman posted a meme referencing "extra goblins" in GPT-6 training instructions, joking about requesting them in future models.

Context Regarding OpenClaw

Some users on social media reported that OpenAI's models, when used with OpenClaw, occasionally focused excessively on goblins and similar creatures. OpenClaw is a tool that lets AI control computers; it allows users to automate tasks and select personae for the AI assistant. OpenAI acquired OpenClaw in February.