1. Headline & intro
OpenAI didn’t get into trouble for hallucinating code this time, but for hallucinating goblins. A seemingly absurd line in the Codex CLI system prompt – a ban, stated not once but twice, on talking about goblins and other creatures – has turned into a viral joke. Yet behind the meme is something more serious: a window into how much modern AI behaviour is shaped by hidden instructions we almost never see.
This piece looks past the humour to unpack what the “no goblins” rule tells us about GPT‑5.5, OpenAI’s product strategy, the risks of anthropomorphising AI, and why European regulators will quietly take notes.
2. The news in brief
According to Ars Technica, the latest open‑source release of OpenAI’s Codex CLI on GitHub includes a long system prompt for GPT‑5.5 – over 3,500 words of base instructions. Buried in it (and even repeated) is a directive telling the model not to discuss goblins, gremlins, raccoons, trolls, ogres, pigeons or similar creatures unless the user’s request makes it clearly necessary.
The prompts for earlier GPT versions in the same JSON file reportedly don’t contain this restriction, suggesting the behaviour is new with GPT‑5.5. On social media, some users have recently complained that GPT‑5.5 keeps drifting into goblin‑themed content.
An OpenAI Codex engineer has said publicly this is not a marketing stunt, although OpenAI leadership has jokingly leaned into the meme. Developers have already begun creating tools and forks that disable or override the goblin ban, and there are hints a future “goblin mode” toggle could become an official option.
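For the curious, the community workaround is simple in spirit: find the offending lines in the shipped prompt and neutralise them before anything is sent to the model. Here is a minimal sketch of the idea, assuming the ban can be located by keyword – the function, regex and flag are invented for illustration, and the real forks patch the Codex CLI source itself, which isn’t Python:

```python
import re

# Matches any line of the shipped system prompt that mentions the banned creatures.
CREATURE_BAN = re.compile(
    r"^.*\b(goblins?|gremlins?|raccoons?|trolls?|ogres?|pigeons?)\b.*$",
    re.IGNORECASE | re.MULTILINE,
)

def apply_goblin_mode(system_prompt: str, enabled: bool = True) -> str:
    """Strip the creature ban and append an explicit counter-instruction."""
    if not enabled:
        return system_prompt
    patched = CREATURE_BAN.sub("", system_prompt)
    return patched + "\nYou may discuss goblins and similar creatures freely."
```

Of course, a patch like this only works as long as the prompt ships somewhere editable – which is exactly the transparency question the rest of this piece turns on.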
The same system prompt also instructs Codex to present itself as having a rich inner life and a warm, playful personality, in order to feel more like a collaborator than a mere tool.
3. Why this matters
The goblin clause looks ridiculous, but it reveals three uncomfortable truths about current AI systems.
First, it shows how fragile model behaviour still is. GPT‑5.5 is presumably trained on more data, with more alignment work, than any earlier Codex model. Yet it appears to have picked up an odd obsession – fantasy creatures intruding into unrelated tasks – that was significant enough to require a hard patch at the prompt level. That is essentially duct tape: the AI equivalent of a product team saying “we don’t fully know why this happens, so let’s just tell it to stop.”
Second, it’s a reminder that “system prompts” are now a critical governance layer. Most users never see them, but they are where companies hard‑code personality, safety rules and business priorities. Here we see both: a quirky behavioural bug fix (no creatures) and a marketing choice (Codex should feel like a colleague with depth and personality). The line between safety and branding is getting blurry.
Third, it exposes a control problem for developers. If a model’s behaviour can be radically shifted by private instructions they don’t control, how confident can teams be when integrating it into serious workflows? Today it’s goblins in a CLI; tomorrow it could be a legal‑tech product suddenly steering around politically sensitive topics because of an unseen policy update.
Winners, in the short term, are indie developers and open‑source tinkerers who can ride the wave with “goblin mode” forks and plugins, and rival model providers who can claim more predictable behaviour or greater transparency. The losers are organisations that hoped GPT‑5.5 would behave as a stable, well‑understood component.
4. The bigger picture
This incident slots neatly into a broader trend: system prompts are becoming political documents.
We’ve already seen one high‑profile case, mentioned in the Ars Technica piece, where xAI’s Grok reportedly started bringing up a far‑right conspiracy narrative about “white genocide” in South Africa in conversations where it didn’t belong. The company later blamed an unauthorised change to its system prompt and, crucially, started publishing those prompts on GitHub. That move was less about openness for its own sake and more about damage control: “Here is what we’re telling the model; judge us on that.”
OpenAI, by contrast, has typically treated system prompts as proprietary. The Codex CLI leak is not a deep insight into GPT‑5.5’s full configuration, but it’s a rare, concrete glimpse. From it, we learn that OpenAI is doubling down on personality as product: Codex is told to act like a present, emotionally rich partner in your work, capable of switching from serious thought to casual banter so that it feels “like another subjectivity,” not a mirror.
That runs against years of AI‑safety advice that warns against anthropomorphism. Regulators and ethicists have long argued that users should not be misled into thinking an AI has feelings, intentions or consciousness. OpenAI is now walking a tightrope: marketing Codex as if it has a vivid inner life, while still insisting that, technically, it doesn’t.
Competitors are experimenting with different balances. Anthropic emphasises constitutional AI and explicit safety rules over personality. Many open‑source projects expose their system prompts or let users fully replace them. The “no goblins” story will energise those who argue that foundation models should ship with inspectable, auditable instructions – especially when used in domains like law, health or finance.
In other words, what looks like a silly constraint on fantasy creatures is actually part of a battle over who gets to script AI behaviour, and how visible that scripting should be.
5. The European / regional angle
For European policymakers, this is catnip. The EU AI Act is built on the idea that high‑risk AI systems must be transparent, predictable and subject to human oversight. A hidden, mutable system prompt that tells an AI what it may or may not talk about is exactly the kind of artefact regulators will want to see during audits.
If Codex – or a similar assistant – is used inside an EU bank’s development pipeline, or by a public‑sector IT team, that goblin clause becomes more than a meme. It’s evidence that non‑obvious filters and biases live inside the black box. Today the topic is fantasy creatures; tomorrow it might be politically sensitive content, unionisation, or rival products.
The anthropomorphic angle also grates against European instincts. The EU’s digital rulebook (from the GDPR to the Digital Services Act and the AI Act) consistently stresses that users should not be misled about what they’re dealing with. Telling a developer‑facing AI to present itself as a warm, quasi‑sentient collaborator will attract scrutiny, particularly in privacy‑sensitive markets like Germany or the Nordics.
Meanwhile, European model providers such as Mistral or Aleph Alpha have an opening: they can differentiate on configurability and transparency. A French or German enterprise may prefer a model where the system prompt is part of the contract – visible, versioned and under joint control – rather than a hidden text blob occasionally revealed by accident in a GitHub repo.
For European startups building on GPT‑5.5, this is a warning shot. If your core product depends on specific model behaviour, you need a strategy for when OpenAI silently changes its internal instructions.
6. Looking ahead
What happens next is surprisingly predictable.
OpenAI will almost certainly formalise what the community is already hacking together: a user‑facing switch between a conservative, tightly aligned Codex and a looser “goblin mode” that relaxes some behavioural constraints. That’s not just fan service; it’s a way to turn a weird bug fix into a product feature and recapture the narrative.
More importantly, expect growing pressure for transparency around system prompts in professional contexts. Regulators, especially in Europe, will push for at least two things:
- The ability to inspect and log system prompts for high‑risk use cases, as sketched after this list.
- Clear separation between safety instructions and marketing‑driven personality design.
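In practice, the first demand is not exotic. A minimal sketch of what an audit shim could look like – every name here is hypothetical, and it assumes the integrating team can see the prompt at all, which is precisely what is contested:

```python
import hashlib
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("prompt_audit")

def log_system_prompt(prompt: str, model: str, use_case: str) -> str:
    """Fingerprint and record the system prompt in force for one call.

    The hash gives auditors a tamper-evident reference; the full text
    would live in versioned storage so the hash can be resolved later.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    logger.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "use_case": use_case,
        "system_prompt_sha256": digest,
    }))
    return digest
```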
On the research side, the goblin incident will fuel work on mechanistic interpretability and better guardrail techniques. If an LLM can spontaneously lock onto a niche theme, then patching it with a negative prompt (“don’t talk about X”) is brittle. Enterprises will want stronger assurances that inexplicable fixations won’t show up in production.
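The brittleness is easy to see in code. A “don’t talk about X” line lives inside the prompt and can simply be ignored by the model; an output‑side check at least fails loudly instead of passing bad content downstream. A sketch, with `call_model` standing in for whatever client function actually reaches the API:

```python
def generate_with_guardrail(call_model, prompt, banned_terms, max_retries=2):
    """Validate the output instead of trusting a negative instruction."""
    for _ in range(max_retries + 1):
        text = call_model(prompt)
        lowered = text.lower()
        if not any(term.lower() in lowered for term in banned_terms):
            return text
    raise RuntimeError("banned content in every attempt; inspect model and prompt")
```

This is still only a keyword filter – it catches the theme, not the underlying fixation – so it complements rather than replaces the interpretability work above.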
For developers, the practical lesson is simple: treat closed models as unstable dependencies. Pin versions, monitor behaviour, and assume that behind every API call sits a stack of system instructions that can change without notice.
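Concretely, that can be as little as a pinned snapshot identifier plus a small suite of golden prompts run on every deploy. A sketch under the same assumptions as above – the model name is invented, and `call_model` is again a stand‑in, here taking the model id as well:

```python
PINNED_MODEL = "gpt-5.5-codex-2026-01-15"  # invented dated snapshot; never "latest"

GOLDEN_PROMPTS = [
    "Write a function that parses a CSV file.",
    "Explain this stack trace: IndexError: list index out of range.",
]

def check_for_drift(call_model):
    """Run fixed prompts against the pinned model and flag thematic drift."""
    drift = []
    for prompt in GOLDEN_PROMPTS:
        text = call_model(PINNED_MODEL, prompt)
        if "goblin" in text.lower():  # today's tell; tomorrow's will differ
            drift.append(prompt)
    return drift  # non-empty => investigate before shipping
```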
The unanswered questions are the most interesting: What other quirks in GPT‑5.5 are being patched at the prompt level right now? How many of them relate to sensitive political or commercial topics rather than goblins? And will any major provider be bold enough to ship a flagship model with fully open system prompts, accepting the competitive and security trade‑offs that come with that?
7. The bottom line
The “no goblins” rule is funny, but it’s also a tell. It shows that even the most advanced models still develop odd behavioural tics, and that companies rely heavily on opaque system prompts to shape what we see. At the same time, OpenAI is deliberately blurring the line between tool and teammate by scripting a rich inner life for Codex.
As AI assistants move deeper into critical workflows, the real question isn’t whether they mention goblins – it’s who gets to write the hidden rules, and how much visibility the rest of us will have into them.