ChatGPT has quietly started quoting Elon Musk’s Grokipedia, an AI‑written, politically charged encyclopedia built by his company xAI. On its own, a few citations might sound like a minor implementation detail. But deciding which knowledge bases flow into mainstream AI systems is fast becoming one of the most consequential – and least transparent – choices in tech.
In this piece, we’ll unpack what is actually happening, why Grokipedia is so controversial, what it reveals about how OpenAI and others choose their sources, and why this should worry anyone in Europe (and beyond) who expects AI systems to be neutral, reliable and accountable.
The news in brief
According to reporting by The Guardian, later highlighted by TechCrunch, OpenAI’s latest ChatGPT model, GPT‑5.2, has been citing Grokipedia as a source in its answers.
Grokipedia is an AI-generated encyclopedia launched in October 2025 by Elon Musk’s xAI as an alternative to Wikipedia, after Musk repeatedly claimed that Wikipedia is biased against conservatives. Journalists quickly found that many Grokipedia entries looked heavily derived from Wikipedia, while others included ideologically charged or factually dubious claims – for example around HIV/AIDS, slavery and transgender people.
The Guardian tested ChatGPT and reported that GPT‑5.2 referenced Grokipedia nine times across more than a dozen questions. The citations reportedly appeared on relatively obscure topics, including contested claims about the historian Sir Richard Evans, rather than on headline issues where Grokipedia’s inaccuracies have already been widely documented. Anthropic’s Claude has also been observed citing Grokipedia in some answers. OpenAI told The Guardian it aims to pull from a wide variety of publicly available sources and viewpoints.
Why this matters
The story is not simply “ChatGPT used a bad source.” It’s about who gets to define reality inside the interfaces that increasingly mediate how we learn, decide and vote.
First, Grokipedia is not a neutral reference work that merely happens to be imperfect. It is a deliberately ideological fork of Wikipedia, created in a political context and already criticized for recycling large chunks of Wikipedia while weaving in culture‑war narratives and hostile framing of minorities. When that kind of project starts appearing as a citation in mainstream assistants, it gains a veneer of legitimacy it has not earned.
Second, OpenAI’s stance – drawing on a “broad range” of public sources – sounds open‑minded but hides a crucial question: where is the line between including diverse perspectives and laundering fringe propaganda into an interface most users still regard as an objective oracle? The more opaque the sourcing, the easier it is for highly partisan data sets to ride on the coattails of more reputable material.
There are clear winners and losers here. Musk and xAI gain free distribution for their worldview: their synthetic encyclopedia suddenly travels far beyond the X platform and the Grok chatbot. OpenAI and Anthropic gain a cheap, machine‑generated reference corpus they can consume without complex licensing negotiations. The losers are users who expect some baseline of editorial standards, communities that are already targets of hate campaigns, and the open‑knowledge projects – above all Wikipedia – that built their reputations on transparent processes and human moderation.
In effect, this controversy exposes how thin today’s guardrails still are. If a partisan AI wiki can slide into the sourcing mix without notice, what else is in there – and who is checking?
The bigger picture
Grokipedia inside ChatGPT sits at the intersection of several powerful trends.
One is the rise of ideological AI stacks. Over the past two years we’ve seen “unfiltered” right‑wing chatbots, religious chatbots, and nationalistic language models, all marketed as fixes for supposed liberal bias in Silicon Valley tools. Musk’s Grok – the chatbot behind Grokipedia – is part of that wave, openly leaning into edgier, sometimes aggressive personas. Grokipedia is its reference layer, and now that layer is bleeding into more mainstream models.
A second trend is the industry’s scramble for data. High‑quality, human‑edited text like Wikipedia is finite and subject to licensing disputes. Synthetic data – content generated by other models – is cheap, infinitely extensible and, in many jurisdictions, legally uncomplicated to reuse. An AI‑written encyclopedia like Grokipedia is therefore doubly attractive: it looks structured like Wikipedia but is fully “owned” by xAI. The temptation for other AI companies to quietly absorb such corpora is enormous, even if the epistemic quality is dramatically lower.
Third, we’ve been here before, just with different intermediaries. When Google search results surfaced conspiracy blogs alongside reputable outlets, the web spent a decade arguing about algorithmic responsibility. Social networks amplified polarized content and we discovered filter bubbles the hard way. Large language models are simply the next interface layer – but more deceptive, because they blend sources into authoritative‑sounding prose instead of showing you a list of links you can inspect.
Comparatively, Wikipedia’s socio‑technical model – messy talk pages, edit histories, explicit tags for disputed content – looks old‑fashioned but makes bias visible. Grokipedia reverses that: an opaque generative system produces apparently polished “articles” whose provenance and editorial choices are hard to audit. When such material is then consumed by yet another opaque system like ChatGPT, we stack two black boxes on top of each other.
The endgame, if nothing changes, is an epistemic landscape where each political tribe funds its own AI‑ready encyclopedia, and general‑purpose assistants blend them all without surfacing who is speaking.
The European / regional angle
For European users and institutions, this is not an abstract American culture war; it collides directly with emerging regulation.
The EU AI Act, politically agreed in 2023, places new obligations on “general‑purpose AI models” and on high‑risk uses of AI in areas like education, employment or public services. Providers must assess and mitigate systemic risks, including the spread of disinformation and discriminatory content. If a general‑purpose model quietly ingests a source known for biased or hostile coverage of protected groups, regulators will reasonably ask: what due diligence was done on that corpus, and where is the evidence that risks were evaluated?
At the same time, the Digital Services Act (DSA) is pushing large online platforms toward more transparency around how content is ranked and recommended. LLM‑based assistants deployed by big platforms in Europe will eventually face similar expectations: users, regulators and consumer groups will want to know which types of sources dominate the answers they see.
For European companies integrating ChatGPT or Claude into products – from customer support to edtech – Grokipedia raises a practical question: is it still acceptable to treat model outputs as a black box, or will enterprise buyers start demanding configurable source whitelists and verifiable provenance? Countries like Germany, with strong historical sensitivities around propaganda and hateful speech, are unlikely to be comfortable with assistants that might echo talking points from an AI encyclopedia built by the same company whose chatbot has already been embroiled in explicit deepfake scandals and inflammatory rhetoric.
There is also an opportunity here. Europe has rich, publicly funded knowledge infrastructures – national libraries, statistical offices, the European Open Science Cloud, and of course the multilingual communities around Wikipedia and its sister projects. Turning those into curated, audit‑ready “trusted corpora” for AI could become a competitive advantage for European providers – and a lever for policymakers to steer the information diet of AI systems used in critical contexts.
Looking ahead
Expect a few things to happen next.
First, OpenAI and Anthropic will be pressed to explain their source selection processes in much more detail. “We use a broad range of public sources” will not satisfy regulators, journalists or large customers for long. Concretely, we’re likely to see:
- more granular citation interfaces, where users can expand an answer and see not just links but also labels like “user‑generated wiki,” “state publisher” or “partisan think tank”;
- enterprise features that allow organizations to restrict models to approved knowledge bases, or to explicitly exclude certain domains (a rough sketch of what that could look like follows this list).
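To make that second point concrete, here is a minimal sketch in Python of how a configurable allowlist and blocklist could be applied to retrieved sources before they ever reach the model. The RetrievedSource class, the domain lists and the filtering rule are all hypothetical illustrations, not any vendor's actual API.

```python
# Hypothetical sketch, not a real vendor API: screen retrieved sources against
# an organization's allow/deny policy before they are handed to the model.
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class RetrievedSource:
    url: str
    snippet: str
    label: str  # e.g. "user-generated wiki", "state publisher", "partisan think tank"


# Example policy an enterprise buyer might configure (illustrative domains).
ALLOWED_DOMAINS = {"en.wikipedia.org", "europa.eu", "destatis.de"}
BLOCKED_DOMAINS = {"grokipedia.com"}


def filter_sources(sources: list[RetrievedSource]) -> list[RetrievedSource]:
    """Keep only sources whose domain is allowed and not explicitly blocked."""
    kept = []
    for source in sources:
        domain = urlparse(source.url).netloc.lower()
        if domain in BLOCKED_DOMAINS:
            continue
        if ALLOWED_DOMAINS and domain not in ALLOWED_DOMAINS:
            continue
        kept.append(source)
    return kept


if __name__ == "__main__":
    candidates = [
        RetrievedSource("https://en.wikipedia.org/wiki/Richard_J._Evans", "...", "user-generated wiki"),
        RetrievedSource("https://grokipedia.com/page/Richard_Evans", "...", "AI-generated wiki"),
    ]
    for kept in filter_sources(candidates):
        print(kept.url, "-", kept.label)
```

The interesting design choice is the default: a strict allowlist is safer but excludes the long tail of legitimate sources, whereas a blocklist only catches the domains you already know to distrust.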
Second, the Grokipedia episode will encourage copycats. If Musk can build an AI‑first encyclopedia and get it repeated by his rivals’ models, why wouldn’t political parties, media empires or activist groups attempt the same? Within a few years we could see a proliferation of ideological “mini‑Wikipedias” optimized for ingestion by LLMs, each lobbying to become part of the default training and retrieval mix.
Third, independent auditing of training and retrieval data will become a business. Just as we now have firms that audit algorithms for bias, we will see services that crawl model outputs at scale, infer which sources they depend on, and publish “nutrition labels” for AI systems. European regulators are likely to lean on such methods when enforcing the AI Act.
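As a toy illustration of what such an audit could look like, the following Python sketch tallies which domains a model cites across a batch of answers and prints a crude "nutrition label" of its apparent source mix. The sample answers and the idea of scraping citations from free text are assumptions for the example; a real audit would prompt the model systematically and at much larger scale.

```python
# Toy illustration (assumed inputs, not a real auditing product): count which
# domains a model cites across a batch of answers and print a rough breakdown.
import re
from collections import Counter
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def source_breakdown(answers: list[str]) -> Counter:
    """Count cited domains across a batch of model answers."""
    counts: Counter = Counter()
    for answer in answers:
        for url in URL_PATTERN.findall(answer):
            counts[urlparse(url).netloc.lower()] += 1
    return counts


if __name__ == "__main__":
    # In practice these would come from systematically prompting the model.
    sample_answers = [
        "According to https://en.wikipedia.org/wiki/HIV, ...",
        "One account (https://grokipedia.com/page/Richard_Evans) claims ...",
        "See https://en.wikipedia.org/wiki/Richard_J._Evans for background.",
    ]
    breakdown = source_breakdown(sample_answers)
    total = sum(breakdown.values())
    for domain, n in breakdown.most_common():
        print(f"{domain}: {n} citations ({n / total:.0%} of cited sources)")
```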
The unresolved questions are big ones: Who gets to define which encyclopedias are legitimate? How do we handle borderline cases where a source is excellent on most topics but ideologically warped on a few? And how do we prevent powerful actors from quietly shaping the information diet of billions through synthetic reference works that are almost impossible to scrutinize directly?
The bottom line
Grokipedia showing up in ChatGPT is not a one‑off glitch; it’s an early, visible symptom of a deeper shift. As AI companies chase data and “viewpoint diversity” without robust transparency, they are opening the door for billionaire‑branded, ideologically engineered knowledge bases to seep into what many people still perceive as neutral tools. The real question for users, regulators and developers is simple: would you accept a search engine whose top results quietly came from one wealthy individual’s private wiki – and if not, why are we tolerating it from our chatbots?



