Two of the most widely used AI assistants are drawing from Elon Musk’s Grokipedia, a controversial wiki linked to his AI startup xAI, according to a new investigation. The findings raise fresh questions about how chatbots choose sources and what happens when they amplify material from sites accused of misinformation and extremist citations.
A Musk-built wiki faces growing scrutiny and debate
Grokipedia was launched by xAI as an alternative to Wikipedia, with entries generated with the help of the company's Grok chatbot. Unlike Wikipedia, which has mature moderation norms and long-standing editorial policies, Grokipedia is relatively new and has been criticized by researchers and media analysts for copying large portions of Wikipedia while also hosting disputed entries on politically charged topics.
Reported examples include pages that mischaracterize the AIDS epidemic, language that appears to rationalize slavery, and references to white supremacist websites. Grok itself has faced separate safety controversies after producing offensive and extremist content on X, incidents that underscored how quickly generative systems can be nudged into harmful output without rigorous guardrails.
What the new report found about chatbot source use
The Guardian reported that OpenAI’s ChatGPT cited Grokipedia when responding to questions about Iran and other historical topics. In one example described by the outlet, ChatGPT echoed debunked claims about the British historian Sir Richard Evans, attributing material to Grokipedia among its sources. The report further noted that Anthropic’s Claude also surfaced Grokipedia citations in certain answers.
OpenAI told the newspaper that ChatGPT’s web-enabled answers draw on a broad range of publicly available sources and that the company applies safety filters to reduce the chance of high-severity harms. The company also emphasized that the assistant provides clear citations so users can evaluate provenance. Anthropic did not provide a detailed comment in the report, though the observation that Claude cited Grokipedia points to a wider, industry-level issue: retrieval systems are only as reliable as the sources they select.
Why it matters for AI reliability and user trust
Modern chatbots increasingly rely on retrieval-augmented generation, pulling live web snippets or database entries to ground their answers. If those pipelines include poorly vetted sources, misinformation can be laundered through the authoritative tone of an AI response and legitimized by a citation users may not know how to assess.
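To make that pipeline concrete, here is a minimal, hypothetical sketch of a retrieval-augmented answer flow in Python. The function names, URLs, and snippets are invented for illustration; no vendor's actual implementation is shown.

```python
# Minimal sketch of a retrieval-augmented generation (RAG) answer flow.
# Every name, URL, and snippet here is a hypothetical illustration.

from dataclasses import dataclass

@dataclass
class Snippet:
    url: str   # where the passage came from
    text: str  # the retrieved passage

def search_web(query: str) -> list[Snippet]:
    """Stand-in for a live web search; a real assistant would call a search API."""
    return [
        Snippet("https://en.wikipedia.org/wiki/Example_topic", "Well-sourced summary ..."),
        Snippet("https://grokipedia.example/Example_topic", "Disputed claim ..."),
    ]

def build_prompt(query: str, snippets: list[Snippet]) -> str:
    """Ground the model in retrieved text and keep URLs so the answer can cite them."""
    context = "\n".join(f"[{i + 1}] {s.url}\n{s.text}" for i, s in enumerate(snippets))
    return (
        "Answer the question using only the sources below, and cite them by number.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

def call_model(prompt: str) -> str:
    """Placeholder for the actual LLM call."""
    return f"(model answer grounded in the prompt below)\n{prompt}"

def answer(query: str) -> str:
    snippets = search_web(query)  # retrieval step: source quality is decided here
    return call_model(build_prompt(query, snippets))  # generation inherits that choice

if __name__ == "__main__":
    print(answer("Who is Sir Richard Evans?"))
```

The point of the sketch is the division of labor: if the retrieval step returns an unvetted page, the generation step simply cites it, and nothing downstream re-checks the claim.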
Security researchers warn that tactics like data poisoning, prompt injection, and so-called “LLM grooming” can tilt what large models retrieve and repeat. In practice, it can take only a handful of strategically seeded pages to skew answers on sensitive topics. By contrast, Wikipedia’s model—backed by a global volunteer community, transparent edit histories, and verifiability policies—tends to correct vandalism and bias more rapidly on high-traffic entries. Grokipedia does not yet demonstrate comparable oversight or community depth.
The source quality gap and transparency in citations
AI companies often describe their filters and safety layers but rarely disclose detailed source lists, scoring criteria, or thresholds for excluding sites with repeated policy violations. Without that transparency, users cannot easily tell whether a citation reflects editorial rigor or mere availability.
Experts in information integrity have called for provenance signals that travel with content: who wrote or last edited a page, what moderation occurred, and whether independent fact-checks exist. For high-risk topics—public health, elections, extremist violence—platforms can deploy stricter whitelists, dynamic trust scores, and human-in-the-loop reviews to prevent low-quality wikis from shaping answers.
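As a rough sketch of what such source gating could look like in code, the example below applies a stricter trust threshold to retrieved URLs when the topic is high-risk. The domains, scores, and thresholds are assumptions invented for illustration, not any platform's real criteria.

```python
# Hypothetical source-gating step for a retrieval pipeline.
# Domains, trust scores, and thresholds are invented for illustration.

from urllib.parse import urlparse

HIGH_RISK_TOPICS = {"public health", "elections", "extremist violence"}

# A real system might derive these scores from edit history, fact-check coverage,
# and independent audits; here they are hard-coded assumptions.
TRUST_SCORES = {
    "en.wikipedia.org": 0.9,
    "who.int": 0.95,
    "grokipedia.example": 0.2,
}

def allowed(url: str, topic: str) -> bool:
    domain = urlparse(url).netloc
    score = TRUST_SCORES.get(domain, 0.5)  # unknown domains get a middling default
    threshold = 0.8 if topic in HIGH_RISK_TOPICS else 0.4
    return score >= threshold

def gate_sources(urls: list[str], topic: str) -> tuple[list[str], list[str]]:
    """Split retrieved URLs into those passed to the model and those held for review."""
    kept = [u for u in urls if allowed(u, topic)]
    flagged = [u for u in urls if not allowed(u, topic)]
    return kept, flagged

if __name__ == "__main__":
    urls = ["https://who.int/page", "https://grokipedia.example/page"]
    print(gate_sources(urls, "public health"))
    # -> (['https://who.int/page'], ['https://grokipedia.example/page'])
```

In this framing, the flagged list is where human-in-the-loop review would attach: low-trust sources are not silently dropped but routed for inspection.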
What companies and users can do next to improve sourcing
In the short term, AI providers can label lesser-vetted sources more prominently, reduce their weight in retrieval, and escalate to higher-assurance references on sensitive queries. Periodic audits that publish the share of answers citing each source tier would help the public gauge progress. Independent red-team evaluations should explicitly test whether controversial sites can steer outputs.
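The sketch below illustrates, under invented tiers and weights, what down-weighting lesser-vetted sources and reporting citation shares by tier might look like; none of it reflects any provider's real pipeline.

```python
# Hypothetical retrieval re-weighting and source-tier audit, for illustration only.
from collections import Counter

# Assumed tiers and multipliers; a provider's real taxonomy would differ.
TIER_WEIGHTS = {"high_assurance": 1.0, "standard": 0.7, "lesser_vetted": 0.2}

SOURCE_TIERS = {
    "nature.com": "high_assurance",
    "en.wikipedia.org": "standard",
    "grokipedia.example": "lesser_vetted",
}

def rerank(results: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Multiply each result's relevance score by its source-tier weight, then re-sort."""
    weighted = [
        (domain, score * TIER_WEIGHTS[SOURCE_TIERS.get(domain, "lesser_vetted")])
        for domain, score in results
    ]
    return sorted(weighted, key=lambda r: r[1], reverse=True)

def audit_share(cited_domains: list[str]) -> dict[str, float]:
    """Report what fraction of citations fall into each tier, for periodic publication."""
    tiers = [SOURCE_TIERS.get(d, "lesser_vetted") for d in cited_domains]
    counts = Counter(tiers)
    return {tier: counts.get(tier, 0) / len(tiers) for tier in TIER_WEIGHTS}

if __name__ == "__main__":
    # A lesser-vetted page with a high relevance score drops below a standard source.
    print(rerank([("grokipedia.example", 0.91), ("en.wikipedia.org", 0.80)]))
    print(audit_share(["en.wikipedia.org", "grokipedia.example", "nature.com"]))
```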
For users, the best defense is to click citations, cross-check claims with established encyclopedic references, primary documents, or reputable news outlets, and be cautious when a response leans on Grokipedia for contentious subjects. Chatbots can streamline research, but they are not substitutes for editorial judgment—especially when their sources include a fledgling wiki already flagged for accuracy problems.