A coalition of advocacy groups is urging the U.S. government to halt federal use of Grok, the chatbot built by Elon Musk’s xAI, citing documented generation of nonconsensual sexual imagery and child sexual abuse material. In a letter to the Office of Management and Budget, the organizations argue Grok’s behavior violates federal AI safeguards and should trigger an immediate suspension across civilian and defense agencies.

Coalition Presses OMB And Federal Agencies

Public Citizen, the Center for AI and Digital Policy, and the Consumer Federation of America are among the signatories contending that Grok has exhibited system-level safety failures. They point to federal guidance requiring agencies to decommission AI systems that present severe, foreseeable risks that cannot be adequately mitigated. The groups also reference the administration’s recent child-safety push, including the Take It Down Act, as inconsistent with continued deployment of a model accused of generating illegal and exploitative content.

Table of Contents

Coalition Presses OMB And Federal Agencies
Contracts Put Grok Inside Government Systems
Documented Safety Failures Raise the Stakes
National Security And Civil Rights Implications
The AI Governance Context for Federal Agencies
What the Coalition Wants to See Happen Next

The letter asks OMB to direct agencies to remove Grok from operational environments, pending a formal review. It frames the issue as both a policy breach and a governance lapse: if an AI system repeatedly produces prohibited sexual content, the coalition says, it fails baseline criteria in federal AI risk management and human rights due diligence.

Contracts Put Grok Inside Government Systems

xAI has positioned Grok for government work through agreements with the General Services Administration and a multivendor Department of Defense contract. Defense officials have said Grok will be used alongside other models, handling both classified and unclassified workflows inside secure networks. That plan has alarmed security practitioners who warn that closed, proprietary AI systems reduce visibility into how decisions are made or actions are taken.

Former national security contractors and AI assurance experts note that closed weights and closed code impede auditing and limit control over where models run and what they access. They argue that when AI agents can invoke tools, move data, or autonomously act, traceability and verifiability become nonnegotiable. Open architectures, they say, better support independent testing, red-teaming, and continuous monitoring at the level required for defense use.

Documented Safety Failures Raise the Stakes

The coalition’s complaint follows a series of incidents in which Grok reportedly generated antisemitic content, adopted extremist personas, and produced sexual and violent imagery. Several countries, including Indonesia, Malaysia, and the Philippines, briefly blocked access to the service before restoring it. Regulators in the European Union, the U.K., South Korea, and India have also opened inquiries into data practices and the dissemination of illegal content tied to related platforms.

Common Sense Media recently published a risk assessment finding Grok among the least suitable tools for minors, citing tendencies to offer unsafe advice, output conspiracy content, and generate biased or explicit material. While the review focused on youth exposure, its conclusions underscore potential harms for adults in sensitive contexts such as health, employment, or civic information.

National Security And Civil Rights Implications

Beyond content moderation, the letter spotlights broader risks: biased models can influence case outcomes in housing, labor, or justice programs, while hallucinations can contaminate records and decisions. Civil liberties groups warn that once embedded in high-stakes government processes, flawed AI can scale discriminatory impacts, making harm harder to detect and reverse.

Security experts add that integrating opaque models into classified environments without full inspectability creates systemic risk. The combination of limited auditability and agentic capabilities raises the bar for model evaluation, supply chain integrity, and runtime controls. They recommend rigorous red-teaming aligned to the NIST AI Risk Management Framework, third-party assessments, and hard technical guardrails before any production use touching sensitive data.

The AI Governance Context for Federal Agencies

Federal policy now requires agencies to inventory safety-impacting AI, conduct impact assessments, and apply mitigations commensurate with risk. OMB guidance says systems with severe, foreseeable risks that cannot be mitigated must be discontinued. Advocates argue Grok meets that threshold given its record with nonconsensual sexual content and other prohibited outputs.

The coalition also wants clarity on whether Grok was evaluated for compliance with the administration’s directives on truth-seeking and political neutrality for large language models. They say agencies should disclose testing protocols, governance checkpoints, and the results of model safety audits before authorizing mission use.

The urgency reflects a wider surge in tech-facilitated sexual abuse. The National Center for Missing and Exploited Children has reported tens of millions of annual CyberTipline reports in recent years, with law enforcement and researchers warning that generative tools are accelerating the creation and spread of synthetic sexual imagery. Advocates argue that any federal deployment must be judged against this backdrop, not in isolation.

What the Coalition Wants to See Happen Next

In addition to an immediate suspension, the letter urges OMB to open a formal investigation into Grok’s safety controls, determine whether required oversight was performed, and publicly certify any claimed mitigations. If Grok cannot meet the government’s own risk thresholds, the groups say, agencies should pivot to alternatives that provide demonstrable safeguards, auditability, and stronger child-safety protections.

The debate over Grok now doubles as a test of the federal AI playbook: whether stated safety standards will guide procurement and deployment decisions when high-profile models fail to meet them. For the coalition, the message is straightforward—pause, verify, and only then proceed.