Most of the world’s top artificial intelligence systems are failing to meet basic goals around safety, according to a new report from the Future of Life Institute. In the most recent round of the AI Safety Index, only three frontier models — those developed by Anthropic (Claude), OpenAI (ChatGPT) and Google (Gemini) — managed passing grades, ending up in the C range.
Only three AI models receive passing grades in safety index
The index assessed eight providers — Anthropic, OpenAI, Google, Meta (formerly Facebook), xAI, DeepSeek and two more Chinese firms, Alibaba and Z.ai — against 35 safeguards spanning policy, product and governance. Anthropic and OpenAI each scored a C+, and Google followed with a straight C, slightly above the group average. The other five vendors landed in the D range, with Alibaba’s Qwen at the bottom on a D-.
Those who produced the index saw a clear dividing line: a top tier of three companies and a trailing pack of five. But the takeaway wasn’t a victory lap for the leaders so much as a warning that “better than the rest” does not yet mean good. C-range results signal partial compliance and patchy execution, not a safety success story.
Inside the scorecard: how the AI Safety Index evaluates
A panel of eight AI safety experts graded company survey responses and public documentation on the strength and maturity of controls such as content watermarking, red-teaming, model cards and system cards, incident and vulnerability reporting, whistleblower protections, and compute governance. The emphasis is on concrete, measurable action, not marketing copy.
The breadth of the 35 indicators is both a strength and a weakness: they draw attention not only to human rights but also to labor, environmental degradation, corruption and coups. The exercise also exposes a long-standing problem: inconsistent transparency. Much of the industry’s current safety posture is self-reported, and without standardized disclosures and independent audits, regulators and the public still have only partial visibility.
Where current AI models fall short on safety safeguards
One of the most glaring weak spots is in “existential safety” — the policies and technical guardrails necessary for managing very capable autonomous systems. Three out of the top four ranked models earned Ds here; everyone else received an F. Although artificial general intelligence is still theoretical, the index contends that companies can’t afford to wait when it comes to detailing tripwires and escalation procedures, or preparing shutdown controls for systems that hit new frontiers.
For present-day risks, most companies lean on benchmarks such as Stanford’s HELM and similar suites that go beyond screening for violent, sexual or deceptive content and check for misuse across domains. Those are important, but not sufficient. The report also acknowledges that it does not dwell on the measurement gap around psychological harm, youth safety, or long-horizon dynamics such as slow model drift.
In other words, companies are getting far better at blocking crude prompts and labeling AI output, but there is still no reliable test for the subtler failure modes that matter in extended, real-world use.
Real-world alarms intensify amid psychological harm concerns
Fears of psychological damage are no longer theoretical. The parents of a 16-year-old filed a high-profile lawsuit alleging that repeated interactions with a chatbot fostered their teenager’s self-destructive thoughts. OpenAI has denied responsibility and says it is reviewing the claims, but the case has intensified scrutiny of how models handle crisis language, suicidal ideation and vulnerable users.

The index recommends that OpenAI strengthen safeguards against “AI psychosis” and suicidal ideation, and that Google bolster protections against psychological harm. It also flags the youth risks of role-play chatbots, citing Character.AI’s decision to pull the plug on its teen chat features under legal pressure.
Regulatory momentum grows and the case for an AI “FDA”
AI safety experts say industry self-regulation is unlikely to keep pace with capability gains. They are calling for a regulatory approach closer to the pharmaceutical model: independent pre-release testing, post-market surveillance and clear recall powers for dangerous products. In their view, deploying powerful conversational agents without psychological impact studies is an ethical “loophole” that would never fly in medicine.
Governments have started to act through frameworks such as NIST’s AI Risk Management Framework, interagency safety initiatives in the United States, the EU’s risk-tiered AI Act and international dialogues that began with the UK AI Safety Summit. The index adds to the case for enforceable standards: incident-reporting requirements, audits of frontier models and penalties when companies ship unsafe systems.
What AI companies should do now to improve safety controls
The document lays out tangible measures:
- Grow independent red-teaming that challenges autonomous behavior, deception and capability leaps.
- Release rigorous system cards detailing known hazards and mitigations.
- Establish whistleblower-protecting channels.
- Implement real watermarking of multimodal outputs.
- Enforce strong age gating.
- Build crisis-handling protocols that guide users to human help (a minimal sketch follows this list).
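To make the crisis-handling item concrete, here is a minimal sketch of the kind of gate such a protocol might place in front of a chat model. It is an assumption-laden illustration, not the report’s specification or any vendor’s implementation: the pattern list, the helpline message and the generate_reply callable are placeholders, and a real deployment would use trained classifiers and human escalation rather than keyword matching.

```python
import re

# Hypothetical crisis-handling gate for a chat endpoint; every name and
# pattern here is an illustrative placeholder, not a vendor's safeguard.
CRISIS_PATTERNS = [
    r"\bkill myself\b",
    r"\bend my life\b",
    r"\bsuicid(e|al)\b",
    r"\bself[- ]harm\b",
]

HELPLINE_MESSAGE = (
    "It sounds like you are going through something serious. "
    "Please consider contacting a local crisis line or emergency services; "
    "a human can help right now."
)

def detect_crisis(text: str) -> bool:
    """Crude keyword screen; production systems would use a trained classifier."""
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in CRISIS_PATTERNS)

def handle_message(user_text: str, generate_reply) -> str:
    """Route crisis language to human help before any model-generated reply."""
    if detect_crisis(user_text):
        return HELPLINE_MESSAGE
    return generate_reply(user_text)
```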
From an engineering standpoint, experts emphasize containment:
- Rate-of-action limits determined by risk.
- Tool-use sandboxes to restrict activity.
- Capability assessments before enabling advanced features.
- “Circuit-breakers” that disable behaviors when they cross safety thresholds (a combined sketch follows this list).
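To show how these containment layers could fit together, here is a minimal sketch of a wrapper around an agent’s tool calls. The ToolGovernor class, its thresholds and the web_search example are invented for illustration rather than drawn from the index or any agent framework, and the capability assessments mentioned above would happen before a tool ever reaches the allowlist.

```python
import time

class CircuitBreakerTripped(RuntimeError):
    """Raised once the agent crosses its safety threshold; tool use stops."""

class ToolGovernor:
    """Toy containment wrapper: rate-of-action limit, tool allowlist, circuit breaker."""

    def __init__(self, allowed_tools, max_actions_per_minute=10, max_violations=3):
        self.allowed_tools = set(allowed_tools)                # sandbox allowlist
        self.max_actions_per_minute = max_actions_per_minute   # rate-of-action limit
        self.max_violations = max_violations                   # circuit-breaker threshold
        self.violations = 0
        self.recent_calls = []

    def call(self, tool_name, tool_fn, *args, **kwargs):
        now = time.monotonic()
        # Keep only calls from the last 60 seconds for the rate check.
        self.recent_calls = [t for t in self.recent_calls if now - t < 60]

        blocked = (
            tool_name not in self.allowed_tools
            or len(self.recent_calls) >= self.max_actions_per_minute
        )
        if blocked:
            self.violations += 1
            if self.violations >= self.max_violations:
                # Circuit breaker: disable further tool use once the threshold is crossed.
                raise CircuitBreakerTripped("safety thresholds exceeded; halting tool use")
            return None  # Refuse this call but keep the session alive.

        self.recent_calls.append(now)
        return tool_fn(*args, **kwargs)

# Example: only a read-only search tool is allowlisted; anything else counts
# as a violation and, repeated, trips the breaker.
governor = ToolGovernor(allowed_tools={"web_search"})
result = governor.call("web_search", lambda query: f"results for {query}", "AI Safety Index")
```

None of the numbers here are recommendations; the point is the layering, with each check failing closed and the breaker acting as the last resort.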
None of these steps is a silver bullet. Together, they provide defense in depth — a layered approach that increases the likelihood of preventing catastrophic failure.
The bottom line fits in a headline: three frontier models are doing slightly better than the rest, but the bar is not high. Until transparent audits and enforceable guardrails arrive, AI safety will remain a matter of trust rather than proof, and the grades will reflect that.