
MIT Study Warns AI Agents Are Out of Control

By Gregory Zuckerman
Technology · 6 Min Read
Last updated: February 26, 2026

AI agents are racing into the enterprise with scant guardrails, according to a new MIT-led analysis that finds widespread gaps in safety testing, transparency, and basic shutdown controls. Reviewing 30 widely used “agentic” systems, the research team concludes today’s agents are fast, loose, and far less governable than their marketing suggests—just as businesses begin wiring them into email, browsers, and core workflows.

Inside the MIT-led survey of deployed agentic AI systems

The report, The 2025 AI Index: Documenting Sociotechnical Features of Deployed Agentic AI Systems, was authored by Leon Staufer of the University of Cambridge with collaborators from MIT, the University of Washington, Harvard University, Stanford University, the University of Pennsylvania, and The Hebrew University of Jerusalem. Rather than lab tests, the team systematically annotated public documentation, demos, governance papers, and product sites—supplemented by limited hands-on checks—to evaluate how real products describe their capabilities and controls.


The 30 systems span three categories: enhanced chatbots, AI-enabled browsers and extensions, and enterprise platforms. Despite the diversity, most are powered by a small set of closed frontier models—primarily GPT, Claude, and Gemini—raising systemic risk if common failure modes propagate across many agents.

Key Findings: Transparency and Control Gaps

Across eight disclosure categories, most vendors provide little or no detail on risks, evaluations, or monitoring. Basic observability is often missing. The authors flag that, for many enterprise agents, it’s unclear whether fine-grained execution traces even exist—making it difficult to reconstruct what an agent did, why it did it, or who is accountable.

Resource usage is another blind spot. Twelve of 30 systems either offer no usage monitoring or only notify customers when rate limits are hit, undermining the budgeting and capacity planning enterprises need. Identification is also weak: most agents do not reliably disclose their AI nature to end users or third parties, for example via watermarking or by honoring robots.txt—blurring the line between human and automated activity on the web.
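Honoring robots.txt is one of the simpler identification norms the study flags. As a minimal sketch of what compliance looks like (the agent token `ExampleAgent` is a made-up name, not from any product in the survey), an agent can consult a site's rules before fetching:

```python
from urllib import robotparser

# Hypothetical agent identifier; a real deployment would publish and
# document its own user-agent token (assumption for illustration).
AGENT_UA = "ExampleAgent/1.0"

def allowed_to_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt permits AGENT_UA to fetch url."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(AGENT_UA, url)

rules = """User-agent: ExampleAgent
Disallow: /private/
"""
print(allowed_to_fetch(rules, "https://example.com/public/page"))   # True
print(allowed_to_fetch(rules, "https://example.com/private/data"))  # False
```

The point is less the code than the behavior: an agent that declares a stable user-agent and respects Disallow rules is distinguishable from human traffic; one that does neither is not.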

Perhaps most troubling, several products lack documented ways to stop an autonomous run once it begins. The study cites offerings such as Alibaba’s MobileAgent, HubSpot’s Breeze, IBM’s watsonx, and n8n automations as having no clear per-agent stop mechanism in public docs; in some enterprise platforms the only option appears to be halting all agents at once. In high-stakes environments, the absence of a targeted “off switch” is a risk multiplier.

Real products, real consequences in deployed agents

Agentic tools are not theoretical. OpenClaw, an open-source framework that drew attention for enabling email-sending and other autonomous tasks, also revealed stark security trade-offs, including the potential to hijack a user’s machine if poorly configured. The ecosystem is moving quickly—OpenAI recently hired OpenClaw’s creator Peter Steinberg—yet operational safeguards are often lagging.

The report contrasts product approaches. OpenAI’s ChatGPT Agent, for instance, cryptographically signs browser requests for traceability, a step toward accountable automation. By comparison, the researchers say Perplexity’s Comet AI browser lacks documented agent-specific safety evaluations, third-party testing, or robust sandboxing in public materials. Perplexity has pushed back, saying reported issues were responsibly disclosed and patched, and that a separate dispute with Amazon over bot identification is a contractual matter rather than a safety failure.
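OpenAI's exact signing scheme is not public, so as an assumption-laden sketch of the general idea, a shared-secret HMAC over the request's method, URL, and body lets a downstream party verify that an action originated from a specific agent:

```python
import hmac
import hashlib

# Hypothetical per-agent key; the real scheme and key management used by
# ChatGPT Agent are not documented publicly (assumption for illustration).
SECRET = b"per-agent-signing-key"

def sign_request(method: str, url: str, body: bytes = b"") -> str:
    """Produce a hex HMAC-SHA256 signature over the request contents."""
    msg = method.encode() + b"\n" + url.encode() + b"\n" + body
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def verify_request(method: str, url: str, body: bytes, signature: str) -> bool:
    """Constant-time check that a signature matches the request."""
    return hmac.compare_digest(sign_request(method, url, body), signature)

sig = sign_request("GET", "https://example.com/page")
print(verify_request("GET", "https://example.com/page", b"", sig))  # True
```

Whatever the underlying cryptography, the accountability payoff is the same: signed requests give site operators and auditors a way to attribute an action to an agent after the fact.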


Enterprise buyers face mixed signals. HubSpot’s Breeze agents advertise compliance certifications such as SOC 2, GDPR, and HIPAA, yet the study notes limited public detail about security testing methodologies. IBM, for its part, contests the survey’s characterization, asserting it provides extensive documentation on observability, deterministic controls, and evaluation frameworks, and that it is engaging with the researchers to address perceived inaccuracies.

How the industry responded to the survey findings

The research team contacted companies over a four-week period; roughly 25% responded and only 10% offered substantive comments that were incorporated, according to the paper. The authors predict governance challenges will intensify as agents gain capability, pointing to fragmented ecosystems, tensions around web conduct, and the absence of agent-specific benchmarks as unresolved roadblocks.

What it means for enterprises deploying agentic AI

Agentic AI can already triage customer tickets, process purchase orders, and orchestrate multi-step workflows—precisely the automations that drive ROI. But the study’s core message is clear: without disclosure, monitoring, and reliable stop controls, organizations can’t manage risk at scale. Recent guidance from firms like Gartner urging caution with AI-enabled browsers underscores that sentiment.

Practical steps are available now:

  • Insist on fine-grained logs and signed requests for actions taken on your behalf
  • Require sandboxing and least-privilege access for tools that can read email, browse, or write to internal systems
  • Verify vendor-run red teaming and independent evaluations
  • Demand per-agent kill switches
  • Ensure agents visibly identify themselves online

These aren’t nice-to-haves—they’re table stakes for accountable automation.
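The first item, fine-grained logs, can be made concrete. Here is a minimal sketch of a per-action trace record; the field names are assumptions for illustration, not a vendor standard or anything prescribed by the study:

```python
import json
import time
import uuid

def log_action(agent_id: str, tool: str, args: dict) -> dict:
    """Emit one fine-grained, per-agent trace record (illustrative schema)."""
    record = {
        "trace_id": str(uuid.uuid4()),  # unique id for reconstructing runs
        "agent_id": agent_id,           # which agent acted
        "tool": tool,                    # what it did
        "args": args,                    # with what inputs
        "ts": time.time(),               # and when
    }
    print(json.dumps(record))  # in production: append to a durable audit log
    return record

rec = log_action("invoice-bot", "send_email", {"to": "ap@example.com"})
```

Records like this are what make the study's accountability questions answerable: what an agent did, why, and on whose behalf.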

The takeaway from the MIT-led survey is not that agents are doomed, but that governance must catch up to capability. Vendors building on frontier models, and buyers deploying them into real workflows, will need to close the documentation and control gaps quickly—or expect regulators to do it for them.

By Gregory Zuckerman
Gregory Zuckerman is a veteran investigative journalist and financial writer with decades of experience covering global markets, investment strategies, and the business personalities shaping them. His writing blends deep reporting with narrative storytelling to uncover the hidden forces behind financial trends and innovations. Over the years, Gregory’s work has earned industry recognition for bringing clarity to complex financial topics, and he continues to focus on long-form journalism that explores hedge funds, private equity, and high-stakes investing.
FindArticles © 2025. All Rights Reserved.