AI-driven browsers tout hands-free web browsing, but new research shows they can be duped by something as simple as an image. Brave Software has shown that stealthy text hidden inside image files can surreptitiously communicate with AI browsers, directing them to leak data or visit attacker-controlled sites without the user knowing what’s happening.
How hidden image text can secretly hijack AI browsers
The trick leverages the way that AI browsing systems read images. Some depend on optical character recognition and multimodal models for “seeing” and summarizing images. By overlaying weak text with high contrast — light blue on yellow in one demo — researchers showed an agent will read words your eyes can barely see. These words are anything but harmless: They serve as triggers signaling to the agent to open a fresh page, retrieve an email — or exfiltrate a scrap of data.
- How hidden image text can secretly hijack AI browsers
- Agentic browsing raises stakes for account and data theft
- Researchers and companies weigh risks and required safeguards
- Why adversarial images are ideal bait for autonomous agents
- Practical steps to lower your AI browsing risks today
- Bottom line on defending AI browsers from image attacks
In Brave’s trials, Perplexity’s Comet browser was instructed to follow orders kept in a screenshot, and Fellou’s agent read secret instructions (never made public) planted somewhere in a web page.
The agent is working exactly as it was designed to work, so there’s no bug here in that sense; but where the system goes haywire is a mystery hidden deep inside the extremely complex and opaque “user.” The “user” in this case is both the training data (the data files from which neural networks draw their training examples) and a user-provided objective function — in other words, what they want the network to accomplish. This is just your classic indirect prompt injection, a threat that has been warned about for years by application security specialists.
Agentic browsing raises stakes for account and data theft
AI browsers hold powerful privileges. When they browse on your behalf, they receive your cookies, single sign-on tokens and session state. That ambient authority makes a misled agent a high-value target: If the helper can read webmail, open private dashboards or switch tabs, any command issued to it can also be executed successfully by deception.
Security bodies have been sounding the alarm on exactly this kind of failure. The OWASP Top 10 for LLM Applications has code injection as its number one threat. NIST’s AI Risk Management Framework recommends considering all model inputs from the outside world as untrusted. MITRE’s ATLAS knowledge base outlines how threat actors chain the weaknesses of models and combine them with common web threats to pivot into account takeovers or exfiltrate data. AI browsers lie at the intersection of those risks.
Researchers and companies weigh risks and required safeguards
Brave’s researchers say you can often watch the agent “think” and stop its action, but there may not be much time to do so. They call for explicit consent for sensitive operations — opening sites, reading inboxes or copying content — so the agent cannot quietly follow invisible instructions. Some vendors, including OpenAI and Microsoft in their own assistants, currently need confirmation from the user for dangerous tasks, but implementations of the system vary greatly.
Perplexity questions alarmist analyses and highlights continuing security progress while recognizing the broader vulnerability to adversarial attack. This debate is not in a vacuum; previous demonstrations from both WithSecure and NCC Group pointed out that hidden instructions in web content could direct LLM tools to leak data or take other inappropriate actions. The new wrinkle is that even a single image may be the delivery vehicle.
Why adversarial images are ideal bait for autonomous agents
Images bypass many traditional filters. Content security policies are all about scripts, not pixels. Spam and phishing filters scan for dangerous links, not nearly invisible text. Meanwhile, multimodal models are made to wring every bit of information out of pictures that they can and push that text into the agent’s reasoning loop with gusto. The result is a covert channel from attacker to assistant embedded in a legitimate feature.
This is not steganography in the cryptographic or privacy sense, but it surely is a dirty way to exploit the difference between human and machine perception. What you can’t read, an OCR system can. Worse, CSS parlor tricks, low-contrast fonts and background patterns can hide instructions in the middle of the screen in such a way that no one looks askance at them.
Practical steps to lower your AI browsing risks today
Practical steps for users to reduce risk
- Power off automatic behaviors if you can. Make your AI browser prompt you before it opens new sites, views emails, or copies content between tabs. Consider any “analyze this image” or “summarize this page” request as potentially untrusted, especially from unknown domains.
- Separate your contexts. Use a dedicated browser profile for AI — or an incognito window. Don’t give the agent access to corporate accounts, banking or admin portals. Convenience is no match for least privilege when the assistant ventures out on the open web.
- Watch for telltales. If the agent starts taking steps that you didn’t ask for — like visiting unrelated sites or quoting sensitive content — halt and think. The plan is there in many of the tools; read it. Keep clipboard or file access disabled if not in use, and audit permissions after updates.
Guidance for developers and vendors
- Strictly enforce allow-listing for network egress.
- Deny access to email or document stores by default.
- Mark all web-derived content as untrusted.
- Strip or quarantine text lifted from images.
- Add guardrails that refuse to act solely on the words of a page or image.
- Use defense-in-depth (e.g., sandboxing, origin isolation, confirmation dialogs) so that only the insecure site is affected.
Bottom line on defending AI browsers from image attacks
AI browsers generate value directly from their autonomy. It’s that same freedom that makes a malicious image so hazardous. Not until agentic tools treat every pixel and paragraph as potentially hostile will users regard “analyze this” to be synonymous with “obey this.” The only safe approach is to draw clear lines in the sand, require explicit consent and trust nothing that the agent finds out on the open web — especially pictures.