Google says it disrupted a large-scale attempt to clone its Gemini AI, revealing that one coordinated campaign hammered the model with more than 100,000 prompts to reverse-engineer its behavior. The company detected the activity in real time, blocked associated accounts, and tightened safeguards designed to keep proprietary reasoning methods out of reach.
What Google Detected in the Coordinated Cloning Campaign
The campaign, described in a new report from Google’s Threat Intelligence team, aimed squarely at Gemini’s ability to reason through complex tasks. Attackers repeatedly prodded the model to surface internal decision patterns that are normally hidden from users, attempting to capture enough signal from its outputs to train a lookalike system. According to Google, protective systems flagged the anomalous querying behavior, suppressed outputs that exposed internal reasoning, and triggered takedowns of the associated accounts.

Notably, the operators did not need to breach Google’s infrastructure. They relied on legitimate API access and scale. That volume—100,000-plus prompts—suggests a methodical sweep across tasks, languages, and formats to map how Gemini generalizes, a hallmark of model-extraction campaigns.
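The arithmetic behind such a sweep is easy to sketch. The task, language, and format axes below are illustrative assumptions, not details from Google’s report, but they show how a modest grid multiplied by per-cell prompt variants reaches the reported six-figure volume:

```python
from itertools import product

# Hypothetical axes an extraction sweep might enumerate; the specific
# values are illustrative, not taken from Google's report.
tasks = ["math", "coding", "summarization", "translation", "logic"]
languages = ["en", "es", "zh", "ar", "ru", "fr", "de", "ja", "hi", "pt"]
formats = ["plain", "json", "markdown", "chain-of-thought"]
variants_per_cell = 500  # paraphrases / difficulty levels per combination

grid = list(product(tasks, languages, formats))
total_prompts = len(grid) * variants_per_cell

print(len(grid))       # 5 * 10 * 4 = 200 distinct task/language/format cells
print(total_prompts)   # 100,000 prompts -- the scale Google reported
```

A sweep structured this way probes how the model generalizes cell by cell, which is exactly the anomalous pattern a query-monitoring system can flag.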
Inside a Model Extraction Playbook for Stealing LLMs
This tactic, often called distillation or model stealing, is well documented in security research. A classic study by Cornell and EPFL researchers demonstrated years ago that prediction APIs can be mined to reproduce a model’s decision boundaries. The modern twist with generative AI is that rich, free-form outputs can be harvested as high-quality training data for a rival system, compressing R&D timelines and costs without touching the original model weights.
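In outline, the harvesting half of a distillation pipeline is trivially simple, which is part of what makes it hard to police. The sketch below uses a stub `query_teacher` function standing in for any commercial LLM API; all names and the file path are hypothetical:

```python
import json

def query_teacher(prompt: str) -> str:
    """Stub standing in for a call to a commercial LLM API."""
    return f"detailed answer to: {prompt}"

def harvest(prompts, path="distill_corpus.jsonl"):
    """Collect (prompt, output) pairs as supervised fine-tuning data
    for a smaller 'student' model -- the core of distillation-style theft."""
    with open(path, "w", encoding="utf-8") as f:
        for p in prompts:
            record = {"prompt": p, "completion": query_teacher(p)}
            f.write(json.dumps(record) + "\n")
    return path

corpus = harvest(["Solve 2x + 3 = 11", "Write a quicksort in Python"])
print(corpus)  # prints "distill_corpus.jsonl"
```

The student model is then fine-tuned on the collected corpus with any standard supervised training loop; no access to the original weights is ever required.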
At today’s typical LLM pricing, issuing 100,000 prompts can cost in the low thousands of dollars—well within reach for startups or well-resourced actors—yet the potential upside is enormous if the stolen behaviors approximate a frontier model. That risk has pushed providers to deploy rate limiting, query anomaly detection, and stricter terms of service that explicitly ban output harvesting for model replication.
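The cost claim is easy to sanity-check with back-of-envelope arithmetic. The per-million-token prices and token counts below are assumptions in the ballpark of current API pricing, not quoted figures:

```python
# Back-of-envelope cost of a 100,000-prompt extraction run.
# Prices and token counts are illustrative assumptions, not quotes.
prompts = 100_000
avg_input_tokens = 500     # assumed prompt length
avg_output_tokens = 1_000  # assumed response length
price_in_per_m = 2.50      # assumed input price, USD per 1M tokens
price_out_per_m = 10.00    # assumed output price, USD per 1M tokens

cost = (prompts * avg_input_tokens / 1e6) * price_in_per_m \
     + (prompts * avg_output_tokens / 1e6) * price_out_per_m
print(f"${cost:,.2f}")  # $1,125.00 at these assumptions
```

Even doubling every assumption keeps the bill in the low thousands of dollars, which is trivial next to the cost of training a frontier model from scratch.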
Why Step-by-Step Reasoning Is a Prime Target for Attackers
Attackers focused on coaxing out step-by-step reasoning, sometimes known as chain-of-thought. Many vendors intentionally avoid exposing these intermediate steps because they can leak intellectual property about how the model organizes knowledge and solves problems. Capturing enough of that signal—especially across math, coding, and multilingual tasks—can help a copycat system emulate the “how,” not just the “what,” of Gemini’s answers.
Google says it hardened defenses to curb inadvertent disclosure of those internal steps while preserving answer quality. That balancing act is central to today’s AI security: preserve utility and transparency for users, but frustrate systematic attempts to siphon off the engine’s most valuable behaviors.
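One simple output-side control of the kind described above is to redact delimited reasoning traces before a response reaches the caller. This is a minimal sketch under assumptions: the `<scratchpad>` marker is hypothetical, and production systems use their own internal formats and far more robust filtering:

```python
import re

# Hypothetical guardrail: strip delimited reasoning traces from a raw
# model response before returning it. The <scratchpad> marker is an
# assumption for illustration only.
SCRATCHPAD = re.compile(r"<scratchpad>.*?</scratchpad>", re.DOTALL)

def redact_reasoning(raw: str) -> str:
    """Return the user-visible answer with internal reasoning removed."""
    return SCRATCHPAD.sub("", raw).strip()

raw = "<scratchpad>try x=4: 2*4+3=11, works</scratchpad>x = 4"
print(redact_reasoning(raw))  # prints "x = 4"
```

The design tension is visible even in this toy: strip too much and answers lose useful explanation; strip too little and intermediate steps leak signal that harvesters can train on.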

Who Is Behind the Probing of Gemini and Why It Matters
Google did not attribute the campaign to a specific entity, but it characterized most recent extraction attempts as originating from private companies and researchers seeking an edge. John Hultquist, chief analyst at Google’s Threat Intelligence unit, has warned in media interviews that as more organizations build custom AI tuned on sensitive data, copycatting pressures will intensify across the sector.
That aligns with broader industry dynamics. High-performing models confer outsized competitive advantage, and the line between “benchmarking a rival” and “training on a rival’s outputs” can blur quickly. Security teams now treat output harvesting as a first-class threat, not just a policy violation.
More Misuse Beyond Cloning: Phishing, Malware, and Abuse
Google’s report also flags experiments by threat actors using generative AI to refine phishing lures and to write malware components on demand by calling model APIs. In each case, the company says it disabled implicated accounts and updated safeguards. Those moves mirror broader guidance from organizations like NIST and MITRE’s ATLAS knowledge base, which catalog AI-enabled tradecraft and recommend layered detection and response.
The Defensive Playbook Emerging to Counter Model Theft
Providers are converging on a stack of controls to blunt model theft:
- Strict rate caps and burst detection
- Behavioral fingerprints for automated scraping
- “Canary” prompts to catch data siphoning
- Randomized output strategies that make distillation less consistent
- Automated audits to spot unusual prompt patterns across tenants
Many are also adding provenance controls and legal levers that prohibit training on generated outputs.
The takeaway is clear: large language models do not have to be hacked to be stolen. Output-level defenses and real-time telemetry are now as critical as traditional perimeter security. Google’s disruption of a 100,000-prompt campaign illustrates both the scale of the threat and the maturing toolkit needed to keep flagship AI systems proprietary.
