Date: 2026-02-24

Anthropic has publicly alleged that three China-based AI labs orchestrated industrial-scale “distillation” campaigns to siphon the capabilities of its Claude models. The company says the groups used roughly 24,000 fraudulent accounts to generate more than 16 million exchanges with Claude, activity Anthropic frames as a mounting security and competitiveness risk for the AI sector.

The claims land at a moment when large language model providers are spending heavily on training and infrastructure while increasingly relying on paid APIs for revenue. If rivals can cheaply replicate model behavior through sustained querying, Anthropic argues, the economics and safety posture of frontier AI begin to wobble.

What Anthropic Says Happened in the Distillation Campaigns

In a detailed account, Anthropic identified three firms—DeepSeek, Moonshot, and MiniMax—as running coordinated efforts to extract Claude’s behavior at scale. The company says the operators masked their origins behind networks of throwaway accounts and proxies, bypassed regional access controls, and violated terms of service designed to prevent automated scraping.

Anthropic characterizes the activity as accelerating in both volume and technical sophistication and warns that the window for an effective industry response is narrowing. The company is calling for collaboration among AI providers and government agencies to detect and disrupt similar campaigns before they become a baseline cost of doing business.

How Distillation Attacks Strip a Model’s Know-how

Knowledge distillation is a legitimate machine learning technique in which a powerful “teacher” model guides a smaller “student” model to mimic its outputs. Frontier labs use it routinely to create compact, cheaper variants for customers. The same idea, pointed at a competitor’s API, becomes a model extraction attack: adversaries issue large volumes of carefully varied prompts, observe responses, and train their own models to reproduce the target system’s behavior.
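To make the teacher and student idea concrete, here is a minimal sketch of two common distillation objectives, written in plain Python. It is an illustration under stated assumptions, not a description of any lab's pipeline: the function names, the temperature value, and the toy logits are all hypothetical. The first loss assumes access to the teacher's full output distribution, which is the usual in-house setting. The second assumes only the teacher's chosen output is visible, which is closer to what a text-only API exposes.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Assumes the teacher's full score vector is available (white-box setting).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_label_loss(teacher_choice, student_logits):
    """Cross-entropy against only the teacher's chosen output.

    Assumes the black-box setting: the student sees what the teacher
    answered, not how confident it was in the alternatives.
    """
    q = softmax(student_logits)
    return -math.log(q[teacher_choice])

# Hypothetical toy scores over three candidate outputs.
teacher = [2.0, 1.0, 0.1]
close_student = [1.8, 1.1, 0.2]
far_student = [0.1, 1.0, 2.0]

# The student that already resembles the teacher incurs the smaller loss.
print(distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student))
```

Training a student means adjusting its parameters to push these losses down across many prompts, which is why the volume and variety of harvested exchanges matter.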

When done at scale, as Anthropic alleges with more than 16 million exchanges, the result can approximate not just surface-level answers but also decision patterns and safety behaviors. Academic work dating back to 2016, including a study led by researchers at EPFL and Cornell, showed that black-box model extraction through prediction APIs is feasible. Later studies extended the approach to vision and NLP models, and frameworks like MITRE ATLAS now explicitly catalog such tactics for AI systems.

The draw is obvious: faithfully replicating a state-of-the-art model’s capabilities can slash compute requirements, data collection costs, and time-to-market. Even partial success can yield a fine-tuning dataset of high value, compressing years of R&D into months of targeted harvesting.

A Growing Pattern Across AI Labs and API Endpoints

Anthropic’s disclosure follows earlier allegations from OpenAI that DeepSeek engaged in similar distillation attempts. Together, the claims suggest a broader pattern rather than a one-off probe. With global AI spending running into the tens of billions annually for chips, data centers, and training runs, the incentive to shortcut capability development is stark.

API endpoints have become the soft underbelly of frontier AI: they expose model behavior to the open internet while monetizing access. Attackers can blend in with normal traffic, distribute queries across thousands of accounts, and tune prompt families to maximize learning yield. The anonymity and elasticity of cloud resources make such operations both scalable and hard to attribute without sophisticated detection.

The Legal Gray Zone and Industry Options

While Anthropic cites clear violations of its terms and geofencing rules, the legal picture for model extraction remains murky. U.S. trade secret law hinges on maintaining secrecy and demonstrating misappropriation, which can be difficult when access occurs through public APIs. Computer misuse statutes are likewise complex when attackers use valid credentials, even if fraudulently obtained. In practice, providers often default to account termination, civil actions, and referrals to regulators.

There is, however, a growing playbook for defense. Companies are tightening know-your-customer checks, enforcing stronger velocity and credit limits, and using device fingerprinting and graph analysis to spot coordinated account farms. Some labs experiment with output-level signals—such as semantic watermarks or canary phrases—to trace distillation datasets, though these methods face evasion risks and quality trade-offs.
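As a rough illustration of the graph analysis mentioned above, the sketch below links accounts that share any identifying signal and reports the resulting clusters. It is a simplified assumption of how such a check could work, not any provider's actual detection system: the function name, the signal values (device fingerprint hashes, payment identifiers), and the `min_size` threshold are all hypothetical.

```python
from collections import defaultdict

def find_account_clusters(signals, min_size=3):
    """Group accounts that are connected through shared signals.

    signals: iterable of (account_id, signal_value) pairs, where a signal
    might be a device fingerprint hash or a payment identifier (assumed).
    Returns a list of sorted account lists with at least min_size members.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_account_for_signal = {}
    for account, signal in signals:
        find(account)  # register the account even if it shares nothing
        if signal in first_account_for_signal:
            union(first_account_for_signal[signal], account)
        else:
            first_account_for_signal[signal] = account

    clusters = defaultdict(set)
    for account in list(parent):
        clusters[find(account)].add(account)
    return [sorted(members) for members in clusters.values() if len(members) >= min_size]

# Hypothetical observations: "a" and "b" share a device, "b" and "c" share a card.
observations = [
    ("a", "device-1"),
    ("b", "device-1"),
    ("b", "card-9"),
    ("c", "card-9"),
    ("d", "device-2"),
]
print(find_account_clusters(observations))
```

The point of linking through any shared signal is that no single account needs to look abnormal. Accounts "a" and "c" never share a signal directly, yet they land in the same cluster through "b", which is the kind of indirect connection a per-account velocity limit would miss.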

Governments are also circling the issue. U.S. agencies like the Department of Commerce, CISA, and NIST have pushed AI risk management and supply-chain guidance, while export controls on advanced chips aim to slow adversarial capability building. None of these measures directly solves model extraction via public APIs, which is why Anthropic is urging rapid, collective action and threat intelligence sharing akin to the information sharing and analysis centers (ISACs) used in cybersecurity.

The bottom line is uncomfortable for everyone: distillation is both a cornerstone of modern AI and an avenue for capability theft. As providers race to harden APIs and share indicators of abuse, the sector will be judged not just on model benchmarks but also on the resilience of the pipelines that protect the IP behind them.