AI labs hungry for domain-grade training data are quietly turning to Mercor, a fast-growing marketplace that pays former bankers, consultants, and attorneys to re-create the workflows their employers won’t share. The pitch is simple: when enterprises balk at handing over sensitive records, hire the people who know the process cold and have them generate structured examples, reports, and edge cases to teach models how real work gets done. Mercor’s clients include prominent labs like OpenAI, Anthropic, and Meta, according to industry briefings. The company says it pays experts up to $200 an hour, disburses more than $1.5 million to contractors daily, has tens of thousands of contributors, and has grown to roughly $500 million in annualized recurring revenue with a recent valuation around $10 billion. The economics underscore a new reality: top-tier data is scarce, and labs will pay a premium to get it.
Why AI labs need Mercor’s data workarounds for training
Enterprises in finance, law, and healthcare are wary of letting external models train on client files, internal chat logs, or procedural manuals. Legal exposure, data governance, and competitive dynamics all get in the way. A bank may see little upside in helping a general-purpose model learn how to replace parts of its own value chain.

- High-quality, rights-cleared data: Stanford HAI and other research groups have called access to high-quality, rights-cleared data the primary bottleneck for the next generation of AI. Meanwhile, management consultancies such as McKinsey estimate that a large share of knowledge work can be automated, so the contest over who controls the training corpus matters more than ever. And when 70% of a workflow’s value lives in its tacit steps and rare edge cases, realistic examples authored by veterans can improve model capability more than raw scale ever will.
Inside Mercor’s expert marketplace for AI training data
Mercor turns professional know-how into machine-readable training packs. A former banker might complete mock “deal room” exercises, annotate term-sheet clauses with redline rationales, and write scenario-driven memos that mirror real diligence. A former attorney might draft motions that engage precedent, spelling out the reasoning path, then grade and reject model outputs against a firm-style checklist. Beyond classic labeling, the tasks vary widely: contributors write detailed reports, enumerate contract constraints, and devise failure cases designed to trip up agents working with tools.
This creates a feedback loop: labs ship a model, Mercor’s experts break it with credible real-world tasks, and the failures yield targeted data that fill the capability gap. That expert-in-the-loop design has spread to rivals like Scale AI and Surge as well, introducing “environments” in which agents learn to perform authentic tasks.
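To make “machine-readable training packs” concrete, here is a minimal sketch of what one expert-authored record and its grading step could look like. This is purely illustrative: Mercor has not published its formats, so every field name, the `ExpertRecord` type, and the keyword-matching shortcut below are assumptions.

```python
# Hypothetical illustration only; none of these names come from Mercor.
from dataclasses import dataclass


@dataclass
class ExpertRecord:
    scenario: str               # e.g., a mock "deal room" diligence prompt
    expert_response: str        # the memo or redline a veteran would write
    reasoning_steps: list[str]  # the tacit logic labs want models to absorb
    checklist: list[str]        # firm-style criteria a good answer must hit


def grade_model_output(record: ExpertRecord, model_output: str) -> float:
    """Return the fraction of checklist items the model's attempt covers.

    Naive keyword matching stands in for what would realistically be a
    human grader or a rubric-following model.
    """
    if not record.checklist:
        return 0.0
    hits = sum(item.lower() in model_output.lower() for item in record.checklist)
    return hits / len(record.checklist)
```

In the loop described above, records a model scores poorly on would be routed back to experts, who author corrected examples that target the gap.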
The legal and ethical tightrope for expert-made data
Mercor tells prospective contributors not to upload employer documents and to work only from personal knowledge and publicly available information. Still, the line between “know-how” and “trade secrets” is often blurry. The federal Defend Trade Secrets Act and the Economic Espionage Act protect nonpublic material that derives its value from being kept secret; NDAs and confidentiality provisions add further restrictions. Enforcement at scale is difficult in practice, particularly when contributors moonlight alongside full-time jobs. Compliance programs can scan for metadata and ask contributors to paraphrase or reconstruct material, but the chance of confidential patterns slipping through is never truly zero unless every output is independently audited.

Many corporate security teams are now updating policies and training to warn employees that “derivative reconstructions” may still breach their agreements. Mercor’s posture echoes a familiar doctrine in employment law: the knowledge in a worker’s head belongs to the worker, while a written report or dataset belongs to the employer. That reasoning may hold up in many cases, but courts typically examine how closely a final output resembles protected work. As these platforms grow, expect clearer rulings from both regulators and the courts.
A crowded field and shifting alliances in AI data
Mercor is credited with popularizing the idea of paying U.S.-based domain experts to train models rather than relying on low-cost crowd labor. The approach gained momentum as traditional labeling pipelines fell short of labs’ needs. Industry chatter suggests some labs have pared back their exposure to Scale AI over competitive concerns after a major lab it works with took a significant stake in the company.
Even with its revenue momentum, Mercor trails larger incumbents whose valuations run above $20 billion. Its leadership hires signal ambition, though: Uber’s former chief product officer joined as president, and executives describe the marketplace as the second wave of the gig economy, this time for white-collar work. The model is capital-intensive, since experts are not cheap, but labs’ incentive to compete for differentiated data has kept margins healthy.

For banks, law firms, and hospitals, the rise of expert marketplaces forces a strategic choice. One option is defense: tighten data policies, scrutinize employees’ side gigs, and build internal model programs to protect the institutional edge. Some incumbents are leaning in instead, arguing that shaping the training process beats ignoring it. However general-purpose assistants evolve, the winners will be those who combine sensitive data and institutional judgment with competent execution. The bottom line: AI labs need real workflows to reach professional-grade performance, and Mercor has turned that need into a category of structured expertise.