Micro1 tells The Verge it has crossed the $100 million annual recurring revenue mark, a threshold that vaults the three-year-old startup into the top tier of human-in-the-loop data providers such as Scale AI. The company, which sources and manages expert evaluators for AI training and post-training, entered the year at around $7 million ARR, underscoring how quickly demand for high-quality human feedback has soared alongside a new wave of large language models and agentic systems.
The company works with leading AI labs, including Microsoft, and an expanding list of Fortune 100 companies looking to improve model reliability through reinforcement learning, structured evaluations, and domain-specific fine-tuning. Leadership says the market for expert-labeled and expert-graded data, worth in the low tens of billions of dollars today, will approach $100 billion by 2031 as companies standardize evaluation pipelines for AI in production.

Why Revenue Is Accelerating for Human Feedback Providers
Two forces are colliding here: the drive to professionalize model evaluation at scale, and the recognition that even state-of-the-art models still depend on human judgment to align outputs with real-world expectations. Micro1's business centers on scoring model behavior, building reinforcement learning environments, and supplying richly specialized annotations that go far beyond basic labeling.
Independent analyses point to the same trend. Stanford HAI's AI Index has tracked growing adoption of human feedback and evaluation benchmarks, while a McKinsey report highlighted post-deployment monitoring and fine-tuning as among the fastest-growing cost centers in generative AI efforts. In practice, that translates into more budget for long-term editorial review, longitudinal testing, and continuous red-teaming, the very workflows firms like Micro1 monetize.
“Non-AI-native companies will soon commit a meaningful proportion of product spending to evaluation and human data — from almost 0% to about 25%,” Micro1’s CEO said. If that shift takes root, it will move billions of dollars from traditional software testing into human feedback loops built for AI agents handling support, finance, compliance, and other operational functions.
From Talent Matcher to Expert Evaluation Networks
Micro1 started out as an AI recruiter named Zara that matched engineers to software roles. That early pipeline of pre-vetted talent laid the groundwork for a pivot to AI training data, where the job now involves finding and vetting thousands of specialists across hundreds of fields, from clinical coding and legal analysis to robotics manipulation and materials science.
The company says many experts on the platform can earn up to $100 an hour. Micro1 emphasizes rapid screening and continuous calibration, using structured interviews and job-relevant scorecards to keep accuracy high. It is also building bespoke reinforcement learning environments that mimic real-world tasks, yielding more specific feedback than generic annotation marketplaces typically provide.
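To make that concrete, here is a minimal sketch of what such a task-specific environment could look like, written against a Gym-style reset/step interface. The support ticket, rubric categories, and weights are hypothetical stand-ins for illustration, not Micro1's actual tooling.

```python
# Illustrative only: a toy task environment in the spirit of the bespoke
# RL environments described above. The task, rubric, and scoring are
# hypothetical stand-ins, not Micro1's actual tooling.
from dataclasses import dataclass, field

@dataclass
class SupportTicketEnv:
    """An agent drafts a reply to a support ticket; a rubric (normally
    filled in by a vetted human expert) converts judgments to a reward."""
    ticket: str = "Customer reports being double-charged on invoice #1234."
    rubric: dict = field(default_factory=lambda: {
        "acknowledges_issue": 0.3,   # weight for naming the double charge
        "states_next_step": 0.4,     # weight for a concrete remediation step
        "professional_tone": 0.3,    # weight for tone, judged by the grader
    })
    done: bool = False

    def reset(self) -> str:
        self.done = False
        return self.ticket  # the observation the agent responds to

    def step(self, reply: str, grader_marks: dict) -> tuple[str, float, bool]:
        # grader_marks maps rubric keys to 0/1 judgments from a human expert.
        reward = sum(w for k, w in self.rubric.items() if grader_marks.get(k))
        self.done = True
        return "", reward, self.done

env = SupportTicketEnv()
obs = env.reset()
# A human expert grades the model's reply against the rubric:
_, reward, _ = env.step("We see the duplicate charge and will refund it today.",
                        {"acknowledges_issue": 1, "states_next_step": 1,
                         "professional_tone": 1})
print(f"graded reward: {reward:.2f}")  # 1.00 when all criteria are met
```

The design point is that the reward comes from a human grader applying a task-specific rubric, which is what distinguishes this kind of environment from generic annotation work.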

Betting Big on Robots and Enterprise Agents
In addition to core work with elite AI labs, Micro1 is pushing into two nascent, data-hungry segments. The first is enterprise agent development: internal copilots for workflows, service desks, and line-of-business tasks. Building reliable agents requires systematic evaluation: test suites run across models, human graders to pick the winners, fine-tuning cycles, and production monitoring to catch regressions, as sketched below. That recurring work translates directly into ARR for evaluation providers.
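As a rough illustration of that cycle, the sketch below runs a small test suite across two model versions, routes responses through a stubbed human-grading step, and flags regressions. The model names, grading function, and 5% regression threshold are assumptions made for the example, not any vendor's real API.

```python
# Illustrative only: a bare-bones agent evaluation harness of the kind the
# article describes. Model names, the grading stub, and the regression
# threshold are hypothetical assumptions.
from statistics import mean

TEST_SUITE = [
    "Summarize this refund policy in two sentences.",
    "Classify this ticket: 'My card was charged twice.'",
]

def run_model(model_name: str, prompt: str) -> str:
    # Stand-in for calling a deployed model or agent.
    return f"[{model_name}] response to: {prompt}"

def human_grade(prompt: str, response: str) -> float:
    # Stand-in for a vetted expert scoring the response on a 0-1 scale;
    # in production this is the human-in-the-loop step providers staff.
    return 0.8

def evaluate(model_name: str) -> float:
    # Run the whole suite and average the human-assigned scores.
    scores = [human_grade(p, run_model(model_name, p)) for p in TEST_SUITE]
    return mean(scores)

baseline = evaluate("agent-v1")
candidate = evaluate("agent-v2")
# Production monitoring: flag a regression if the candidate drops by >5%.
if candidate < baseline * 0.95:
    print(f"regression: {candidate:.2f} < {baseline:.2f}")
else:
    print(f"candidate ok: {candidate:.2f} vs baseline {baseline:.2f}")
```

Because every fine-tuning cycle reruns this loop, the human-grading step recurs with each release, which is what makes the work subscription-like for providers.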
The second is robotics pre-training. Micro1 adds that it is assembling a “comprehensive, human-generated dataset of everyday real-world interactions” and says it has been filming demonstrations by generalists in home environments. Recent work by groups including Google and the Toyota Research Institute suggests that diverse human demonstrations can substantially improve manipulation policies; if that holds at production scale, robotics companies will need large volumes of high-quality real-world sequences before deploying systems in homes and offices.
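For a sense of what such data looks like, here is one plausible, purely hypothetical schema for a single recorded demonstration episode; Micro1 has not published its actual format, and every field name below is an assumption.

```python
# Illustrative only: a hypothetical schema for one human demonstration
# episode in a robotics pre-training dataset. Field names and values are
# invented for this sketch, not Micro1's real data format.
from dataclasses import dataclass

@dataclass
class DemonstrationStep:
    timestamp_s: float            # seconds since episode start
    rgb_frame_path: str           # path to the camera frame for this step
    joint_positions: list[float]  # proprioceptive state of the manipulator
    action: list[float]           # commanded end-effector or joint deltas

@dataclass
class DemonstrationEpisode:
    task: str                     # free-text task label, e.g. "wipe countertop"
    environment: str              # which filmed home setting produced it
    operator_id: str              # anonymized ID of the human demonstrator
    steps: list[DemonstrationStep]

episode = DemonstrationEpisode(
    task="wipe countertop",
    environment="home_kitchen_03",
    operator_id="gen-0042",
    steps=[DemonstrationStep(0.0, "ep01/frame_0000.png",
                             [0.1, -0.4, 0.7], [0.0, 0.02, 0.0])],
)
print(len(episode.steps), "step(s) recorded for task:", episode.task)
```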
A Shifting Competitive Landscape in AI Data Evaluation
Micro1’s ascent also comes amid turbulence among larger rivals. After Meta made a multibillion-dollar investment in Scale AI and hired its chief executive, major labs such as OpenAI and Google DeepMind reportedly scaled back their use of Scale, and multiple buyers began shopping for alternative vendors. Micro1, along with upstarts such as Mercor and Surge, has benefited from that reassessment as labs diversify suppliers for sensitive data workflows.
Risks remain across the industry: data provenance, bias, and privacy all continue to draw scrutiny, and human-in-the-loop evaluation must keep pace with frameworks ranging from the NIST AI Risk Management Framework to the OECD AI Principles. Micro1 says it is committed to responsible scaling, fair compensation, and keeping human expertise central to how models are trained and assessed.
What To Watch Next for Micro1 and Human AI Evaluation
Key signposts over the next 12 months include the revenue split among top-tier labs, traditional enterprises, and robotics clients; the breadth of expert domains onboarded; margin progression from proprietary evaluation tooling; and geographic scaling without loosening quality controls. If enterprises really do shift roughly 25 percent of product budgets toward evals and human data, Micro1 could sustain its hypergrowth by capturing that spending. If not, it will need to out-innovate on automation and specialized RL environments to hold its lead against well-financed rivals.
For now, eclipsing $100 million in ARR signals that the human side of AI, the graders, evaluators, and domain experts behind the curtain, has shifted from a supporting role to a central pillar of the AI stack.
