The Laude Institute has named the first group of recipients for its new Slingshots AI grants, which aim to move promising research out of the lab and into real-world use. It's less a traditional grant than a technical accelerator for academics and individual researchers: funding coupled with access to compute, engineering support, and product feedback. In exchange, awardees commit to delivering a concrete output, whether an open-source codebase, a benchmark, or even a company.
What the Slingshots Program Provides to Researchers
The institute positions Slingshots as a way of bridging one of the most perilous gaps in AI: turning promising ideas into reliable, reproducible systems. Recipients also get resources beyond research dollars, including time on modern GPU clusters, hands-on help from seasoned engineers, and assistance with evaluation, deployment, and documentation, all of which are in short supply at many academic institutions. The model is built for speed and accountability, with milestones that lead to a public artifact rather than a paper alone.

Early Projects Focus on Evaluation and Agents
A few of the selected projects zoom in on one of AI's thornier problems: how to measure actual capability. Code Formula, from researchers at Caltech and UT Austin, focuses on how well AI agents optimize existing software rather than just churn out new snippets. That is an important distinction for industry teams, where the real wins tend to come from refactoring and micro-optimization rather than greenfield code.
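To make that concrete, here is a minimal sketch of how an optimization-focused harness could grade an agent's rewrite: check that behavior is preserved, then reward measured speedup. The function names and scoring are hypothetical illustrations, not Code Formula's actual design.

```python
import timeit

def score_optimization(baseline_fn, candidate_fn, test_cases, repeats=5):
    """Hypothetical scoring: the agent's rewrite must match the baseline's
    outputs, and is rewarded for wall-clock speedup over the original code."""
    # Correctness gate: behavior must be preserved on every test case.
    for args in test_cases:
        if candidate_fn(*args) != baseline_fn(*args):
            return {"correct": False, "speedup": 0.0}

    # Best-of-N timing for both implementations over the full test set.
    base_t = min(timeit.repeat(lambda: [baseline_fn(*a) for a in test_cases],
                               number=10, repeat=repeats))
    cand_t = min(timeit.repeat(lambda: [candidate_fn(*a) for a in test_cases],
                               number=10, repeat=repeats))
    return {"correct": True, "speedup": base_t / cand_t}

# Example: an agent replaces a quadratic duplicate check with a set-based one.
def has_duplicates_slow(xs):
    return any(x == y for i, x in enumerate(xs) for y in xs[i + 1:])

def has_duplicates_fast(xs):
    return len(set(xs)) != len(xs)

cases = [(list(range(300)) + [5],), (list(range(300)),)]
print(score_optimization(has_duplicates_slow, has_duplicates_fast, cases))
```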
Another grant supports BizBench, a Columbia-based project to benchmark "white-collar AI agents" on realistic business workflows. While familiar benchmarks like MMLU assess knowledge and reasoning in isolation, BizBench targets larger, multi-step tasks performed under time and accuracy pressure: think compiling a report that requires gathering data on a deadline, updating a CRM, or comparing purchasing options. Done well, it could give companies a clearer signal when deciding whether to adopt agentic systems for daily operations.
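As a rough illustration of what a multi-step workflow task might look like, the sketch below defines a task with ordered sub-goals, a wall-clock budget, and a grading hook. The schema and field names are hypothetical, not BizBench's actual format.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical schema for a multi-step "white-collar" agent task; an
# illustrative sketch only, not BizBench's actual task format.
@dataclass
class WorkflowTask:
    name: str
    steps: list[str]                # ordered sub-goals the agent must complete
    time_budget_s: float            # wall-clock limit for the whole workflow
    grade: Callable[[dict], float]  # scores the final artifact from 0.0 to 1.0

report_task = WorkflowTask(
    name="quarterly_vendor_report",
    steps=[
        "pull last quarter's invoices from the CRM export",
        "compare unit prices across the approved vendors",
        "write a one-page summary with a purchasing recommendation",
    ],
    time_budget_s=900.0,
    grade=lambda artifact: float("recommendation" in artifact.get("summary", "").lower()),
)
```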
Also in the cohort is CodeClash, co-created by SWE-Bench's John Boda Yang. Building on SWE-Bench's success at measuring bug fixing, CodeClash introduces a dynamic, competition-style setting where tasks change and models have to adapt. That adversarial environment could reduce the risk of benchmark overfitting and show whether code agents can keep up with changing requirements, something current static test sets frequently miss.
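The sketch below shows one way a rotating, competition-style evaluation loop could work, with fresh tasks generated each round so agents cannot memorize a static test set. The `agent.solve` and `task.grade` interfaces are placeholders, not CodeClash's actual API.

```python
import random

# Illustrative sketch of a rotating, competition-style evaluation loop; the
# agent.solve and task.grade interfaces are placeholders, not CodeClash's API.
def run_arena(agents, make_round_tasks, rounds=5, seed=0):
    rng = random.Random(seed)
    scores = {name: 0.0 for name in agents}
    for rnd in range(rounds):
        # Fresh tasks every round, so agents cannot overfit a static test set.
        tasks = make_round_tasks(rng, rnd)
        for name, agent in agents.items():
            scores[name] += sum(task.grade(agent.solve(task.prompt)) for task in tasks)
    # Rank agents by cumulative score across all rounds.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```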
Why Better Benchmarks Matter for Real-World AI
As labs race to ship bigger models and agent frameworks, the field still lacks common yardsticks for complex, real-world tasks. Stanford's Center for Research on Foundation Models has argued, through HELM, that broad and transparent evaluation is crucial to trust, and MLCommons' MLPerf has shown how benchmarking can shape hardware and software roadmaps. High-quality open benchmarks, especially for agents, help teams move from demos to reliable, comparable performance claims.

The timing also coincides with corporate caution. Private investment is growing even as organizations demand more evidence of ROI and safety, according to the latest AI Index report from Stanford. Benchmarks that reflect real-world performance can speed up procurement decisions and cut spending on expensive projects that never make it out of the lab.
Research Directions Beyond Benchmarks and Efficiency
Not every Slingshots award is a benchmark, either. The institute highlighted new reinforcement learning projects, an area of renewed interest as teams try to train for stable agentic behavior. Others address model compression and distillation, which matter for on-device AI and for cutting inference costs in production. With international scrutiny of data center energy use growing, efficiency-first research is no longer merely academic; it is a prerequisite for scale.
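For readers unfamiliar with distillation, the core idea fits in a few lines: train a small student model against both the ground-truth labels and a larger teacher model's softened predictions. The PyTorch sketch below uses the standard formulation from Hinton et al. (2015); the temperature and mixing weight shown are illustrative defaults, not values from any Slingshots project.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard knowledge-distillation objective: blend cross-entropy on hard
    labels with a KL term pulling the student toward the teacher's softened
    output distribution."""
    soft_teacher = F.log_softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    # T**2 rescales the soft-target gradients so the term stays comparable as T changes.
    kd = F.kl_div(soft_student, soft_teacher, log_target=True,
                  reduction="batchmean") * (T ** 2)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```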
Where the AI Money Is Going in Grants and Research
Slingshots sits at the nexus of philanthropic and translational research. It follows efforts like the U.S. National Science Foundation's AI Institutes and philanthropic programs from organizations such as Schmidt Futures and Open Philanthropy, but with a sharper focus on shipping artifacts quickly. By combining compute horsepower with engineering muscle, Laude is betting it can shrink the time from new idea to widely useful tool.
What Comes Next for Slingshots and Its First Cohort
Laude says recipients will be expected to publish results at the end of their grant period and, where possible, release their work as open source. For industry observers, the litmus tests will be whether Code Formula, BizBench, and CodeClash emerge as de facto standards, and whether the program's reinforcement learning and compression projects yield measurable efficiency gains. If this first cohort lands, Slingshots could become a model for how to fund AI research that actually ships.