Box is leaning hard into agentic AI, and CEO Aaron Levie has a crisp thesis for why: the next wave of productivity will be unlocked not by bigger models alone, but by models steeped in the right context. In his view, the real frontier is connecting AI to the sprawling, messy troves of enterprise content—contracts, slides, PDFs, design assets—and doing so with precision, governance and repeatability.
At BoxWorks, the company introduced Box Automate, a system that orchestrates AI agents across business processes, breaking complex work into bounded steps. It’s a pragmatic response to a hard truth Levie repeats often: there’s no free lunch in AI. Compute is finite, context windows are limited, and enterprises can’t afford agents that go off script.

Why context is eclipsing raw scale
Most corporate automation to date has lived in structured data—CRM, ERP, HRIS. The bigger prize now sits in unstructured content. IDC estimates that roughly 90% of enterprise data is unstructured, which explains why legal reviews, marketing asset approvals and M&A diligence still depend on human judgment and manual handoffs.
Levie’s bet is that agents become useful when they’re fed precise, permissioned context drawn from files, metadata and activity histories. Box Automate segments workflows into stages—intake, classification, review, redline, approval—so each agent gets only the context it needs for its bounded task. That keeps prompts lean, cuts the chance of model drift, and reduces wasted tokens.
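A minimal sketch of that stage-gating idea, in Python rather than anything Box ships: each stage declares the slice of shared context its agent may read, and the orchestrator passes it nothing more. The stage names and outputs are illustrative.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    allowed_context: set[str]        # keys this stage's agent may read
    run: Callable[[dict], dict]      # the agent call; returns new outputs

def run_pipeline(stages: list[Stage], context: dict) -> dict:
    for stage in stages:
        # Scope the context: the agent sees only what its stage declares.
        scoped = {k: v for k, v in context.items() if k in stage.allowed_context}
        outputs = stage.run(scoped)
        context.update(outputs)      # explicit handoff to later stages
    return context

stages = [
    Stage("intake", {"document"}, lambda ctx: {"doc_type": "nda"}),
    Stage("classification", {"document", "doc_type"}, lambda ctx: {"clauses": ["term", "venue"]}),
    Stage("review", {"clauses"}, lambda ctx: {"redlines": []}),
]
print(run_pipeline(stages, {"document": "(contract text)"}))
```

The explicit `context.update` is the handoff: later stages see earlier outputs only if their allow-list names them.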
Even the most advanced systems hit the wall when tasks sprawl. Long-running agents exhaust context windows, lose state, or compound small errors. Splitting work into sub-agents with explicit handoffs—plus retrieval steps that refresh relevant content—mitigates those failure modes without waiting for the next model size bump.
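The handoff pattern can be made concrete with a small sketch, again hypothetical rather than Box Automate's internals: each sub-agent receives a compact summary of prior state instead of a full transcript, and refreshes source content via retrieval so the context window never accumulates stale history.

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    task: str
    summary: str                         # compacted state, not a full transcript
    citations: list[str] = field(default_factory=list)

def retrieve(doc_ids: list[str]) -> str:
    # Hypothetical retrieval step: re-fetch only what this sub-agent needs.
    return "\n".join(f"[content of {d}]" for d in doc_ids)

def sub_agent(handoff: Handoff) -> Handoff:
    context = retrieve(handoff.citations)    # fresh, bounded context per step
    # A model call would go here; the sketch fabricates a result instead.
    result = f"completed {handoff.task} with {len(context)} chars of context"
    return Handoff(task="next-step", summary=result, citations=handoff.citations)

first = Handoff(task="extract-clauses", summary="new NDA received", citations=["doc-42"])
print(sub_agent(first).summary)
```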
Guardrails, determinism and trust
Enterprises want repeatability as much as intelligence. Levie frames the design choice as deciding where to be deterministic and where to allow agentic flexibility. Deterministic guardrails define who can trigger an agent, which repositories it can search, the confidence thresholds for extraction or summarization, and when a human must review.
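Expressed as code, those guardrails are just declarative policy evaluated before any model call. A sketch with invented field names, not Box's actual configuration schema:

```python
from dataclasses import dataclass

@dataclass
class GuardrailPolicy:
    allowed_triggers: set[str]       # who may start the agent
    allowed_repos: set[str]          # where it may search
    min_confidence: float            # below this, an extraction is rejected
    human_review_over: float         # above this impact score, a person signs off

def check(policy: GuardrailPolicy, user: str, repo: str,
          confidence: float, impact: float) -> str:
    if user not in policy.allowed_triggers:
        return "deny: user may not trigger this agent"
    if repo not in policy.allowed_repos:
        return "deny: repository out of scope"
    if confidence < policy.min_confidence:
        return "retry: extraction confidence too low"
    if impact > policy.human_review_over:
        return "escalate: route to human review"
    return "allow"

policy = GuardrailPolicy({"legal-ops"}, {"contracts"}, 0.85, 0.7)
print(check(policy, "legal-ops", "contracts", confidence=0.91, impact=0.9))
```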
This aligns with emerging AI risk practices. Frameworks from NIST and market guidance from Gartner emphasize policy enforcement, monitoring, and human-in-the-loop for high-impact steps. In Box’s approach, you might separate a “submission agent” from a “review agent,” introduce acceptance tests at each checkpoint, and log every prompt, response and source document for auditability.
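The checkpoint pattern might look like the following sketch; the acceptance test, agent stub and audit record are illustrative, not an artifact of NIST guidance or Box's product. Every output must pass an explicit test before the next agent runs, and every exchange is logged with its sources:

```python
import json, time
from typing import Callable

AUDIT_LOG: list[dict] = []

def checkpoint(name: str, agent: Callable[[str], str],
               accept: Callable[[str], bool],
               prompt: str, sources: list[str]) -> str:
    response = agent(prompt)
    record = {"step": name, "prompt": prompt, "response": response,
              "sources": sources, "ts": time.time(),
              "accepted": accept(response)}
    AUDIT_LOG.append(record)         # every prompt, response and source is kept
    if not record["accepted"]:
        raise ValueError(f"{name}: acceptance test failed, halting the workflow")
    return response

submission = lambda p: "classified: NDA, 12 clauses extracted"
checkpoint("submission", submission,
           accept=lambda r: r.startswith("classified"),
           prompt="Classify the attached document.", sources=["doc-42"])
print(json.dumps(AUDIT_LOG[-1], indent=2))
```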
Security and permissions as the moat
Context without control is a liability. The biggest failure pattern in early enterprise AI rollouts has been permissive retrieval—agents answering with content the user shouldn’t see. Levie argues Box’s decades of work on identity, permissions and compliance now serve as an AI advantage: enforcement of least-privilege access is native to the content layer, not bolted on around the model.
Practically, that means retrieval steps and vector search respect the same ACLs as the underlying files. It also means policy, retention, legal hold and DLP signals travel with the content as agents work. For regulated industries, controls mapped to frameworks like ISO 27001, SOC 2 and HIPAA provide the governance baseline AI needs to scale safely.
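The key mechanic is that retrieval filters on the caller's permissions before ranking, never after generation. A simplified sketch, assuming each document carries a precomputed ACL (the dot-product similarity stands in for a real vector index):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str
    acl: set[str]                    # users or groups allowed to read this doc
    embedding: list[float]

def similarity(a: list[float], b: list[float]) -> float:
    # Dot product as a stand-in for cosine similarity in a real vector index.
    return sum(x * y for x, y in zip(a, b))

def permitted_search(user: str, query_emb: list[float],
                     docs: list[Doc], k: int = 3) -> list[Doc]:
    # Least privilege first: filter by ACL before anything is ranked.
    visible = [d for d in docs if user in d.acl]
    return sorted(visible, key=lambda d: similarity(query_emb, d.embedding),
                  reverse=True)[:k]

docs = [Doc("d1", "public playbook", {"alice", "bob"}, [0.9, 0.1]),
        Doc("d2", "exec comp memo", {"ceo"}, [0.8, 0.2])]
print([d.doc_id for d in permitted_search("alice", [1.0, 0.0], docs)])  # ['d1']
```

Because the filter runs before scoring, a document the user cannot read never enters the candidate set, so it cannot leak through a citation or a summary.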
Neutral platform in the model race
Foundation model providers are moving up the stack with native file tools. Box’s counter is to be the connective tissue: storage, security, permissions, embeddings and orchestration that plug into multiple leading models. That lets customers pick models by task—fast and cheap for bulk extraction, more capable for complex analysis—and switch as cost-performance curves shift.
For CIOs, this matters because AI spend tends to balloon as pilots become production. Levie’s “no free lunch” also means no single-model lock-in; routing across providers, caching results, and clipping context to what’s essential are now table stakes for controlling unit economics.
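That discipline reduces to a few lines of routing logic. A hedged sketch in which the model names, prices and `call_model` stub are placeholders for whatever provider SDKs are in play:

```python
import hashlib

CACHE: dict[str, str] = {}
ROUTES = {                           # placeholder models and per-1K-token prices
    "bulk_extraction": ("small-fast-model", 0.0002),
    "complex_analysis": ("large-capable-model", 0.01),
}

def call_model(model: str, prompt: str) -> str:
    return f"[{model} answer]"       # stand-in for a real provider SDK call

def run(task: str, context: str, prompt: str, max_context_chars: int = 4000) -> str:
    clipped = context[:max_context_chars]    # clip context to the essentials
    model, _price = ROUTES[task]             # route by task, not by habit
    key = hashlib.sha256(f"{model}|{clipped}|{prompt}".encode()).hexdigest()
    if key not in CACHE:                     # identical calls are answered once
        CACHE[key] = call_model(model, clipped + "\n" + prompt)
    return CACHE[key]

print(run("bulk_extraction", "(long contract text)", "List the parties."))
```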
Where agents help today
Consider NDA intake. A submission agent classifies the document, extracts entities and clauses, and checks against playbooks. A review agent proposes redlines aligned to policy. A final agent drafts a summary and pushes status back to the deal system. Each step uses only the relevant contract library, templates and permissions, minimizing hallucinations and speeding cycle time.
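Wired together, those three steps might look like this sketch; the playbook check and deal-system push are faked, and none of these function names are Box Automate primitives:

```python
def submission_agent(document: str) -> dict:
    # Classify, extract entities and clauses, check against the NDA playbook.
    return {"type": "NDA", "clauses": ["term: 3y", "venue: DE"],
            "playbook_flags": ["venue outside approved list"]}

def review_agent(extraction: dict) -> list[str]:
    # Propose policy-aligned redlines only for flagged clauses.
    return [f"redline: {flag}" for flag in extraction["playbook_flags"]]

def summary_agent(extraction: dict, redlines: list[str]) -> str:
    summary = f"{extraction['type']}: {len(redlines)} redline(s) proposed"
    print(f"push to deal system -> {summary}")   # stand-in for a CRM status update
    return summary

extraction = submission_agent("(incoming NDA text)")
summary_agent(extraction, review_agent(extraction))
```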
Marketing teams can run similar patterns for asset approvals—auto-tagging creative, validating claims against approved messaging, and routing exceptions to legal. In due diligence, agents highlight anomalies across data rooms and produce source-linked briefs. Early enterprise pilots often report double-digit time savings on drafting and research; McKinsey estimates generative AI could add $2.6 trillion to $4.4 trillion in annual value across use cases, much of it in knowledge workflows like these.
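As a toy illustration of that claim-validation step, assuming claims arrive pre-extracted and the approved list stands in for a real messaging database:

```python
APPROVED = {"30-day free trial", "soc 2 certified"}

def review_asset(claims: list[str]) -> str:
    # A model would extract claims from the creative; here they arrive pre-extracted.
    exceptions = [c for c in claims if c.lower() not in APPROVED]
    return f"route to legal: {exceptions}" if exceptions else "auto-approve"

print(review_asset(["30-day free trial", "doubles your revenue"]))
```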
The CIO playbook for the era of context
The lesson in Levie’s framing is straightforward: AI performance is now a function of content quality and control. Organizations that normalize formats, clean permissions, enrich metadata and standardize taxonomies will see better agent outcomes than those that simply plug a model into a messy repository.
Measure what matters—time to decision, exception rates, retrieval precision, cost per completed task—and tune workflows by shrinking or expanding agent autonomy where results demand it. In this era of context, the edge belongs to teams that treat content governance not as overhead, but as the engine of AI effectiveness.