
RadixArk Spins Out From SGLang At $400M Valuation

By Gregory Zuckerman
Last updated: January 22, 2026 12:09 am
Business
6 Min Read

RadixArk, a new commercial venture born from the popular open source project SGLang, has spun out at a valuation of about $400 million, according to people familiar with the matter. The move underscores how fast the AI inference market is expanding as enterprises race to squeeze more performance and lower latency out of existing GPU fleets.

SGLang, used by teams at companies such as xAI and Cursor, has built momentum as an engine for running large language models more efficiently. Several core contributors have now shifted to RadixArk, positioning the startup to productize the technology while continuing to support the open source codebase.

Table of Contents
  • From Open Source Roots To Commercial Play
  • Betting On Inference Efficiency To Cut Costs
  • A Crowded But Expanding Market For AI Inference
  • Why A $400M Price Tag Might Add Up For RadixArk
  • What To Watch Next As RadixArk Builds Momentum
[Image: RadixArk logo with the slogan "SHIP AI FOR ALL"]

From Open Source Roots To Commercial Play

Like fellow inference project vLLM, SGLang traces its origins to Ion Stoica’s lab at UC Berkeley, a proving ground that also produced Databricks. The academic-to-startup path has become a pattern in AI infrastructure: build adoption in the open, then layer enterprise-grade operations, support, and SLAs.

Ying Sheng, a key SGLang contributor and former engineer at xAI, is leading RadixArk as co-founder and CEO. Sheng previously worked as a research scientist at Databricks, a background that fits RadixArk’s focus on production-scale systems. The company has raised angel capital, including support from veteran semiconductor and enterprise investors, according to people familiar with the financing.

RadixArk says it will continue to steward SGLang as an open-source AI model engine while developing commercial offerings. Those are expected to include managed hosting, enterprise support, and tooling that simplifies deployment across heterogeneous GPU clusters.

Betting On Inference Efficiency To Cut Costs

Inference—the phase where models generate outputs for real users—now dominates the operational cost curve for AI services. Nvidia has indicated that inference represents a growing share of data center GPU workloads, overtaking training for many customers. Every percentage point of throughput gained or latency reduced translates into immediate savings at scale.

SGLang’s appeal lies in techniques that are now table stakes for high-performance serving: continuous batching to keep GPUs saturated, paged attention to optimize memory, smarter key-value cache management, quantization-aware kernels, and speculative decoding to minimize wasted compute. In practice, these approaches can boost tokens-per-second and cut P95/P99 latency without new hardware—a compelling proposition when H100-class GPUs remain supply constrained.
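
Continuous batching, the first of those techniques, is easiest to see in a toy simulation. The sketch below is illustrative only; it is not SGLang's scheduler, and the batch size and request lengths are made-up numbers. The point it demonstrates is that finished sequences free their batch slots immediately, so queued requests join between decode steps rather than stalling behind the longest sequence in a static batch.

```python
import random
from collections import deque

# Toy continuous-batching simulation (hypothetical numbers, not SGLang's API):
# finished sequences release their slots each step, so waiting requests are
# admitted mid-flight and the GPU batch stays full.

MAX_BATCH = 4  # illustrative batch-slot limit

def continuous_batching(requests, max_steps=100):
    queue = deque(requests)   # (request_id, tokens_to_generate)
    running = {}              # request_id -> tokens remaining
    completed, step = [], 0
    while (queue or running) and step < max_steps:
        # Admit queued requests into any free batch slots.
        while queue and len(running) < MAX_BATCH:
            rid, need = queue.popleft()
            running[rid] = need
        # One decode step: every running sequence emits one token.
        for rid in list(running):
            running[rid] -= 1
            if running[rid] == 0:        # finished; slot freed this step
                del running[rid]
                completed.append((rid, step))
        step += 1
    return completed

reqs = [(i, random.randint(2, 8)) for i in range(10)]
for rid, step in continuous_batching(reqs):
    print(f"request {rid} finished at decode step {step}")
```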

Beyond raw serving speed, RadixArk is building Miles, a framework geared for reinforcement learning. The aim is to let customers adapt models in production using feedback loops—think RLHF refreshes, policy optimization for agents, and domain-specific skill tuning—while keeping inference costs predictable. That pairing of serving and continuous improvement could be a differentiator for enterprises that want models to get better with real-world use rather than periodic offline retrains.
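
Miles has no public API described in this reporting, so the following is purely a hypothetical sketch of the serve-and-learn loop outlined above. Every name in it is a placeholder: a served model generates completions, user feedback is logged as rewards, and a policy update runs once enough samples accumulate.

```python
import random
from dataclasses import dataclass, field

# Hypothetical serve-and-learn loop; NOT Miles' API. StubModel, FeedbackBuffer,
# and all parameters are placeholders used only to illustrate the architecture.

@dataclass
class FeedbackBuffer:
    min_samples: int = 4
    samples: list = field(default_factory=list)  # (prompt, completion, reward)

    def log(self, prompt, completion, reward):
        self.samples.append((prompt, completion, reward))

    def ready(self):
        return len(self.samples) >= self.min_samples

class StubModel:
    """Stands in for a served LLM plus a policy-optimization step."""
    def generate(self, prompt):
        return f"answer to {prompt!r}"

    def policy_update(self, samples):
        # A real system would run RLHF / policy-gradient updates here.
        print(f"updating policy on {len(samples)} feedback samples")

def serve_loop(model, prompts, buffer):
    for prompt in prompts:
        completion = model.generate(prompt)   # low-latency inference path
        reward = random.choice([0.0, 1.0])    # stand-in for user feedback
        buffer.log(prompt, completion, reward)
        if buffer.ready():                    # periodic in-production update
            model.policy_update(buffer.samples)
            buffer.samples.clear()

serve_loop(StubModel(), [f"q{i}" for i in range(10)], FeedbackBuffer())
```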


A Crowded But Expanding Market For AI Inference

RadixArk enters a heated field. Forbes recently reported that the team behind vLLM has discussed raising roughly $160 million at a valuation near $1 billion. The Wall Street Journal reported that Baseten secured $300 million at a $5 billion valuation, while Fireworks AI raised $250 million at a $4 billion valuation late last year. The funding wave reflects investor conviction that the inference layer—model serving, routing, optimization, and observability—will be a durable part of the AI stack.

Hardware trends reinforce the thesis. As LLMs swell in parameter count and context windows stretch into hundreds of thousands of tokens, serving becomes a memory and scheduling problem as much as a raw FLOPS problem. Vendors that deliver GPU-agnostic acceleration across Nvidia’s H100/H200/B200 and AMD’s MI300-class hardware, while handling quantized and mixture-of-experts models, will have an advantage as customers seek portability and cost control.
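
Some back-of-the-envelope arithmetic shows why long contexts turn serving into a memory problem. The sketch below uses a Llama-3-70B-like shape (80 layers, 8 grouped-query KV heads, head dimension 128, fp16 weights); the figures are illustrative, not vendor specifications.

```python
# KV cache sizing: each layer stores one key and one value vector per KV head
# for every token in the context.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, dtype_bytes=2):
    per_token = n_layers * n_kv_heads * head_dim * 2 * dtype_bytes
    return per_token * seq_len

# Llama-3-70B-like shape at a 128k-token context, fp16 (2 bytes per value).
per_request = kv_cache_bytes(80, 8, 128, seq_len=128_000)
print(f"{per_request / 2**30:.1f} GiB of KV cache for one 128k-token request")
# -> ~39 GiB; two such requests nearly fill an 80 GB H100 before weights.
```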

Why A $400M Price Tag Might Add Up For RadixArk

Seed and early-growth valuations for AI infrastructure companies increasingly reflect strategic positioning rather than current revenue. RadixArk holds several levers that investors prize: an active open-source community, demonstrated performance on widely used open models, and a pathway to enterprise contracts that promise immediate ROI through cost-per-token reductions. For buyers spending millions of dollars a month on inference, a 20–30% efficiency gain can be budget-changing.
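
The budget math behind that claim is simple; the spend figure below is hypothetical.

```python
# Illustrative annual savings from inference efficiency gains.
monthly_spend = 3_000_000  # hypothetical $/month on inference
for gain in (0.20, 0.30):
    print(f"{gain:.0%} efficiency -> ${monthly_spend * gain * 12:,.0f}/yr saved")
# 20% -> $7,200,000/yr; 30% -> $10,800,000/yr
```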

If RadixArk can convert SGLang’s adoption into enterprise subscriptions—offering uptime guarantees, privacy-preserving on-prem deployments, robust telemetry, and workload-aware autoscaling—the company could tap the same tailwinds lifting peers. The key will be translating benchmark wins into predictable savings in messy, multi-tenant real-world environments.

What To Watch Next As RadixArk Builds Momentum

Keep an eye on RadixArk’s head-to-head benchmarks on Llama 3, Mixtral, and other MoE models; support for quantization schemes like AWQ and GPTQ; and performance under long-context workloads where KV cache strategies become critical. Enterprise readiness—role-based access, audit logs, data residency controls—and support for hybrid GPU fleets will also signal how quickly RadixArk can move from open source traction to commercial scale.

The inference market is expanding fast, but it rewards execution. If RadixArk can turn SGLang’s technical edge into lower tail latencies, higher throughput, and simpler operations across varied hardware, the $400 million starting line may look conservative in hindsight.

By Gregory Zuckerman
Gregory Zuckerman is a veteran investigative journalist and financial writer with decades of experience covering global markets, investment strategies, and the business personalities shaping them. His writing blends deep reporting with narrative storytelling to uncover the hidden forces behind financial trends and innovations. Over the years, Gregory’s work has earned industry recognition for bringing clarity to complex financial topics, and he continues to focus on long-form journalism that explores hedge funds, private equity, and high-stakes investing.