Polars, the fast-growing Amsterdam-based startup behind the open source DataFrame engine of the same name, has raised €18 million (approximately $21 million) in a Series A round led by Accel, with participation from Bain Capital Partners and an array of angels. The company will use the funding to accelerate development of Polars Distributed, its new multi-node engine, and to extend Polars Cloud, its managed service targeting teams that want performance without the infrastructure hassle.
Developed by Ritchie Vink in the early days of the pandemic as a Rust-based answer to slow Python data tooling, Polars has garnered fans among data scientists and analytics engineers chasing order-of-magnitude speed-ups on laptop-scale datasets. The company says the project has surpassed 24 million downloads and seen adoption across industries such as finance, life sciences, and logistics.
Why Polars Is Gaining Developer Mindshare
Polars pairs a columnar executor implemented in the Rust programming language with lazy evaluation and query optimization, and builds on the Apache Arrow memory format to minimize data copies. In practice, that means lower memory use and faster group-bys, joins, and aggregations, especially on wide tables and complex transformations that bog Pandas down. Community benchmarks and write-ups from engineers at outfits like AWS and Cloudflare have highlighted Rust's memory safety and performance, the foundation on which Polars is built.
Beyond raw speed, ergonomics matter. Polars exposes an idiomatic DataFrame API with expressive lazy queries and native Parquet and Arrow support, which has made it a popular fit for notebook workflows. That “close to the metal, friendly at the top” approach lets teams keep using Python and Arrow-based file formats while removing the worst bottlenecks.
From Pet Project to a Sustainable Commercial Platform
What started as a pandemic side project is now a company with a product roadmap. Following a $4 million seed round led by Bain Capital Partners in 2023, Polars launched Polars Cloud, a managed service that runs large-scale queries without local setup. Next up is Polars Distributed, now in public beta, which expands the engine beyond a single machine to coordinated clusters fit for very large datasets.
That move puts Polars in more direct competition with the heavyweights of distributed computing. And while Apache Spark remains the default choice for large-scale batch processing, underpinning everything from Databricks’ platform to a wide range of cloud services, Polars is hoping to offer an easier route to speed: start on a laptop, then scale up to a managed cluster without rewriting pipelines or shifting execution models.
Bridging the Gap Between Pandas and Spark
For many enterprises, data work falls awkwardly in the gap between single-node Pandas and a Spark cluster, leaving teams to choose between convenience and operational overhead. Polars aims at the middle: a compiled, vectorized engine with a path to scale out when data outgrows one machine.
The approach is in line with the broader Arrow ecosystem and with open file formats such as Parquet, Iceberg, and Delta Lake. Polars can maintain performance in both development and production by reading columnar files natively and pushing projections and filters down to the scan to reduce I/O. The Python Developers Survey run by the PSF and JetBrains has consistently placed Pandas among the most popular tools; if Polars can offer a familiar API with far better performance, there is a clear adoption wedge.
How the Polars Business Plans to Generate Revenue
Open source adoption generates the top of the funnel; Polars Cloud and enterprise features generate revenue. Expect paid plans that bundle autoscaling, job orchestration, role-based access control, audit trails, and service-level agreements (SLAs), alongside commercial support and consulting. That model follows the proven playbooks companies have run around PostgreSQL, Kafka, and Kubernetes: keep the core open source, but sell reliability, governance, and scale.
For data leaders, the appeal is practical. If a team can replace tens of CPU-hours on legacy pipelines with minutes on a smaller, cheaper cluster (or a single beefy node), the cost savings and developer productivity gains accrue rapidly. Clear migration paths from Pandas, along with straightforward connectors to object storage and data catalogs, will be paramount in turning that promise into contracts.
What Accel Is Betting On with Polars and Its Future
Accel partner Zhenya Loginov led the round, a bet that Polars can convert grassroots developer love into an enterprise platform. Investors often say that “rewritten in Rust” isn’t a moat on its own; what’s special here is the clean progression from local notebooks to managed clusters without changing paradigms. Bain Capital Partners’ continued involvement reflects the conviction that the market goes beyond single-node analytics.
If Polars Distributed runs well on petabyte-scale workloads, and if Cloud is a low-friction experience, the market expands to teams who today default to Spark, a mainstay of the modern data stack, out of historical inertia rather than because it is the optimal solution.
Outlook and Next Steps for Polars Cloud and Distributed
The new capital will fund hiring across engineering and developer relations, hardening of the distributed scheduler, and expanded support for Arrow-native formats and cloud object store integrations. Look for more benchmarks against Spark and DuckDB on mixed workloads, plus enterprise guardrails to reassure security teams.
Open source momentum brought Polars into existence. Turning that momentum into a lasting business now depends on how easily that speed carries over to scale: a workflow that starts with a DataFrame on a laptop should end with governed, reliable analytics on the biggest datasets, no rewrites necessary.