
Mati Staniszewski discusses Voice AI at Disrupt 2025

John Melendez
Last updated: September 10, 2025 4:06 pm

AI voice is having a moment, and few leaders are closer to its leading edge than Mati Staniszewski, co-founder and CEO of ElevenLabs.

Table of Contents
  • Why voice becomes the next interface
  • From clones to characters: what tech can do now
  • Access and inclusion as design mandate
  • Guardrails: consent, provenance, and policy
  • Metrics That Matter to Builders and Buyers
  • What we hear at Disrupt

At Disrupt 2025, Staniszewski will break down where synthetic speech goes next: from realism and latency to new creative and accessibility use cases, along with the thorny questions of consent, provenance and safety.


Why voice becomes the next interface

For years, AI has meant typing prompts and reading back the results. Voice changes the rhythm of computing. Conversation demands sub-second latency, emotional nuance and the ability to interrupt naturally, also known as “barge-in”: attributes that humans expect but machines rarely execute well. Benchmarks based on Mean Opinion Score (MOS) indicate that the best neural text-to-speech systems can score above 4.2 out of 5, creeping closer to studio-quality narration. The goal Staniszewski often articulates is not just to sound human, but to respond like one.

That shift turns assistants into partners. If models can understand context, modulate tone, and switch languages mid-sentence without missing a beat, voice becomes the most intuitive interface for anything from customer support to creative direction.

From clones to characters: what tech can do now

Today’s voice systems can synthesize speech from ever-shorter samples, maintain the speaker’s timbre across dozens of languages, and deliver streaming audio with near-real-time response times. One of the fastest-growing applications is media localization: studios and streamers are using AI to dub films, series, and documentaries so they can reach global audiences while preserving the actors’ performances.

Real-world examples abound. An AI voice was used to bring Val Kilmer back to a semblance of his on-screen self in Top Gun: Maverick. Spotify experimented with AI-powered translation that cloned podcasters’ voices in new languages. Game developers are prototyping “dynamic” non-player characters who can improvise spoken lines that align with lore and character arcs. Meanwhile, publishers are looking at AI narration to scale audiobook production across backlist and frontlist titles, with stylistic control that goes beyond a robotic monotone.

Access and inclusion as design mandate

The most dramatic applications may be in accessibility. The ALS Association estimates that the overwhelming majority of ALS patients will face severe speech impairment, and initiatives like Project Revoice and VocaliD have demonstrated how banked voice samples can help individuals maintain their own vocal identity. High-fidelity, low-latency synthesis takes that promise even further, opening up everyday conversation, education, and employment opportunities that had once seemed impossible.

Education also benefits. Realistic storytelling and adaptive tutoring can increase engagement for language learners and those with dyslexia, and multilingual synthetic voices can deliver mother-tongue content at scale. The throughline, which Staniszewski is fond of emphasizing, is control: users must have the power to decide when, how and by whom their voice is used.

Guardrails: consent, provenance, and policy

Often, the higher the quality, the greater the risk. Voice is a biometric identifier whose misuse could lead to fraud and reputational damage. Policymakers are moving quickly. In the U.S., the Federal Communications Commission has ruled that robocalls using AI-generated voices are illegal under the Telephone Consumer Protection Act. The European Union’s AI Act classifies systems by risk and imposes documentation and transparency obligations, including for deepfake systems.

All of this leaves industry groups to steer a technically rocky course.


The Coalition for Content Provenance and Authenticity is advancing content credentials so listeners can confirm when audio is synthetic. The Partnership on AI has released guidelines on consent and authorization for synthetic media. Many labs, including ElevenLabs, have investigated watermarking and classifier-based detection to flag generated audio. To work at scale, these tools need to hold up under compression, noise, and adversarial edits.

OpenAI’s decision to restrict access to its Voice Engine, along with other closed pilots, illustrates the calculus: capability releases have to be balanced with safety tooling, user consent flows, and monitoring.

Look for Staniszewski to address consent management, licensed data, and how to respond when content is reported as abusive.

Metrics That Matter to Builders and Buyers

Beyond demos, enterprises buy on measurable performance. Key measurements include end-to-end dialogue latency (preferably below 300 ms for natural turn-taking), MOS for perceived quality, robust understanding in noisy conditions, cross-lingual consistency, and the ability to control emotion. In operational terms, what matters most is cost per generated minute, scalability under bursty demand, and whether deployment is on-device or in the cloud.
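As a rough illustration of how a buyer might check two of these numbers, the Python sketch below averages listener ratings into a MOS and compares a 95th-percentile latency against a budget. The 4.2 and 300 ms figures come from this article; the function names, the choice of the 95th percentile, and the nearest-rank method are assumptions made for the example, not an industry standard or any vendor's method.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener ratings given on the 1-5 MOS scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return mean(ratings)


def p95_latency_ms(latencies_ms):
    """Nearest-rank 95th percentile of end-to-end latencies in milliseconds."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_targets(ratings, latencies_ms, mos_floor=4.2, latency_budget_ms=300):
    """True when both the quality floor and the latency budget are met.

    The default thresholds are illustrative, taken from the figures quoted above.
    """
    return (
        mean_opinion_score(ratings) >= mos_floor
        and p95_latency_ms(latencies_ms) <= latency_budget_ms
    )


if __name__ == "__main__":
    sample_ratings = [4, 5, 4, 4]           # hypothetical listener scores
    sample_latencies = [120, 180, 250, 290]  # hypothetical turn latencies in ms
    print(mean_opinion_score(sample_ratings))
    print(p95_latency_ms(sample_latencies))
    print(meets_targets(sample_ratings, sample_latencies))
```

A tail percentile is used here instead of the average because a handful of slow turns can make a conversation feel laggy even when the mean looks healthy.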

Risk management frameworks from NIST recommend that organizations consider not only accuracy, but also resilience, security, and human oversight. For voice systems, this translates to abuse detection pipelines, auditable logs, and clear disclosure UX, particularly in customer service and political communications.

What we hear at Disrupt

Staniszewski’s perspective is distinct: ElevenLabs early on helped open up realistic TTS to creators and businesses of all kinds, and now works in a world in which quality is no longer the bottleneck. Trust is. Things to look out for include multilingual dubbing that preserves performance, enterprise-grade security features, and compensation models for voice owners whose likeness is at the heart of commercial projects.

Equally important are the seams: how voice AI stitches together with speech recognition, retrieval, and agentic planning to keep long, context-rich conversations going; how barge-in and emotion control are achieved without jarring artifacts; and how provenance signals will be embedded so that listeners can tell what’s human, what’s synthetic and when that distinction should be front and center.

The next phase of voice AI won’t be about novelty, but about reliability, ethics and scope. If the conversation at Disrupt lives up to those themes, it will provide a blueprint for how synthetic speech can extend human expression — without undermining the trust that makes voice so potent in the first place.
