ElevenLabs co-founder and CEO Mati Staniszewski believes voice is emerging as the next major interface for artificial intelligence, reshaping how people interact with machines as AI systems move beyond text and screens.
Speaking at Web Summit in Doha, Staniszewski said recent advances in voice technology are changing the role of AI in everyday life. Voice models, he explained, are no longer limited to reproducing human speech with realistic emotion and intonation. Instead, they are increasingly being paired with the reasoning abilities of large language models, allowing AI systems to understand context and respond more naturally.
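In practice, this pairing usually means chaining a language model (for reasoning) with a voice model (for speech output). The sketch below shows one minimal version of that pipeline, assuming the OpenAI chat API and the public ElevenLabs text-to-speech HTTP endpoint; the voice ID and model names are illustrative placeholders, not a description of any specific product discussed at the summit.

```python
# Minimal sketch: an LLM supplies context-aware reasoning, and a voice
# model renders the reply as speech. Voice ID and model names are
# placeholders; set OPENAI_API_KEY and ELEVENLABS_API_KEY to run.
import os
import requests
from openai import OpenAI

llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def reason(user_utterance: str) -> str:
    """Ask the language model for a context-aware reply."""
    reply = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_utterance}],
    )
    return reply.choices[0].message.content

def speak(text: str, voice_id: str = "YOUR_VOICE_ID") -> bytes:
    """Render the reply as audio via the ElevenLabs TTS endpoint."""
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content  # audio bytes, ready to play

audio = speak(reason("What's on my calendar this afternoon?"))
```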
“In the years ahead, hopefully all our phones will go back in our pockets,” Staniszewski said, adding that voice could become the primary way people control technology while staying more engaged with the real world.
Voice gains momentum across the AI industry
That vision has helped fuel ElevenLabs’ rapid growth, including a recent $500 million funding round that valued the company at $11 billion. Staniszewski’s view is also gaining traction across the wider AI industry, as companies race to develop voice-first systems for future products.
OpenAI and Google have made voice a central focus of their next-generation AI models, while Apple has been quietly expanding its voice-related capabilities through acquisitions and internal development. As AI spreads into wearables, vehicles, and other connected devices, industry leaders increasingly see spoken interaction replacing keyboards and touchscreens for many everyday tasks.
At the same event, Iconiq Capital general partner Seth Pierrepont said traditional input methods are beginning to feel outdated. While screens will remain important for activities such as gaming and entertainment, he argued that voice will play a growing role as AI systems become more autonomous.
Agentic AI and privacy concerns
Staniszewski pointed to the rise of “agentic” AI as a major shift shaping voice interfaces. Rather than relying on detailed instructions, future systems will use persistent memory and accumulated context, reducing the need for users to explicitly prompt AI at every step.
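The practical difference is where context lives. A rough, self-contained sketch of that idea follows: the agent persists memory between turns and injects it into each request, so the user can speak tersely instead of re-prompting. The ask_model() stub stands in for any LLM call, and all names here are hypothetical.

```python
# Illustrative sketch of an agent with persistent memory: accumulated
# context is stored across turns and supplied automatically, so each
# new spoken request can omit details the agent already knows.
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def load_memory() -> list[str]:
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []

def save_memory(memory: list[str]) -> None:
    MEMORY_FILE.write_text(json.dumps(memory))

def ask_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"(model reply to: {prompt!r})"

def handle_turn(user_utterance: str) -> str:
    memory = load_memory()
    # Inject accumulated context so the user never has to restate it.
    prompt = "Known context:\n" + "\n".join(memory) + f"\n\nUser: {user_utterance}"
    reply = ask_model(prompt)
    memory.append(f"User said: {user_utterance}")
    save_memory(memory)
    return reply

# 'usual table' can be resolved from memory built up in earlier turns.
print(handle_turn("Book my usual table for Friday."))
```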
This evolution is also influencing how voice technology is deployed. While high-quality audio models have largely relied on cloud processing, ElevenLabs is working toward a hybrid approach that combines cloud and on-device capabilities. The goal is to support always-on voice experiences in headphones, smart glasses, and other wearables.
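One common way to structure such a hybrid system is a simple router: a small on-device model handles always-on, latency-sensitive work, while richer synthesis is deferred to the cloud when connectivity allows. The sketch below illustrates that routing pattern under those assumptions; both synthesize functions are hypothetical stand-ins, not ElevenLabs APIs.

```python
# Rough sketch of hybrid cloud/on-device routing for always-on voice.
def synthesize_on_device(text: str) -> bytes:
    """Small local model: fast and private, but limited quality."""
    return b"local-audio:" + text.encode()

def synthesize_in_cloud(text: str) -> bytes:
    """Full cloud model: highest quality, needs a network round trip."""
    return b"cloud-audio:" + text.encode()

def synthesize(text: str, online: bool, latency_budget_ms: int) -> bytes:
    # Stay on-device when offline or when the response must feel
    # instant; otherwise use the cloud model for best quality.
    if not online or latency_budget_ms < 150:
        return synthesize_on_device(text)
    return synthesize_in_cloud(text)
```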
ElevenLabs has already partnered with Meta to integrate its voice technology into products such as Instagram and Horizon Worlds, and Staniszewski said he is open to expanding collaboration into new hardware formats, including smart glasses.
However, as voice-based AI becomes more persistent and more deeply embedded in daily life, it raises concerns around privacy, surveillance, and the collection of personal data. As companies push voice closer to users, how that data is stored and used is likely to become a central issue in the next phase of AI adoption.