ElevenLabs, a New York City-based AI startup, has significantly expanded the language capabilities of its AI text-to-speech model, now supporting 70 languages and making it accessible to approximately 90% of the global population.
The expansion applies to Eleven V3 (alpha), launched on June 8 as the company’s “most expressive TTS model.” The company announced the update on its official X (formerly Twitter) account, stating that the addition of 41 new languages broadens the model’s utility for content creators and businesses.
The newly supported languages include Arabic, Assamese, Bengali, Bulgarian, Catalan, Gujarati, Latvian, Malay, Malayalam, Marathi, Nepali, Swahili, Tamil, and Telugu. To generate speech in these languages, users are advised to record an Instant Voice Clone (IVC) and select the desired language. ElevenLabs also plans to add Voice Library voices for the newly supported languages in the coming weeks.
Eleven V3 builds upon the foundation of the multilingual V2 and V2.5 TTS models, with key features including support for inline audio tags such as “whispers,” “excited,” and “sighs.” These tags allow users to infuse emotional nuances and non-verbal cues into the generated audio, resulting in a more dramatic and engaging delivery.
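Audio tags of this kind are typically written inline, directly in the text submitted to the model. A minimal Python sketch of assembling such a prompt follows; the bracket syntax and the `tag` helper are illustrative assumptions, not an official ElevenLabs SDK call:

```python
# Illustrative only: embed inline audio tags in a TTS prompt.
# The "[tag] text" format is an assumption about the input convention.

def tag(name: str, text: str) -> str:
    """Prefix a line of text with an inline audio tag."""
    return f"[{name}] {text}"

prompt = "\n".join([
    tag("whispers", "Did you hear that?"),
    tag("excited", "We actually did it!"),
    tag("sighs", "Fine. One more take."),
])

print(prompt)
```

The resulting string would then be sent to the model as ordinary input text, with the tags steering delivery rather than being read aloud.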
The model also supports multi-speaker interactions, complete with interruptions, natural pacing, and overlapping dialogues, creating a more realistic conversational experience. ElevenLabs emphasizes that Eleven V3 demonstrates improved handling of elements like stress, cadence, and contextual awareness.
The Eleven V3 model is currently accessible through the company’s website and mobile apps, although it is not yet available as an application programming interface (API). This update follows the introduction of Agent Transfer in April, an enterprise-focused agentic feature designed for Conversational AI, enabling two AI agents to communicate with each other and hand off conversations to a more specialized agent.
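Agent Transfer, as described, lets a conversation be handed off to a more specialized agent. A generic sketch of that routing pattern is below; this is plain Python, not the ElevenLabs Conversational AI API, and all class and trigger names are illustrative:

```python
# Generic hand-off pattern: a router passes the conversation to the first
# specialist whose trigger keywords match, else keeps the default agent.
# Names are illustrative, not ElevenLabs APIs.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    # Keywords signaling this specialist should take over.
    triggers: tuple[str, ...] = ()

    def handles(self, message: str) -> bool:
        return any(t in message.lower() for t in self.triggers)

@dataclass
class Router:
    default: Agent
    specialists: list[Agent] = field(default_factory=list)

    def route(self, message: str) -> Agent:
        for agent in self.specialists:
            if agent.handles(message):
                return agent  # hand off to the specialist
        return self.default

billing = Agent("billing", triggers=("invoice", "refund"))
support = Agent("general-support")
router = Router(default=support, specialists=[billing])

print(router.route("I need a refund for my last invoice").name)  # -> billing
```

In a production system the handoff would also carry conversation state to the receiving agent, which is what makes the feature useful for enterprise support flows.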