Tekmono
Mistral Launches Voxtral Open-Source Speech Models

by Tekmono Editorial Team
16/07/2025

Mistral has launched Voxtral, a family of open-source speech understanding models that aims to make voice interfaces more reliable and accessible. The models are released under the Apache 2.0 license.

The models come in two sizes, 24B and 3B parameters, and pair transcription with built-in audio understanding, addressing limitations of current proprietary and open-source systems. Mistral positions Voxtral between high-cost, closed APIs and less accurate open-source alternatives, claiming state-of-the-art accuracy and native semantic understanding at less than half the price of comparable APIs. With a 32k-token context length, the models handle long-form audio of up to 30 minutes for transcription and 40 minutes for understanding. They also include built-in Q&A and summarization, automatic language detection for widely used languages such as English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian, and direct function calling from voice commands.
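For recordings longer than the 30-minute transcription limit, one option is to split the audio into consecutive segments before sending it for transcription. The sketch below only computes segment boundaries. The function name `plan_segments` and the fixed-length cutting strategy are illustrative assumptions, not part of any Voxtral tooling, and cutting at fixed offsets can split a word in two.

```python
# Transcription limit quoted in the article: 30 minutes of audio.
MAX_TRANSCRIPTION_SECONDS = 30 * 60


def plan_segments(total_seconds: int,
                  max_seconds: int = MAX_TRANSCRIPTION_SECONDS) -> list[tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole recording.

    Hypothetical helper: cuts at fixed offsets, with no overlap and no
    silence detection.
    """
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds > 0")
    return [
        (start, min(start + max_seconds, total_seconds))
        for start in range(0, total_seconds, max_seconds)
    ]


# A 75-minute recording needs three segments: 30 + 30 + 15 minutes.
segments = plan_segments(75 * 60)
```

In practice, cutting at pauses or overlapping segments slightly would likely give cleaner transcripts at the boundaries.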

In benchmarks, Voxtral significantly outperforms leading open-source models like Whisper large-v3 and competes strongly with GPT-4o mini Transcribe and Gemini 2.5 Flash in speech transcription and audio understanding. For instance, Voxtral Mini Transcribe is more cost-effective than OpenAI Whisper, while Voxtral Small matches ElevenLabs Scribe’s performance at a lower price point. The models also retain strong text understanding capabilities from their Mistral Small 3.1 backbone.

Voxtral models are available for local download on Hugging Face and via API, with pricing starting at $0.001 per minute. Enterprise features include private deployment, domain-specific fine-tuning, and advanced context capabilities such as speaker identification and emotion detection. Planned updates include speaker segmentation, audio markups, and word-level timestamps.
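To put the starting price in perspective, the arithmetic can be sketched as follows. This assumes billing is linear in audio duration with no minimum charge or rounding to whole minutes, which the article does not confirm. The helper name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Starting price quoted in the article; actual pricing may differ by model.
PRICE_PER_MINUTE_USD = Decimal("0.001")


def estimate_cost_usd(duration_seconds: int) -> Decimal:
    """Estimate API cost in USD, assuming linear per-minute billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be >= 0")
    minutes = Decimal(duration_seconds) / Decimal(60)
    # Round to a hundredth of a cent for display.
    return (minutes * PRICE_PER_MINUTE_USD).quantize(Decimal("0.0001"))


# One hour of audio at the starting price.
one_hour = estimate_cost_usd(60 * 60)  # Decimal("0.0600")
```

Under these assumptions, an hour of audio costs about six cents and 1,000 hours about $60.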

Tekmono is a Linkmedya brand. © 2015.
