Voxtral

1 min read | Saved October 29, 2025

audio-chat | multimodal | speech-understanding | machine-learning | local-deployment

Do you care about this?

Voxtral Mini and Voxtral Small are two multimodal audio chat models trained to understand both spoken audio and text. They achieve state-of-the-art performance across a range of audio benchmarks while preserving strong text capabilities, and Voxtral Small is small enough to run locally. A 32K context window lets the models handle long audio files and long multi-turn conversations. The release also contributes three new benchmarks for evaluating speech understanding models on knowledge and trivia.
