Quit Emailing Yourself


1 link tagged with all of: machine-learning + multimodal + speech-understanding + local-deployment

Voxtral

Voxtral Mini and Voxtral Small are two multimodal audio chat models designed to understand both spoken audio and text. They achieve state-of-the-art performance on a range of audio benchmarks while maintaining strong text capabilities, and Voxtral Small is efficient enough for local deployment. A 32K context window lets the models process lengthy audio and multi-turn conversations, and the release includes three new benchmarks for evaluating speech understanding on knowledge and trivia.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

