
# machine-learning → multimodal → speech-understanding

1 link tagged with all of: machine-learning + multimodal + speech-understanding


Links

Voxtral

Voxtral Mini and Voxtral Small are two multimodal audio chat models that understand both spoken audio and text. They achieve state-of-the-art performance on a range of audio benchmarks while retaining strong text capabilities, and Voxtral Small is small enough to run locally. A 32K context window lets the models process long audio and multi-turn conversations. The release also introduces three new benchmarks for evaluating speech understanding on knowledge and trivia.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: machine-learning, multimodal, speech-understanding · Related tags: audio-chat, local-deployment