3 links tagged with all of: transcription + speech-to-text
Links
ElevenLabs introduced Scribe v2 Realtime, a speech-to-text model that transcribes live speech with latency under 150 ms. It supports multiple languages and includes automatic language detection and voice activity detection, making it suitable for voice agents and real-time captioning. The model achieves 93.5% accuracy across the languages it supports and is available through the ElevenLabs API.
Mistral AI has released two new Voxtral speech-to-text models: Voxtral Mini Transcribe V2 for batch processing and Voxtral Realtime for live applications. Both models support 13 languages, offer high accuracy, and are designed for efficiency in use cases such as meeting transcription and voice applications.
Handy is a free, open-source speech-to-text application that works offline and prioritizes user privacy. Built with Tauri, it lets users dictate directly into any text field using configurable keyboard shortcuts, without sending audio to the cloud. The application supports several transcription models and is designed to be extended by the community.