Links
ElevenLabs introduced Scribe v2 Realtime, a speech-to-text model that transcribes live speech with latency under 150 ms. It supports multiple languages and includes automatic language detection and voice activity detection, which makes it suitable for voice agents and real-time captioning. The model achieves 93.5% accuracy across various languages and is available through the ElevenLabs API.
The Live API lets developers build low-latency applications that process streaming audio, video, and text. Recent updates add longer session support, session resumption, and expanded language options, making it suitable for real-time applications such as customer support and educational tools. Example uses include voice-based games and AI assistants for truck drivers.