Higgs Audio v2 is an open-source model for expressive audio generation. It was trained on a vast dataset and, without any post-training or fine-tuning, performs strongly across several benchmarks. Its distinctive capabilities include multilingual dialogue generation and simultaneous generation of speech and music. For advanced usage, it also provides an OpenAI-compatible API server.
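As a rough illustration of what "OpenAI-compatible" means in practice, the sketch below builds a request in the shape of OpenAI's `/audio/speech` endpoint using only the Python standard library. The base URL, model identifier, and voice name are placeholders assumed for illustration, not values taken from the Higgs Audio documentation. Which endpoints the server actually exposes should be checked against the project's own instructions.

```python
import json
import urllib.request

# Assumed placeholders for illustration only; replace with the values
# reported by the server you actually start.
BASE_URL = "http://localhost:8000/v1"  # hypothetical local address
MODEL_NAME = "higgs-audio-v2"          # hypothetical model identifier


def build_speech_request(text, voice="default", base_url=BASE_URL, model=MODEL_NAME):
    """Build (but do not send) a POST request shaped like OpenAI's /audio/speech call."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,  # hypothetical voice name
        "response_format": "wav",
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.wav", **kwargs):
    """Send the request to a running server and write the returned audio bytes to disk."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return out_path
```

With a server running, `synthesize("Hello there.")` would write the response body to `speech.wav`. Because the request follows the OpenAI wire format, an existing OpenAI client pointed at the local base URL should work in the same way.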
Chatterbox Multilingual is Resemble AI's open-source text-to-speech (TTS) model. It supports 23 languages, zero-shot voice cloning, and a control for exaggerating emotion. It has been benchmarked against leading systems and offers ultra-low latency, which makes it suitable for production use. The model is available for installation, and it watermarks the audio files it generates.
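A minimal sketch of how the language, emotion exaggeration, and voice cloning options might be combined in one call. The import path `chatterbox.mtl_tts`, the class `ChatterboxMultilingualTTS`, and the `generate` parameters `language_id`, `exaggeration`, and `audio_prompt_path` are assumptions based on the project's published usage example and should be verified against the installed version. The helper names and the input checks are hypothetical.

```python
def build_generate_kwargs(language_id, exaggeration=0.5, audio_prompt_path=None):
    """Collect generation options into keyword arguments (hypothetical helper).

    The checks below are illustrative sanity checks, not limits documented
    by the library.
    """
    if not isinstance(language_id, str) or not language_id:
        raise ValueError("language_id must be a non-empty language code, e.g. 'fr'")
    if exaggeration < 0:
        raise ValueError("exaggeration must be non-negative")
    kwargs = {"language_id": language_id, "exaggeration": exaggeration}
    if audio_prompt_path is not None:
        # Reference clip for zero-shot voice cloning.
        kwargs["audio_prompt_path"] = audio_prompt_path
    return kwargs


def synthesize(text, out_path, device="cpu", **options):
    """Generate speech and save it; requires the chatterbox package and torchaudio."""
    # Imported lazily so the helper above works without the model installed.
    import torchaudio
    from chatterbox.mtl_tts import ChatterboxMultilingualTTS  # assumed import path

    model = ChatterboxMultilingualTTS.from_pretrained(device=device)
    wav = model.generate(text, **build_generate_kwargs(**options))
    torchaudio.save(out_path, wav, model.sr)
    return out_path
```

For example, `synthesize("Bonjour tout le monde.", "out.wav", language_id="fr", exaggeration=0.7, audio_prompt_path="reference.wav")` would request French speech with heightened emotion in the voice of the reference clip, assuming the API matches the names above.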