3 min read
|
Saved February 14, 2026
Do you care about this?
ByteDance has introduced Seedance 2.0, a multimodal AI video generation tool that combines images, videos, audio, and text to create short clips with automatic sound effects. The model features a distinctive reference capability that lets users replicate camera work and effects from uploaded videos. The release comes amid intensifying competition from Kuaishou's Kling 3.0 and has boosted share prices in the Chinese media and AI sectors.
If you do, here's more
ByteDance has launched Seedance 2.0, an advanced AI video generation model that processes multiple input types—images, videos, audio, and text—simultaneously. Users can combine up to nine images, three videos, and three audio files to generate short videos of 4 to 15 seconds with automatic sound effects. A key feature of Seedance 2.0 is its reference capability, which lets it adopt camera work, movements, and effects from uploaded reference videos. Users can then enter text prompts to dictate the video's elements and style.
The model arrives as competition heats up, particularly following Kuaishou’s release of its Kling 3.0 model, which also takes a multimodal approach. The introduction of these advanced video models has moved the stock market, lifting share prices of Chinese media and AI companies by as much as 20%. Seedance 2.0 is currently in beta and available only to a limited group of users, with restrictions on realistic human faces in uploaded materials for compliance reasons.
Demo videos from ByteDance showcase the model's potential, but these examples may not reflect consistent real-world performance. Questions remain about the model's reliability, cost, and the time it takes to generate videos. Despite these uncertainties, the quality of the generated content appears strong, indicating significant progress in AI-driven video technology.