2 links tagged with all of: machine-learning + deep-learning + video-generation
Links
Saber is a zero-shot framework for reference-to-video generation that relies solely on video-text pairs instead of costly reference image-video-text triplets. It uses masked training with dynamic substitutes to enhance subject integration and generalization across diverse scenarios. The model shows improved performance in generating videos that maintain subject identity while following text prompts.
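As a rough illustration of how a reference image might be derived from a video-text pair alone, the sketch below picks a random frame from a clip and keeps only a random rectangular region of it as a stand-in reference. This is a toy under stated assumptions, not Saber's actual procedure: the function name `make_reference_substitute`, the rectangular mask, and the nested-list frame representation are all hypothetical choices made for illustration.

```python
import random
from typing import List, Tuple

Frame = List[List[int]]  # toy grayscale frame: rows of pixel values


def make_reference_substitute(video: List[Frame], rng: random.Random) -> Tuple[int, Frame]:
    """Pick one frame and mask everything outside a random rectangle.

    Hypothetical helper: the masked frame plays the role of a reference
    image, so no separately collected reference image is needed.
    """
    idx = rng.randrange(len(video))          # which frame to borrow from
    frame = video[idx]
    h, w = len(frame), len(frame[0])

    # Random non-empty rectangle [top, bottom) x [left, right).
    top = rng.randrange(h)
    left = rng.randrange(w)
    bottom = rng.randrange(top + 1, h + 1)
    right = rng.randrange(left + 1, w + 1)

    # Keep pixels inside the rectangle, zero out the rest.
    masked = [
        [frame[y][x] if top <= y < bottom and left <= x < right else 0
         for x in range(w)]
        for y in range(h)
    ]
    return idx, masked


# Toy clip: 5 frames of 3x4 pixels, frame t filled with the value t + 1.
video = [[[t + 1] * 4 for _ in range(3)] for t in range(5)]
idx, ref = make_reference_substitute(video, random.Random(0))
```

Because the rectangle is resampled on every call, each training step would see a different substitute for the same clip, which is one plausible reading of "dynamic" in the summary above.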
MAGI-1 is an autoregressive video generation model that creates videos by predicting sequences of fixed-length video chunks, achieving high temporal consistency and scalability. It incorporates innovations such as a transformer-based variational autoencoder and a unique denoising algorithm, enabling efficient and controllable video generation from text or images. The model has shown state-of-the-art performance in both instruction following and physical behavior prediction compared to existing models.
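The chunk-by-chunk generation loop described for MAGI-1 can be sketched in a few lines. This is a minimal illustration, not MAGI-1's implementation: `generate_video`, `toy_predictor`, and the list-based frame type are hypothetical, and the real model's denoising step is replaced here by a trivial placeholder predictor.

```python
from typing import Callable, List

Frame = List[float]   # toy stand-in for one video frame
Chunk = List[Frame]   # a fixed-length run of consecutive frames


def generate_video(
    num_chunks: int,
    chunk_len: int,
    predict_chunk: Callable[[List[Chunk], int], Chunk],
) -> List[Frame]:
    """Autoregressively build a video one fixed-length chunk at a time.

    Each new chunk is predicted from all previously generated chunks,
    then appended to the history before the next prediction.
    """
    history: List[Chunk] = []
    for _ in range(num_chunks):
        chunk = predict_chunk(history, chunk_len)
        if len(chunk) != chunk_len:
            raise ValueError("predictor must return exactly chunk_len frames")
        history.append(chunk)
    # Flatten the chunks into a single frame sequence.
    return [frame for chunk in history for frame in chunk]


def toy_predictor(history: List[Chunk], chunk_len: int) -> Chunk:
    """Placeholder model: continues a frame counter from the history."""
    start = sum(len(chunk) for chunk in history)
    return [[float(start + i)] for i in range(chunk_len)]


video = generate_video(num_chunks=3, chunk_len=4, predict_chunk=toy_predictor)
```

Here `video` holds 12 frames. Since each chunk depends only on earlier chunks, the loop can in principle be extended to longer videos by raising `num_chunks`.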