2 links tagged with all of: video-generation + transformers + machine-learning
Links
STARFlow and STARFlow-V are open-source models for generating high-quality images and videos from text prompts. They combine autoregressive models with normalizing flows and cover both text-to-image and text-to-video tasks. The release provides scripts and configurations for setting up the models and generating content.
Test-Time Training (TTT) layers enhance pre-trained Transformers' ability to generate one-minute videos from text narratives, yielding improved coherence and aesthetics compared to existing methods. In tests on a dataset of Tom and Jerry cartoons, TTT-MLP improves temporal consistency and motion smoothness, although the current implementation still shows notable artifacts and limitations. Future work aims to extend this approach to longer videos and more complex storytelling.