5 links
tagged with all of: machine-learning + image-generation
Links
FLUX.1 Kontext [pro] is an image generation and editing model that emphasizes prompt adherence. The linked page includes API usage examples covering image generation, chat completions, and audio processing, although the model is currently unsupported on Together AI.
REPA-E introduces a family of end-to-end tuned Variational Autoencoders (E2E-VAEs) that improve text-to-image (T2I) generation quality and training efficiency. The method makes joint training of VAEs and diffusion models effective, achieves state-of-the-art performance on ImageNet, and improves latent space structure across several VAE architectures. Reported results show faster gains in generation performance and better image quality, positioning E2E-VAEs as replacements for traditional VAEs.
Representation Autoencoders (RAEs) pair pretrained encoders with lightweight decoders to improve diffusion transformers, outperforming traditional choices such as SD-VAE on image generation. The study finds that RAEs reconstruct images with high quality and that, for best results, the diffusion model's width must match or exceed the encoder's token dimension. The proposed DiTDH model is reported to be both efficient and effective, setting new state-of-the-art scores on image generation tasks.
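The width condition in the summary above can be written as a one-line check. This is only a minimal sketch: the function name `width_is_sufficient` and the sizes in `candidate_widths` are hypothetical values chosen for illustration, not figures from the linked study.

```python
def width_is_sufficient(model_width: int, encoder_token_dim: int) -> bool:
    """Return True when the diffusion model is at least as wide as one encoder token."""
    return model_width >= encoder_token_dim


# Hypothetical sizes, chosen only to illustrate the comparison.
encoder_token_dim = 768
candidate_widths = {"narrow": 384, "matched": 768, "wide": 1152}

# Map each candidate name to whether it satisfies the width condition.
report = {
    name: width_is_sufficient(width, encoder_token_dim)
    for name, width in candidate_widths.items()
}
```

Under these assumed sizes, only the "narrow" candidate fails the condition.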
GigaTok is a method for scaling visual tokenizers to 3 billion parameters, using semantic regularization to address the trade-off between reconstruction and generation quality. The repository provides a framework for training and evaluating tokenizers, along with several model configurations and instructions for setup and usage. Additional resources from the project are linked for further exploration.
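One common way to express a semantic regularizer is as an extra loss term that pulls the tokenizer's features toward those of a frozen pretrained encoder. The sketch below assumes a cosine-distance penalty added to a reconstruction loss. The function names, the `weight` parameter, and the choice of cosine distance are assumptions for illustration and are not taken from the GigaTok code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_reg_loss(tokenizer_feats, reference_feats):
    """Mean (1 - cosine similarity) over paired feature vectors."""
    pairs = list(zip(tokenizer_feats, reference_feats))
    return sum(1.0 - cosine_similarity(a, b) for a, b in pairs) / len(pairs)


def total_loss(recon_loss, tokenizer_feats, reference_feats, weight=0.5):
    """Reconstruction loss plus a weighted semantic penalty (weight is hypothetical)."""
    return recon_loss + weight * semantic_reg_loss(tokenizer_feats, reference_feats)
```

When the two feature sets point in the same direction the penalty is zero, so the term only affects training where the tokenizer's features drift away from the reference encoder's.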
HiDream-I1 is an open-source image generation foundation model with 17 billion parameters that produces high-quality images in seconds. Recent updates include the release of several model variants and integrations with popular platforms, making it easier for developers and users to adopt. Additional resources and demos are linked in the article.