PixelFlow introduces an approach to image generation that operates directly in raw pixel space, eliminating the need for a pre-trained Variational Autoencoder. By handling generation across multiple resolutions with efficient cascade flow modeling, it achieves a competitive FID of 1.98 on class-conditional ImageNet 256×256 while producing high-quality, semantically controllable images. The authors hope the work will inspire future developments in pixel-space visual generation models.
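As a rough illustration of the core idea, flow matching applied to pixels rather than VAE latents, consider the minimal training-step sketch below; the `model` interface, the `stage` argument, and the linear noise-to-image path are simplifying assumptions, not PixelFlow's exact formulation.

```python
import torch
import torch.nn.functional as F

def pixel_flow_matching_step(model, x1, stage):
    """One flow-matching training step directly on pixels (illustrative sketch).

    x1: clean images at the current cascade stage's resolution, scaled to [-1, 1].
    `model` and `stage` are placeholders; PixelFlow's architecture and
    resolution schedule differ from this simplified version.
    """
    b = x1.shape[0]
    t = torch.rand(b, device=x1.device).view(b, 1, 1, 1)  # random time in [0, 1]
    x0 = torch.randn_like(x1)                              # pixel-space Gaussian noise
    xt = (1.0 - t) * x0 + t * x1                           # point on the noise-to-image path
    v_target = x1 - x0                                     # velocity target for a linear path
    v_pred = model(xt, t.flatten(), stage)                 # model predicts the velocity field
    return F.mse_loss(v_pred, v_target)
```

In the cascade, each stage would run a step like this at its own resolution, so no latent encoder or decoder is needed anywhere in the pipeline.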
The paper presents BLIP3-o, a family of fully open unified multimodal models covering both image understanding and generation. It introduces a diffusion transformer for generating CLIP image features, advocates a sequential pretraining strategy, and contributes BLIP3o-60k, a high-quality dataset that improves performance across a range of benchmarks. The models, along with code and datasets, are open-sourced to foster further research.
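To illustrate what "a diffusion transformer for generating CLIP image features" can look like in code, the sketch below uses a flow-matching-style objective over CLIP feature tokens; the `dit` interface, tensor shapes, and choice of objective are assumptions for illustration rather than BLIP3-o's exact recipe, and decoding the predicted features back to pixels is omitted.

```python
import torch
import torch.nn.functional as F

def clip_feature_generation_loss(dit, clip_feats, cond_tokens):
    """Illustrative loss for a diffusion transformer denoising CLIP image
    features conditioned on the multimodal backbone's output tokens.

    clip_feats:  (B, N, D) CLIP patch features of the target image.
    cond_tokens: (B, M, D) conditioning tokens from the autoregressive model.
    """
    b = clip_feats.shape[0]
    t = torch.rand(b, device=clip_feats.device).view(b, 1, 1)  # time in [0, 1]
    noise = torch.randn_like(clip_feats)
    xt = (1.0 - t) * noise + t * clip_feats        # interpolate noise -> CLIP features
    v_target = clip_feats - noise                  # velocity target
    v_pred = dit(xt, t.flatten(), cond_tokens)     # DiT predicts velocity given conditioning
    return F.mse_loss(v_pred, v_target)
```

The key design point is that the generation target is a compact semantic feature space (CLIP) rather than raw pixels or VAE latents, which keeps the diffusion transformer small relative to pixel-space generators.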