Quit Emailing Yourself

6 links tagged with all of: computer-vision + machine-learning

Links

GitHub - facebookresearch/ShapeR: Code for the ShapeR research paper

ShapeR offers a method for generating 3D shapes from image sequences. It processes input images to extract relevant data, then uses a transformer model to create a mesh representation of each object in the scene. The project includes tools for setup, data exploration, and evaluation.

Saved by tldr-importer · Last saved February 14, 2026 · 2 min read

Tags: 3d-modeling, shape-generation, computer-vision, machine-learning, dataset

ocr models explained 🏙️

The article explains how optical character recognition (OCR) models, like deepseek-ocr, process images of text into machine-readable formats. It details the roles of the encoder and decoder in transforming visual data into structured text while highlighting the advancements in learning techniques that reduce the need for manual coding.

Saved by tldr-importer · Last saved February 14, 2026 · 3 min read

Tags: ocr, ai, machine-learning, computer-vision, deep-learning

GitHub - ByteDance-Seed/Depth-Anything-3: Depth Anything 3

Depth Anything 3 (DA3) is a model for depth estimation and 3D geometry recovery from arbitrary visual inputs, with or without known camera poses. It uses a single transformer backbone and a depth-ray representation, and it outperforms previous models in both monocular and multi-view scenarios. Specialized models within the DA3 series cater to different depth estimation tasks.

Saved by tldr-importer · Last saved February 14, 2026 · 6 min read

Tags: depth-estimation, 3d-geometry, computer-vision, machine-learning, github

GitHub - facebookresearch/pippo: Pippo: High-Resolution Multi-View Humans from a Single Image

Pippo is a generative model that creates high-resolution, dense turnaround videos of a person from a single casual photograph. It uses a multi-view diffusion transformer and requires no additional inputs. The codebase includes training configurations for various resolutions, sample training code, and methods for preparing custom datasets. Future updates are planned to enhance the functionality and usability of the model.

Saved by tldr-importer · Last saved October 29, 2025 · 2 min read

Tags: generative-model, video-synthesis, machine-learning, diffusion-transformer, computer-vision

[no-title]

The article discusses advancements in computer vision technology, focusing on its applications in various industries, such as healthcare and automotive. It highlights the importance of machine learning and artificial intelligence in enhancing the accuracy and efficiency of visual recognition systems. The potential future developments in this field are also explored, emphasizing the transformative impact on society.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: computer-vision, machine-learning, artificial-intelligence, technology, applications

[no-title]

The article discusses advancements in image segmentation techniques, particularly focusing on the Gemini model and its implications for various applications in computer vision. It highlights the improvements in accuracy and efficiency over previous models, as well as the potential for broader use in sectors such as healthcare and autonomous vehicles.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

Tags: image-segmentation, gemini, computer-vision, machine-learning, healthcare