GitHub - facebookresearch/zero: PyTorch Implementation of Zero-Shot Vision Encoder Grafting via LLM Surrogates [ICCV'25]

The repository provides a PyTorch codebase for training language and vision-language models, covering installation, model inference, and training setup. It lists the required dependencies and configuration paths, explains how to integrate new datasets and models, and describes how to use GPU resources for training and evaluation.

Saved by tldr-importer · Last saved October 29, 2025 · 5 min read

Tags: pytorch, vision-language, model-training, inference, evaluation

Disaggregated Inference at Scale with PyTorch & vLLM

PyTorch and vLLM have been integrated to support prefill/decode disaggregation, which serves the prefill and decode phases of generative AI inference separately to improve efficiency at scale. In Meta's internal inference stack, this allows the prefill and decode processes to scale independently. Key optimizations include improved KV cache transfer and load balancing, which reduce latency and increase throughput.

Saved by tldr-importer · Last saved October 29, 2025 · 6 min read

Tags: pytorch, vllm, generative-ai, inference, optimization
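
To make the prefill/decode split in the summary above concrete, here is a toy sketch in plain Python. It is not vLLM's or Meta's implementation and uses no vLLM API: the `Worker`, `KVCache`, `least_loaded`, `prefill`, `decode`, and `serve` names are all hypothetical, the "cache" is just a list of token ids, and the "model" is a placeholder. It only illustrates the shape of the idea: two independently sized worker pools, a cache handed from one stage to the other, and a simple least-loaded balancing rule.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    load: int = 0  # tokens of work assigned so far (toy load metric)


@dataclass
class KVCache:
    # Stand-in for real key/value tensors: here, just the token ids seen so far.
    tokens: list = field(default_factory=list)


def least_loaded(pool):
    # Toy load balancing: pick the worker with the least assigned work.
    return min(pool, key=lambda w: w.load)


def prefill(worker, prompt_tokens):
    # Prefill handles the whole prompt in one pass and builds the cache.
    worker.load += len(prompt_tokens)
    return KVCache(tokens=list(prompt_tokens))


def decode(worker, cache, max_new_tokens):
    # Decode extends the transferred cache one token at a time.
    out = []
    for _ in range(max_new_tokens):
        next_token = len(cache.tokens)  # placeholder for a real model call
        cache.tokens.append(next_token)
        out.append(next_token)
        worker.load += 1
    return out


def serve(prompt_tokens, prefill_pool, decode_pool, max_new_tokens=4):
    p = least_loaded(prefill_pool)
    cache = prefill(p, prompt_tokens)  # stage 1 runs on a prefill worker
    d = least_loaded(decode_pool)      # cache is handed to a decode worker
    return p.name, d.name, decode(d, cache, max_new_tokens)


# The two pools are sized independently of each other.
prefill_pool = [Worker("prefill-0")]
decode_pool = [Worker("decode-0"), Worker("decode-1")]

first = serve([10, 11, 12], prefill_pool, decode_pool)
second = serve([20, 21], prefill_pool, decode_pool)
```

In this sketch the second request lands on a different decode worker than the first because the first decode worker already carries load, while both requests share the single prefill worker. Growing either list changes the capacity of that stage without touching the other.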
