Quit Emailing Yourself

# kubernetes → batching

2 links tagged with all of: kubernetes + batching

Links

Kubernetes v1.35: Introducing Workload Aware Scheduling

Kubernetes v1.35 introduces workload-aware scheduling, which improves how groups of related Pods are scheduled together. It adds a new Workload API for defining scheduling requirements and supports gang scheduling, where a group of Pods is placed all-or-nothing, to optimize resource use for large workloads. The update also includes opportunistic batching to speed up scheduling for identical Pods.

Saved by tldr-importer · Last saved February 14, 2026 · 5 min read

kubernetes · scheduling · workloads · api · batching

Large Scale Distributed LLM Inference with Kubernetes | by Kshitiz Lohia | GoPenAI

This article explains how to implement large-scale inference for language models using Kubernetes. It covers key concepts like batching strategies, performance metrics, and intelligent routing to optimize GPU usage. It also discusses practical deployment examples and challenges in managing inference.

Saved by tldr-importer · Last saved February 14, 2026 · 4 min read

kubernetes · llm · inference · batching · gpu