# azure → gpu

2 links tagged with all of: azure + gpu

Links

Breaking the Million-Token Barrier

Azure's ND GB300 v6 virtual machines reached a record inference throughput of 1.1 million tokens per second on the Llama 2 70B model, 27% above the previous record. The VMs feature hardware optimizations aimed at inference workloads. The results were verified by Signal65.

Saved by tldr-importer · Last saved February 14, 2026 · 4 min read

Tags: azure, performance, inference, llama2, gpu

Enterprise AKS Multi-Instance GPU (MIG) vLLM Deployment Guide | Microsoft Community Hub

A comprehensive guide to deploying AI models with vLLM on Azure Kubernetes Service (AKS) using NVIDIA H100 GPUs and Multi-Instance GPU (MIG) technology. It covers the prerequisites, infrastructure creation, GPU component installation, and model deployment. MIG's hardware isolation allows more efficient use of GPU resources and reduces cost.

Saved by tldr-importer · Last saved October 29, 2025 · 5 min read

Tags: azure, aks, gpu, vllm, deployment-guide