3 links tagged with all of: nvidia + machine-learning
Links
The article presents a collection of Foundation Vision Models developed by NVIDIA that combine models such as CLIP, DINOv2, and SAM for enhanced image feature extraction. Several versions are listed with their sizes and update status, indicating ongoing development and improvement.
An optimized Triton BF16 Grouped GEMM kernel is presented, achieving up to a 2.62x speedup over a manual PyTorch implementation for Mixture-of-Experts (MoE) models such as DeepSeek-V3 on NVIDIA H100 GPUs. The article details several optimization techniques, including a persistent kernel design, grouped launch ordering for better cache reuse, and efficient use of the Tensor Memory Accelerator (TMA) for expert weights. End-to-end benchmarking results demonstrate significant improvements in training throughput.
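To make the baseline concrete, here is a minimal sketch of the kind of per-expert PyTorch loop that a fused grouped GEMM kernel replaces with a single launch. The function name, shapes, and routing scheme are illustrative assumptions, not taken from the article.

```python
import torch

def grouped_gemm_reference(tokens, expert_ids, expert_weights):
    """Loop-based grouped GEMM baseline: one matmul per expert over the
    tokens routed to it (illustrative sketch, not the article's code).

    tokens:         (num_tokens, hidden)            BF16 activations
    expert_ids:     (num_tokens,)                   expert index per token
    expert_weights: (num_experts, hidden, ffn_dim)  one weight matrix per expert
    """
    out = tokens.new_empty(tokens.shape[0], expert_weights.shape[-1])
    for e in range(expert_weights.shape[0]):
        mask = expert_ids == e            # tokens routed to expert e
        if mask.any():
            out[mask] = tokens[mask] @ expert_weights[e]
    return out

# Toy shapes on a CUDA device (assumed available); real MoE layers are far larger.
tokens = torch.randn(1024, 512, dtype=torch.bfloat16, device="cuda")
expert_ids = torch.randint(0, 8, (1024,), device="cuda")
weights = torch.randn(8, 512, 1024, dtype=torch.bfloat16, device="cuda")
y = grouped_gemm_reference(tokens, expert_ids, weights)
```

A fused grouped kernel performs all of these per-expert matmuls in one persistent launch instead of eight separate ones, which is where the reported speedup comes from.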
Researchers have demonstrated a Rowhammer attack against the GDDR6 memory of an NVIDIA A6000 GPU, showing that a single bit flip can drop the accuracy of deep neural network models from 80% to 0.1%. NVIDIA has acknowledged the findings and suggests enabling error-correcting code (ECC) as a mitigation, although doing so costs some performance and memory capacity. The researchers have also created a dedicated website for their proof-of-concept code and shared their detailed findings in a published paper.
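To illustrate why one flipped bit can be so damaging, the sketch below (not the researchers' proof-of-concept) flips the most significant exponent bit of a single FP16 weight in PyTorch; the function name and toy tensor are made up for the example.

```python
import torch

def flip_weight_bit(weights, flat_index, bit):
    """Flip a single bit of one FP16 weight in place.
    Bit 14 is the top exponent bit, so flipping it can turn a
    tiny weight into an enormous one (or vice versa)."""
    ints = weights.view(-1).view(torch.int16)   # reinterpret the raw bits
    ints[flat_index] ^= (1 << bit)

w = torch.full((4,), 0.01, dtype=torch.float16)
print(w)                   # all weights are ~0.01
flip_weight_bit(w, 0, 14)  # flip the most significant exponent bit of w[0]
print(w)                   # w[0] has grown by several orders of magnitude
```

A single corrupted weight of that magnitude can dominate a layer's output, which is consistent with the kind of accuracy collapse the researchers report.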