Quit Emailing Yourself

# gpu → cuda

3 links tagged with all of: gpu + cuda

Links

[no-title]

NVIDIA has introduced native Python support for its CUDA platform, letting developers write CUDA code directly in Python without relying on additional wrappers. This simplifies using GPUs for machine learning and scientific computing and makes them more accessible to Python users.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

cuda · nvidia · python · gpu · programming

Roadmap: Understanding GPU Architecture

This roadmap introduces GPU architecture to readers new to the technology, emphasizing how GPUs differ from CPUs. Its objectives include understanding GPU features, how those features affect the way GPGPU programs are constructed, and the specifics of NVIDIA GPU components. Familiarity with high-performance computing concepts is helpful but not required.

Saved by tldr-importer · Last saved October 29, 2025 · 1 min read

gpu · architecture · cuda · programming · high-performance-computing

We reverse-engineered Flash Attention 4

This blog post reverse-engineers Flash Attention 4 (FA4), a new CUDA kernel optimized for NVIDIA's architecture that achieves a ~20% speedup over previous versions. It walks through the kernel's architecture and asynchronous operations in terms accessible to software engineers without CUDA experience, and covers its tile-based computation and its optimizations for generative AI tasks.

Saved by tldr-importer · Last saved October 29, 2025 · 7 min read

flash-attention · cuda · gpu · neural-networks · optimization