35 links tagged with pytorch
Links
Researchers demonstrated the use of torchft and torchtitan for training a model under extreme synthetic failure rates, achieving fault tolerance without relying on checkpoints. By employing a novel asynchronous weight transfer method, they successfully isolated failures and maintained training continuity across multiple GPU groups.
ConceptAttention is an interpretability method designed for multi-modal diffusion transformers, specifically implemented for the Flux DiT architecture using PyTorch. The article provides installation instructions and a code example for generating images and concept attention heatmaps. It also references the associated research paper for further details.
PyTorch Distributed Checkpointing (DCP) offers a customizable solution for managing model checkpoints in distributed training, allowing significant reductions in storage size through compression techniques. By implementing the zstd compression algorithm, the team achieved a 22% decrease in checkpoint sizes while optimizing performance with multi-threading. The article details the customization process and encourages developers to explore DCP's extensibility for improved efficiency in their workflows.
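As a rough illustration of the idea (not DCP's actual extension API, which the article covers), the core move is compressing serialized checkpoint bytes with zstd via the `zstandard` package:

```python
import io

import torch
import zstandard as zstd  # pip install zstandard

def save_compressed(state_dict, path, level=3):
    # serialize to memory, then zstd-compress the byte stream before writing
    buf = io.BytesIO()
    torch.save(state_dict, buf)
    data = zstd.ZstdCompressor(level=level).compress(buf.getvalue())
    with open(path, "wb") as f:
        f.write(data)

def load_compressed(path):
    # reverse the pipeline: decompress, then deserialize
    with open(path, "rb") as f:
        raw = zstd.ZstdDecompressor().decompress(f.read())
    return torch.load(io.BytesIO(raw))
```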
PyTorch Day France on May 7 in Paris marks the inaugural event in a new international series aimed at showcasing advancements in open source AI and fostering community collaboration. Attendees will hear from industry leaders and participate in technical sessions covering a range of AI topics, alongside the GOSIM AI Paris event. Registration is free with a special code for access to all sessions.
Miloš Švaňa discusses the difficulties of setting up a PyTorch project that functions across various operating systems and hardware accelerators. He explores solutions using PEP 508 for dependency management and ultimately decides to switch from PyTorch to ONNX Runtime for easier installation and better compatibility with PyPI.
Learn how to build and deploy custom CUDA kernels using the kernel-builder library, which streamlines the development process and ensures scalability and efficiency. The guide walks through creating a practical RGB to grayscale image conversion kernel with PyTorch, covering project structure, CUDA coding, and registration as a native PyTorch operator. It also discusses reproducibility, testing, and sharing the kernel with the community.
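For reference, the math such a kernel computes is the standard BT.601 luma weighting; a pure-PyTorch baseline like the sketch below (not the kernel-builder code itself) is handy for testing a custom CUDA kernel against:

```python
import torch

def rgb_to_grayscale(img: torch.Tensor) -> torch.Tensor:
    # img: (..., 3, H, W); weighted sum over the channel dimension
    weights = torch.tensor([0.299, 0.587, 0.114], device=img.device, dtype=img.dtype)
    return (img * weights.view(-1, 1, 1)).sum(dim=-3, keepdim=True)

x = torch.rand(1, 3, 64, 64, device="cuda" if torch.cuda.is_available() else "cpu")
gray = rgb_to_grayscale(x)  # (1, 1, 64, 64)
```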
Modern techniques have emerged since the original "Attention Is All You Need" paper to optimize transformer architectures, focusing on reducing memory usage and computational costs during inference. Key advancements include Group Query Attention, Multi-head Latent Attention, and various architectural innovations that enhance performance without significantly compromising quality. These methods aim to improve the efficiency of large models in practical applications.
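A minimal sketch of Group Query Attention (illustrative, not taken from the article): the model keeps fewer KV heads than query heads and shares each KV head across a group of queries, shrinking the KV cache:

```python
import torch
import torch.nn.functional as F

def gqa(q, k, v):
    # q: (B, n_heads, T, D); k, v: (B, n_kv_heads, T, D) with n_kv_heads < n_heads
    groups = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(groups, dim=1)  # share each KV head across its query group
    v = v.repeat_interleave(groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

B, T, D = 2, 16, 64
out = gqa(torch.randn(B, 8, T, D), torch.randn(B, 2, T, D), torch.randn(B, 2, T, D))
```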
The 2025 PyTorch Docathon is a community-driven event focused on improving PyTorch documentation, making it accessible for newcomers and enhancing user experience. Participants can expect a collaborative environment to learn, contribute, and see the tangible impact of their work. The event runs from June 3 to June 18, with various skill-level tasks available.
The article provides an overview of a codebase for training language and vision-language models using PyTorch, highlighting installation instructions, model inference, and training setup. It details the required dependencies, configuration paths, and methods for integrating new datasets and models, while also addressing the usage of various GPU resources for efficient training and evaluation.
FlashPack is a new file format and loading mechanism for PyTorch that significantly speeds up model checkpoint loading, achieving 3-6 times faster performance than existing methods. By flattening weights into a contiguous byte stream and optimizing parallel processing between CPU and GPU, FlashPack enhances efficiency in model I/O, making it ideal for machine learning applications. Users can easily convert and integrate their models with FlashPack to benefit from faster loading times.
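A toy illustration of the flattening idea (not FlashPack's actual file format, which also records dtypes and overlaps disk reads with GPU transfer): pack all weights into one contiguous buffer and rebuild named views from recorded offsets:

```python
import torch

def pack(state_dict):
    # assumes all tensors share one dtype for simplicity
    flat = torch.cat([t.reshape(-1) for t in state_dict.values()])
    index, offset = {}, 0
    for name, t in state_dict.items():
        index[name] = (offset, t.shape)
        offset += t.numel()
    return flat, index

def unpack(flat, index):
    # rebuild named tensors as zero-copy views into the single flat buffer
    return {name: flat[off:off + shape.numel()].view(shape)
            for name, (off, shape) in index.items()}
```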
The article discusses advancements in accelerating graph learning models using PyG (PyTorch Geometric) and Torch Compile, highlighting methods that enhance performance and efficiency in processing graph data. It details practical implementations and the impact of these optimizations on machine learning tasks involving graphs.
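A hedged sketch of the basic recipe (the article's models and benchmarks may differ): wrap a small PyG model in torch.compile so its message-passing kernels get fused and specialized on first call:

```python
import torch
from torch_geometric.nn import GCNConv  # pip install torch_geometric

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, out_dim)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)

model = torch.compile(GCN(16, 32, 4))  # compiled on first invocation
x, edge_index = torch.randn(100, 16), torch.randint(0, 100, (2, 400))
out = model(x, edge_index)
```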
LSNet is a new family of lightweight vision models that leverage a "See Large, Focus Small" strategy, inspired by the human visual system, to improve efficiency and performance in various vision tasks. Utilizing LS convolution, which combines large-kernel perception with small-kernel aggregation, LSNet outperforms existing lightweight networks while maintaining computational efficiency. The models were trained on ImageNet-1K, with throughput measured on an Nvidia RTX 3090.
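A loose sketch of the large-kernel/small-kernel pattern only (this is NOT the paper's LS convolution, just the general shape of the idea): a large-kernel depthwise conv gathers broad context, and a small-kernel conv aggregates locally over the context-enriched features:

```python
import torch
import torch.nn as nn

class LargeSmallBlock(nn.Module):
    def __init__(self, dim, large_k=7, small_k=3):
        super().__init__()
        # "see large": cheap large-kernel depthwise perception
        self.perceive = nn.Conv2d(dim, dim, large_k, padding=large_k // 2, groups=dim)
        # "focus small": small-kernel aggregation across channels
        self.aggregate = nn.Conv2d(dim, dim, small_k, padding=small_k // 2)

    def forward(self, x):
        return self.aggregate(self.perceive(x))

y = LargeSmallBlock(32)(torch.randn(1, 32, 56, 56))  # shape preserved
```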
PyTorch Conference 2025 will take place in San Francisco on October 22-23, featuring keynotes, technical sessions, and workshops dedicated to AI advancements. The event includes a range of summits on topics like measuring intelligence and AI infrastructure, as well as training and certification opportunities. Attendees will connect with leaders and innovators in the AI community.
The Kubeflow Trainer project has been integrated into the PyTorch ecosystem, providing a scalable and community-supported solution for running PyTorch on Kubernetes. It simplifies distributed training of AI models and fine-tuning of large language models (LLMs) while optimizing GPU utilization and supporting advanced scheduling capabilities. The integration enhances the deployment of distributed PyTorch applications and offers a streamlined experience for AI practitioners and platform admins alike.
ZClip is an adaptive gradient clipping technique for mitigating gradient spikes during LLM pre-training, utilizing Exponential Moving Averages to adjust clipping thresholds dynamically. It enhances training stability and efficiency by responding to changes in gradient norms without relying on fixed thresholds. The implementation is compatible with PyTorch and PyTorch Lightning, allowing seamless integration into training pipelines.
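A simplified sketch in the spirit of ZClip (not the reference implementation; the smoothing factor and multiplier here are illustrative guesses): track an EMA of the gradient norm and clip whenever the current norm spikes past a multiple of that running estimate:

```python
import torch

class AdaptiveClipper:
    def __init__(self, alpha=0.97, factor=2.5):
        self.alpha, self.factor, self.ema = alpha, factor, None

    def __call__(self, parameters):
        params = [p for p in parameters if p.grad is not None]
        norm = torch.norm(torch.stack([p.grad.norm() for p in params]))
        if self.ema is None:
            self.ema = norm.item()
        threshold = self.factor * self.ema
        if norm > threshold:  # spike detected: rescale gradients in place
            torch.nn.utils.clip_grad_norm_(params, threshold)
        # update EMA with the clipped value so spikes don't pollute the statistic
        self.ema = self.alpha * self.ema + (1 - self.alpha) * min(norm.item(), threshold)
```

Called as `clipper(model.parameters())` between `loss.backward()` and `optimizer.step()`.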
Helion introduces a high-level domain-specific language that simplifies kernel development for machine learning by compiling Python-embedded code into optimized Triton code. It automates complex tasks like memory management and tuning, allowing developers to focus on algorithmic logic rather than hardware specifics. Helion's autotuning engine enhances performance portability across different hardware architectures with minimal effort.
PyTorch has evolved from an AI research framework to a foundational tool for production and generative AI, supported by major industry players. The PyTorch Foundation is expanding to encompass a broader ecosystem, addressing current challenges in AI while aiming to establish itself as the "Open Language of AI." Future initiatives will focus on improving performance, model deployment, and fostering a diverse community around AI development.
PyTorch and vLLM have been integrated to enhance generative AI applications by implementing Prefill/Decode Disaggregation, which improves inference efficiency at scale. This collaboration has optimized Meta's internal inference stack by allowing independent scaling of prefill and decode processes, resulting in better performance metrics. Key optimizations include enhanced KV cache transfer and load balancing, ultimately leading to reduced latency and increased throughput.
PyTorch Distributed Checkpointing (DCP) has integrated support for HuggingFace safetensors, allowing users to save and load checkpoints directly within the HuggingFace ecosystem without custom converters. This enhancement simplifies the user experience for machine learning engineers and improves efficiency in projects like torchtune by eliminating the need for format-specific checkpointing solutions. Future developments will focus on advanced support for distributed loading and saving of safetensors.
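For context, the target format itself is exercised by the standalone safetensors package; the DCP integration described in the post reads and writes this format directly, so single-process round-trips like the one below no longer need custom converters in distributed settings:

```python
import torch
from safetensors.torch import save_file, load_file

state = {"weight": torch.randn(4, 4), "bias": torch.zeros(4)}
save_file(state, "model.safetensors")   # plain, framework-neutral tensor file
restored = load_file("model.safetensors")
```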
ZeroGPU enables efficient use of Nvidia H200 hardware in Hugging Face Spaces by allowing users to avoid keeping GPUs locked during idle periods. The article discusses how ahead-of-time (AoT) compilation with PyTorch can significantly enhance performance, reducing processing time for generating images and videos with speedups of 1.3x to 1.8x. It also provides a guide on implementing AoT compilation in ZeroGPU Spaces, including advanced techniques like FP8 quantization.
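A minimal sketch of the underlying AoT path in recent PyTorch releases (torch.export plus AOTInductor; the article layers ZeroGPU-specific helpers and FP8 quantization on top of this):

```python
import torch

class Model(torch.nn.Module):
    def forward(self, x):
        return torch.sin(x) + torch.cos(x)

example = (torch.randn(8, 16),)
exported = torch.export.export(Model(), example)              # trace to a stable graph
package = torch._inductor.aoti_compile_and_package(exported)  # compile once, ahead of time
compiled = torch._inductor.aoti_load_package(package)         # reload without recompiling
out = compiled(*example)
```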
PyTorch has released native quantized models, including Phi4-mini-instruct and Qwen3, optimized for both server and mobile platforms using int4 and float8 quantization methods. These models offer efficient inference with minimal accuracy degradation and come with comprehensive recipes for users to apply quantization to their own models. Future updates will include new features and collaborations aimed at enhancing quantization techniques and performance.
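A hedged sketch of applying TorchAO weight-only quantization to one's own model (API as of recent torchao releases; the int4 path expects a CUDA device and bfloat16 weights):

```python
import torch
from torchao.quantization import quantize_, int4_weight_only  # pip install torchao

model = torch.nn.Sequential(torch.nn.Linear(512, 512)).to(torch.bfloat16).cuda()
quantize_(model, int4_weight_only())  # swaps Linear weights to int4 in place
```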
The article introduces the PyTorch Native Agentic Stack, a new framework designed to enhance the development of AI applications by providing a more efficient and integrated approach to leveraging PyTorch's capabilities. It emphasizes the stack's ability to simplify the implementation of agent-based systems and improve overall performance in machine learning tasks.
UCGM is an official PyTorch implementation that provides a unified framework for training and sampling continuous generative models, such as diffusion and flow-matching models. It enables significant acceleration of sampling processes and efficient tuning of pre-trained models, achieving impressive FID scores across various datasets and resolutions. The framework supports diverse architectures and offers tools for both training and evaluating generative models.
PyTorch Conference 2025 will take place in San Francisco from October 22-23, featuring keynotes, workshops, and technical sessions focused on advancements in AI. The event includes co-located summits and the launch of PyTorch training and certification, aimed at connecting AI innovators and practitioners. Session recordings and presentation slides will be available for attendees to review after the conference.
Monarch is a distributed programming framework for PyTorch that utilizes scalable actor messaging and features such as fault tolerance, point-to-point RDMA transfers, and support for distributed tensors. The framework is currently in experimental development, and users are encouraged to report bugs and contribute to its improvement. Installation requires specific dependencies and can be set up on various operating systems, with examples provided to guide users in utilizing its APIs effectively.
PyTorch and vLLM are increasingly integrated to enhance generative AI applications, providing optimized performance and support for various hardware types. Key features include torch.compile for model optimization, TorchAO for quantization, and FlexAttention for custom attention patterns, all aimed at streamlining the deployment of advanced models. Collaborative efforts are focused on improving large-scale inference and post-training processes for AI systems.
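Of the features named above, FlexAttention is the most self-contained to demonstrate; a small example of its score_mod hook (requires a recent PyTorch, shown eagerly here; in practice it is wrapped in torch.compile for speed, and a distance-based bias stands in for whatever custom pattern a model needs):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def rel_bias(score, b, h, q_idx, kv_idx):
    # penalize attention scores by query/key distance
    return score - 0.1 * (q_idx - kv_idx).abs()

q = k = v = torch.randn(1, 4, 128, 64)
out = flex_attention(q, k, v, score_mod=rel_bias)
```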
The article provides detailed information about registration rates and procedures for the upcoming PyTorch Conference, including deadlines for various attendee categories and special discounts for groups and small businesses. It also outlines the refund policy and options for substitutions and certificate downloads post-event.
The article discusses the implementation of Andrej Karpathy's original recurrent neural network (RNN) code using PyTorch, emphasizing hands-on coding to understand RNNs better. It also highlights the differences in dataset formatting for training RNNs compared to transformer-based language models. Future posts will delve deeper into the author's personal implementations of RNNs.
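A minimal character-level RNN in the spirit of the post (an illustrative sketch, not the author's code): embed characters, run them through `nn.RNN`, and project to next-character logits at every position:

```python
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, h=None):
        x, h = self.rnn(self.embed(tokens), h)
        return self.head(x), h  # logits per position, plus carried hidden state

model = CharRNN(vocab_size=65)
logits, h = model(torch.randint(0, 65, (1, 32)))
```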
The article discusses the author's experience of deploying the DeepSeek-OCR model on an NVIDIA Spark using Claude Code, emphasizing the challenges faced with compatibility and dependencies in running a PyTorch CUDA model. The author details the process of setting up the environment, troubleshooting issues, and successfully executing OCR on an image after overcoming obstacles related to GPU capabilities and software versions.
The article discusses the competitive landscape of machine learning frameworks in 2019, highlighting the shift from TensorFlow to PyTorch among researchers. It presents data showing PyTorch's growing dominance in academic publications while TensorFlow remains prevalent in industry applications. The author suggests that researchers' preference for PyTorch's simplicity and API design puts TensorFlow's future in research at risk.
The article introduces "create-llm," a CLI tool designed to quickly scaffold production-ready PyTorch training projects for language models. It offers various templates tailored for different project scopes and includes essential features like data preprocessing, tokenizer training, and deployment tools, enabling users to train their own language models efficiently.
The article introduces torchcomms, a lightweight communication API designed for PyTorch Distributed, aimed at enhancing large-scale model training. It offers a flexible framework for rapid prototyping, supports scaling to over 100,000 GPUs, and emphasizes fault tolerance and device-centric communication. The development process is open to community feedback as it evolves towards comprehensive support for next-generation distributed technologies.
The article discusses a challenging bug encountered while using PyTorch, which caused training loss to plateau due to a GPU kernel issue on the Apple Silicon MPS backend. After extensive debugging and investigation, the author uncovered the underlying problem related to non-contiguous memory layouts, ultimately leading to insights about PyTorch internals and the importance of understanding framework details in troubleshooting. The article serves as a guide for others who may face similar issues, offering a thorough walkthrough of the debugging process.
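The kind of contiguity check that helps isolate such bugs: ops like transpose return views whose strides no longer match a compact row-major layout, which some backend kernels mishandle, and `.contiguous()` materializes a compact copy as a workaround:

```python
import torch

x = torch.randn(4, 8)
y = x.t()                  # a view with swapped strides, same storage
print(y.is_contiguous())   # False
z = y.contiguous()         # compact copy; safe to hand to layout-sensitive kernels
print(z.is_contiguous())   # True
```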
The article discusses a method for visualizing high-dimensional tensors by representing them as matrices of matrices, which helps in identifying the dimensions more clearly. The author demonstrates this technique with examples of tensors from 0D to 5D, explaining how to stack lower-dimensional matrices both horizontally and vertically to maintain clarity. Additionally, the article touches on the fractal nature of this representation and provides a knowledge check on splitting tensors using PyTorch functions.
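A hedged sketch of the "matrix of matrices" view using permute/reshape and chunk (the article's own knowledge check covers PyTorch's splitting functions): a 4-D tensor prints as an outer grid of inner 2-D blocks:

```python
import torch

t = torch.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)  # (outer_r, outer_c, r, c)
grid = t.permute(0, 2, 1, 3).reshape(2 * 4, 3 * 5)   # 2x3 grid of 4x5 blocks
rows = torch.chunk(grid, 2, dim=0)                   # split back into block-rows
blocks = [block for row in t for block in row]       # the six 4x5 blocks, listed out
```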
The article introduces PyTorch Monarch, a new distributed programming framework designed to simplify the complexity of distributed machine learning workflows. By adopting a single controller model, Monarch allows developers to program clusters as if they were single machines, seamlessly integrating with PyTorch while managing processes and actors efficiently across large GPU clusters. It aims to enhance fault handling and data transfer, making distributed computing more accessible and efficient for ML applications.