This article explains how to use Hugging Face Skills to fine-tune language models with Claude. It covers the setup, training methods, and how to monitor progress, making it easier to customize and deploy models on the Hugging Face Hub.
Claude can now fine-tune language models through Hugging Face Skills, a new tool from Hugging Face. It lets Claude not only write training scripts but also submit jobs to cloud GPUs, track their progress, and push the finished models to the Hugging Face Hub. For example, a user can ask Claude to fine-tune a model such as Qwen3-0.6B on a specific dataset, and Claude handles the details: validating the data format, selecting an appropriate GPU, and setting up monitoring. The whole process runs asynchronously, so users can work on other tasks while the model trains.
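The data-format validation mentioned above can be sketched in a few lines. This is an illustrative assumption of the kind of check Claude might run, not the skill's actual implementation; the chat format (a `messages` list of role/content dicts) is the common convention for SFT datasets on the Hugging Face Hub:

```python
# Sketch of a pre-submission format check for SFT chat data.
# The helper name and the specific rules are illustrative assumptions.

VALID_ROLES = {"system", "user", "assistant"}

def validate_sft_record(record: dict) -> list[str]:
    """Return a list of problems found in one dataset record ([] if clean)."""
    problems = []
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["record has no non-empty 'messages' list"]
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not a dict")
            continue
        if msg.get("role") not in VALID_ROLES:
            problems.append(f"message {i} has invalid role {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str) or not msg["content"].strip():
            problems.append(f"message {i} has empty or missing content")
    if not problems and messages[-1].get("role") != "assistant":
        problems.append("conversation does not end with an assistant turn")
    return problems

good = {"messages": [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]}
bad = {"messages": [{"role": "user", "content": "Hello"}]}

print(validate_sft_record(good))  # []
print(validate_sft_record(bad))   # ['conversation does not end with an assistant turn']
```

Running a check like this before submitting the job is what lets problems surface in seconds rather than after paying for GPU time.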
To get started, users need a Hugging Face account on a Pro or Team plan, a write-access token, and a coding agent such as Claude Code or Codex. Setup amounts to adding the plugins and authenticating with the Hugging Face Hub. Once that is in place, users can issue training commands, and Claude generates the necessary configuration along with time and cost estimates. For instance, fine-tuning a 0.6B model can cost around $0.30 and take about 20 minutes.
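The time and cost estimate is simple arithmetic over GPU pricing. The sketch below shows the calculation; the GPU flavor names and hourly rates are illustrative assumptions, not Hugging Face's actual pricing:

```python
# Back-of-the-envelope job cost estimate, of the kind reported before
# a job is launched. Rates below are hypothetical placeholders.

HOURLY_RATE_USD = {
    "t4-small": 0.90,    # assumed price of a small GPU flavor
    "a10g-large": 3.00,  # assumed price of a larger flavor
}

def estimate_cost(gpu: str, minutes: float) -> float:
    """Estimated job cost in USD for a given GPU flavor and runtime."""
    return round(HOURLY_RATE_USD[gpu] * minutes / 60, 2)

# A ~20-minute run on a small GPU lands around $0.30, consistent with
# the ballpark figure quoted for a 0.6B model.
print(estimate_cost("t4-small", 20))  # 0.3
```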
The skill supports several training methods, including Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO). SFT works best with high-quality demonstration data, while DPO learns from preference pairs that align model outputs with human choices. GRPO applies reinforcement learning to tasks with verifiable results, such as coding or math problem-solving. Each method has its own data-format and setup requirements, so users can pick the approach that matches the data they have.
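The three methods expect differently shaped records. The minimal examples below are assumptions based on common TRL conventions (a `messages` demonstration for SFT, a prompt/chosen/rejected triple for DPO, and a prompt with a programmatically checkable answer for GRPO); the exact schemas the skill accepts may differ:

```python
# Illustrative minimal records for each training method.

# SFT: a demonstration of the desired behavior.
sft_record = {
    "messages": [
        {"role": "user", "content": "Translate 'bonjour' to English."},
        {"role": "assistant", "content": "Hello."},
    ]
}

# DPO: a preference pair for the same prompt.
dpo_record = {
    "prompt": "Summarize: The cat sat on the mat.",
    "chosen": "A cat sat on a mat.",
    "rejected": "Cats are popular pets worldwide.",
}

# GRPO: the reward must be verifiable programmatically, e.g. by
# checking a model completion against a ground-truth answer.
grpo_record = {"prompt": "What is 12 * 12?", "answer": "144"}

def math_reward(completion: str, answer: str) -> float:
    """Toy verifiable reward: 1.0 if the ground truth appears in the completion."""
    return 1.0 if answer in completion else 0.0

print(math_reward("12 * 12 = 144", grpo_record["answer"]))  # 1.0
```

This is why GRPO fits coding and math tasks: a reward function like `math_reward` can score outputs automatically, with no human preference labels required.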