This article presents a cost-effective pipeline, inspired by DeepSeek R1, for fine-tuning an instruction-tuned LLM (a Qwen2.5 model) on reasoning tasks by combining Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) on AWS SageMaker. It details the training stages, reward function design, and experimental outcomes, and offers guidance for replicating the results with the associated codebase.
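The article's actual reward functions are not shown here; as a rough illustration of the DeepSeek R1-style rule-based rewards that GRPO pipelines of this kind typically use, below is a minimal Python sketch combining a format reward (enforcing a `<think>`/`<answer>` template) with an exact-match accuracy reward. The tag format, weights, and function names are assumptions for illustration, not the article's implementation.

```python
import re

# Completion must look like: <think>...</think><answer>...</answer>
# (illustrative template, not necessarily the one used in the article)
THINK_ANSWER_RE = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)

def format_reward(completion: str) -> float:
    """1.0 if the completion follows the think/answer template, else 0.0."""
    return 1.0 if THINK_ANSWER_RE.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, ground_truth: str) -> float:
    """1.0 if the extracted answer exactly matches the reference answer."""
    match = THINK_ANSWER_RE.match(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

def total_reward(completion: str, ground_truth: str) -> float:
    """Weighted combination of format and accuracy rewards (weights are illustrative)."""
    return 0.2 * format_reward(completion) + 0.8 * accuracy_reward(completion, ground_truth)

if __name__ == "__main__":
    sample = "<think>2 + 2 = 4</think><answer>4</answer>"
    print(total_reward(sample, "4"))  # prints 1.0
```

In a GRPO setup, a scalar reward like this is computed for each completion in a sampled group, and the group-relative advantages derived from those rewards drive the policy update.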
Tags: fine-tuning, llm, reasoning, aws-sagemaker, deep-learning