1 link tagged with all of: reinforcement-learning + tinylora + qwen
Links
This article discusses TinyLoRA, a method developed by researchers at Meta that improves a large language model's mathematical reasoning by adjusting only 13 parameters. The findings suggest that minimal parameter updates can yield significant gains, though the results may not generalize to other domains. The article also examines how well various GGUF models perform on coding tasks.