LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method: it freezes the pretrained weights and trains small low-rank adapter matrices, so far fewer parameters are updated and post-training becomes faster and less resource-intensive. Recent experiments show that LoRA can match full fine-tuning (FullFT) under certain conditions, particularly on small-to-medium-sized datasets, but it may fall behind on larger datasets (where the adapter's limited capacity becomes a bottleneck) and at high batch sizes. The key finding is a "low-regret regime" in which LoRA keeps its efficiency advantages without giving up performance relative to FullFT, supporting its broader application in post-training scenarios.
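To make the parameterization concrete, here is a minimal sketch of a LoRA-style linear layer in PyTorch; the class name, rank, scaling factor, and initialization are illustrative assumptions, not the exact setup used in the experiments described above.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        in_f, out_f = base.in_features, base.out_features
        # Low-rank factors: delta_W = B @ A, with A (rank x in) and B (out x rank)
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Wrapping, for example, a transformer's attention projections with such a layer means only the A and B factors, a small fraction of the total parameter count, receive gradient updates during post-training, which is where the efficiency gain comes from.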
Large language models (LLMs) are typically static: once trained, they lack a mechanism for adapting their own weights to new tasks or knowledge. The Self-Adapting LLMs (SEAL) framework addresses this limitation by having the model generate its own finetuning data and update directives; a reinforcement learning loop rewards generations whose resulting weight updates improve downstream performance. The result is persistent weight updates and better performance on knowledge incorporation and few-shot generalization tasks.
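To show how the pieces fit together, here is a conceptual sketch of one SEAL-style outer-loop step; `generate_self_edit`, `finetune`, `evaluate`, and `reinforce` are hypothetical placeholders for the framework's components, not the authors' API.

```python
from typing import Any, Callable

def seal_step(
    model: Any,
    task_context: str,
    generate_self_edit: Callable[[Any, str], str],
    finetune: Callable[[Any, str], Any],
    evaluate: Callable[[Any], float],
    reinforce: Callable[[Any, str, float], None],
) -> tuple[Any, float]:
    """One outer-loop iteration of self-adaptation (conceptual sketch, not the authors' code)."""
    # 1. The model writes its own finetuning data and update directives (a "self-edit").
    self_edit = generate_self_edit(model, task_context)
    # 2. The self-edit is applied as a supervised finetuning update,
    #    so the weight change persists after this step.
    updated_model = finetune(model, self_edit)
    # 3. The updated model is scored on the downstream task; that score is the reward.
    reward = evaluate(updated_model)
    # 4. Reinforce the *generation* of self-edits: edits whose resulting
    #    weight updates improve performance become more likely in the future.
    reinforce(model, self_edit, reward)
    return updated_model, reward
```

The design point the sketch illustrates is that the reward attaches to the self-edit that produced the weight update, not to the update itself, so the model learns to author better adaptation data over time.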