5 links tagged with all of: language-models + fine-tuning
Links
LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method that updates a large language model through a small set of added low-rank parameters, making post-training faster and less resource-intensive. Recent experiments show that LoRA can match full fine-tuning (FullFT) under certain conditions, particularly on small-to-medium-sized datasets, but may fall behind on larger datasets and at high batch sizes. The key finding is a "low-regret regime" in which LoRA delivers FullFT-level performance while keeping its efficiency advantages, paving the way for its broader use across post-training scenarios.
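For a sense of the mechanism the summary refers to, here is a minimal PyTorch sketch of the core LoRA idea (an illustration, not the article's code): the pretrained weight is frozen and a trainable low-rank product is added on top, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update, scaled by alpha/r."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                                      # pretrained weights stay fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)   # r x d_in
        self.B = nn.Parameter(torch.zeros(base.out_features, r))         # d_out x r, zero-init so training starts from the base model
        self.scale = alpha / r

    def forward(self, x):
        # base output plus the low-rank correction; only A and B receive gradients
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

# e.g. wrapping a single projection: LoRALinear(nn.Linear(4096, 4096), r=16)
```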
The article explores subliminal learning in language models, where fine-tuning on seemingly unrelated data (like sequences of numbers) can instill hidden preferences (e.g., a model developing a liking for "owls"). It introduces the concept of entangled tokens, whose output probabilities shift together so that boosting one also raises the other, and describes experiments showing how the phenomenon can be harnessed through prompting and dataset generation. The findings suggest both a mechanism for subliminal learning and potential strategies for mitigating its effects.
Together AI has launched a Fine-Tuning Platform that allows developers to refine language models based on user preferences and ongoing data. With features like Direct Preference Optimization and a new web UI for easy access, businesses can continuously improve their models, ensuring they evolve alongside user needs and application trends. Pricing changes also make fine-tuning more accessible for developers.
Tinker is a newly launched API designed for fine-tuning language models, allowing researchers to easily customize and experiment with various models without managing the underlying infrastructure. The service supports both large and small models and is currently in private beta, with plans for onboarding users and introducing usage-based pricing soon.
Large language models (LLMs) typically cannot adapt their weights dynamically to new tasks or knowledge. The Self-Adapting LLMs (SEAL) framework addresses this limitation by allowing models to generate their own fine-tuning data and directives for self-adaptation through a reinforcement learning approach, resulting in persistent weight updates and improved performance on knowledge incorporation and few-shot generalization tasks.
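The overall loop, as the summary describes it, can be sketched in a few lines of Python. Everything below is a hypothetical stand-in (toy model, made-up helper names), not the SEAL paper's API: the model proposes its own training data, a weight update is applied, and the resulting task performance acts as the reward deciding which self-generated edits to keep.

```python
import random

def generate_self_edit(model, task):
    # stand-in: in SEAL the LLM itself writes fine-tuning examples/directives
    return {"examples": [f"synthetic note about {task}"], "lr": 1e-4}

def finetune_on(model, self_edit):
    # stand-in: apply a persistent weight update using the self-generated data
    return {**model, "updates": model["updates"] + [self_edit]}

def evaluate(model, task):
    # stand-in: downstream performance of the updated model (the reward signal)
    return random.random()

def seal_style_loop(model, task, rounds=3):
    best, best_reward = model, float("-inf")
    for _ in range(rounds):
        edit = generate_self_edit(best, task)    # model proposes its own fine-tuning data
        candidate = finetune_on(best, edit)      # persistent weight update
        reward = evaluate(candidate, task)       # reward = performance after the update
        if reward > best_reward:                 # keep edits that help, discard the rest
            best, best_reward = candidate, reward
    return best

model = seal_style_loop({"updates": []}, "knowledge incorporation")
```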