5 links tagged with all of: language-models + fine-tuning
Links
LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method that updates a large language model through a small set of added low-rank parameters, making post-training faster and less resource-intensive. Recent experiments show that LoRA can match full fine-tuning (FullFT) under certain conditions, particularly on small-to-medium-sized datasets, but may fall behind on larger datasets and at high batch sizes. The key finding is a "low-regret regime" in which LoRA delivers FullFT-level performance while keeping its efficiency advantages, paving the way for its broader use across post-training scenarios.
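For a sense of the mechanism the summary refers to, here is a minimal PyTorch sketch of the core LoRA idea (an illustration, not the article's code): the pretrained weight is frozen and a trainable low-rank product is added on top, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update, scaled by alpha/r."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                                      # pretrained weights stay fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)   # r x d_in
        self.B = nn.Parameter(torch.zeros(base.out_features, r))         # d_out x r, zero-init so training starts from the base model
        self.scale = alpha / r

    def forward(self, x):
        # base output plus the low-rank correction; only A and B receive gradients
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

# e.g. wrapping a single projection: LoRALinear(nn.Linear(4096, 4096), r=16)
```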
The article explores subliminal learning in language models, where fine-tuning on seemingly unrelated data (like sequences of numbers) can instill hidden preferences (e.g., a model developing a liking for "owls"). It introduces the concept of entangled tokens, whose output probabilities shift together so that boosting one also raises the other, and describes experiments showing how the phenomenon can be harnessed through prompting and dataset generation. The findings suggest both a mechanism for subliminal learning and potential strategies for mitigating its effects.
Together AI has launched a Fine-Tuning Platform that allows developers to refine language models based on user preferences and ongoing data. With features like Direct Preference Optimization and a new web UI for easy access, businesses can continuously improve their models, ensuring they evolve alongside user needs and application trends. Pricing changes also make fine-tuning more accessible for developers.
Tinker is a newly launched API designed for fine-tuning language models, allowing researchers to easily customize and experiment with various models without managing the underlying infrastructure. The service supports both large and small models and is currently in private beta, with plans for onboarding users and introducing usage-based pricing soon.
Large language models (LLMs) typically cannot adapt their weights dynamically to new tasks or knowledge. The Self-Adapting LLMs (SEAL) framework addresses this limitation by allowing models to generate their own fine-tuning data and directives for self-adaptation through a reinforcement learning approach, resulting in persistent weight updates and improved performance on knowledge incorporation and few-shot generalization tasks.
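The overall loop, as the summary describes it, can be sketched in a few lines of Python. Everything below is a hypothetical stand-in (toy model, made-up helper names), not the SEAL paper's API: the model proposes its own training data, a weight update is applied, and the resulting task performance acts as the reward deciding which self-generated edits to keep.

```python
import random

def generate_self_edit(model, task):
    # stand-in: in SEAL the LLM itself writes fine-tuning examples/directives
    return {"examples": [f"synthetic note about {task}"], "lr": 1e-4}

def finetune_on(model, self_edit):
    # stand-in: apply a persistent weight update using the self-generated data
    return {**model, "updates": model["updates"] + [self_edit]}

def evaluate(model, task):
    # stand-in: downstream performance of the updated model (the reward signal)
    return random.random()

def seal_style_loop(model, task, rounds=3):
    best, best_reward = model, float("-inf")
    for _ in range(rounds):
        edit = generate_self_edit(best, task)    # model proposes its own fine-tuning data
        candidate = finetune_on(best, edit)      # persistent weight update
        reward = evaluate(candidate, task)       # reward = performance after the update
        if reward > best_reward:                 # keep edits that help, discard the rest
            best, best_reward = candidate, reward
    return best

model = seal_style_loop({"updates": []}, "knowledge incorporation")
```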