3 links tagged with all of: machine-learning + data-curation
Links
This article discusses the evolving role of data engineers in the age of AI, emphasizing the need to adapt data preparation strategies. It highlights the shift from traditional data workflows to flexible, context-aware systems that prioritize data curation over mere collection.
A new active learning method developed by Google significantly reduces the amount of training data required to fine-tune large language models (LLMs) while improving alignment with human expert evaluations. This scalable curation process identifies the most informative examples and achieves up to a 10,000x reduction in training data, enabling models to respond more effectively to the evolving challenges of ad safety content classification.
Fine-tuned small language models (SLMs) can outperform larger models while being significantly more cost-effective, delivering results at 5 to 30 times lower cost. This efficiency is attributed to programmatic data curation techniques that improve the training of these smaller models.