3 links
tagged with all of: language-models + dataset
Links
REverse-Engineered Reasoning (REER) introduces a novel approach to instilling deep reasoning in language models: rather than searching forward for a solution, it works backwards from known good solutions to recover the reasoning process that could have produced them. The method sidesteps limitations of traditional reinforcement learning and instruction distillation, and it yields DeepWriting-20K, a large dataset of recovered reasoning trajectories, and DeepWriter-8B, a model that outperforms existing models on open-ended tasks. The research emphasizes structured reasoning and iterative refinement as the keys to generating high-quality output.
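To make the "work backwards" idea concrete, here is a minimal Python sketch assuming perplexity-guided selection over candidate traces. The model choice, helper names, and the simple argmin are illustrative; the paper's actual procedure refines trajectories iteratively rather than picking from a fixed candidate pool.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM works for the sketch.
MODEL_NAME = "gpt2"
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
lm = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def solution_nll(query: str, trace: str, solution: str) -> float:
    """Average negative log-likelihood of the known-good solution,
    conditioned on the query plus a candidate reasoning trace.
    A lower score means the trace better 'explains' the solution."""
    prefix_ids = tok(f"{query}\n{trace}\n", return_tensors="pt").input_ids
    solution_ids = tok(solution, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, solution_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prefix_ids.shape[1]] = -100  # mask the prefix: score solution tokens only
    with torch.no_grad():
        return lm(input_ids, labels=labels).loss.item()

def reverse_engineer(query: str, solution: str, candidate_traces: list[str]) -> str:
    """Work backwards: keep the reasoning trace under which the
    existing solution is most probable."""
    return min(candidate_traces, key=lambda t: solution_nll(query, t, solution))
```

Selected traces of this kind, paired with their queries and solutions, are the sort of material a dataset like DeepWriting-20K could then be built from.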
Weak-to-Strong Decoding (WSD) is a novel framework that improves the alignment of large language models (LLMs) by letting a smaller, well-aligned model draft the beginning of each response, which the large model then continues. Because the large model itself requires no preference fine-tuning, WSD improves the quality of generated content while minimizing the alignment tax, as demonstrated through extensive experiments and the accompanying GenerAlign dataset. The framework gives researchers a practical route to safer outputs without the usual trade-off between alignment and capability.
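A minimal sketch of the draft-then-continue idea, assuming two Hugging Face causal LMs that share a tokenizer. The model names are illustrative, and the fixed draft length stands in for whatever handover rule the framework actually uses.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative pairing: a small aligned drafter in front of a large
# base model, both from the same family so they share a tokenizer.
DRAFT_NAME, BASE_NAME = "gpt2", "gpt2-large"
tok = AutoTokenizer.from_pretrained(BASE_NAME)
drafter = AutoModelForCausalLM.from_pretrained(DRAFT_NAME).eval()
base = AutoModelForCausalLM.from_pretrained(BASE_NAME).eval()

@torch.no_grad()
def weak_to_strong_decode(prompt: str, draft_tokens: int = 64,
                          max_new: int = 256) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    # 1. The small aligned model writes the opening of the response,
    #    steering the generation onto a well-aligned trajectory.
    drafted = drafter.generate(ids, max_new_tokens=draft_tokens,
                               do_sample=True, pad_token_id=tok.eos_token_id)
    # 2. The large base model continues from that draft, supplying
    #    capability without any alignment fine-tuning of its own.
    out = base.generate(drafted, max_new_tokens=max_new - draft_tokens,
                        do_sample=True, pad_token_id=tok.eos_token_id)
    return tok.decode(out[0], skip_special_tokens=True)
```

The design choice worth noting is that alignment lives entirely in the cheap drafter, so the expensive base model can be swapped or upgraded without re-running preference training.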
EleutherAI has released the Common Pile v0.1, an 8 TB dataset of openly licensed and public domain text for training large language models, a significant step beyond its predecessor, the Pile, which included unlicensed material. The initiative emphasizes transparency and openness in AI research, aiming to give researchers essential tools and a shared, auditable corpus for better collaboration and accountability in the field. Future collaborations with cultural heritage institutions are planned to improve the quality and accessibility of public domain works.