Quit Emailing Yourself

2 links tagged with all of: transparency + language-models

Links

Going beyond open data – increasing transparency and trust in language models with OLMoTrace | Ai2

OLMoTrace is a new feature in the Ai2 Playground that allows users to trace the outputs of language models back to their extensive training data, enhancing transparency and trust. It enables researchers and the public to inspect how specific word sequences were generated, facilitating fact-checking and understanding model capabilities. The tool showcases Ai2's commitment to an open ecosystem by making training data accessible for scientific research and public insight into AI systems.

Saved by tldr-importer · Last saved October 29, 2025 · 6 min read

Tags: olmo, language-models, transparency, ai-research, fact-checking

The Common Pile v0.1

EleutherAI has released the Common Pile v0.1, an 8 TB dataset of openly licensed and public domain text for training large language models, marking a significant advance over its predecessor, the Pile. The initiative emphasizes transparency and openness in AI research and aims to give researchers essential tools and a shared corpus for better collaboration and accountability in the field. Collaborations with cultural heritage institutions are planned to improve the quality and accessibility of public domain works.

Saved by tldr-importer · Last saved October 29, 2025 · 6 min read

Tags: common-pile, dataset, open-source, language-models, transparency