Links
This paper argues that traditional academic articles hide failed experiments and leave out key implementation details, creating a “narrative tax” and an “engineering tax” that limit reproducibility. It proposes replacing static papers with ARA research packages—complete, executable bundles containing code, pipelines, and failure logs—so AI agents can fully understand and build on the work.
This article digs into why repeated LLM calls can produce different outputs even at zero temperature. It shows that floating-point non-associativity and kernel implementation details—rather than thread scheduling or atomic adds—are the real sources of run-to-run variation and outlines ways to make inference fully reproducible.
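As a small illustration of the floating-point non-associativity mentioned above, the sketch below adds the same numbers in different groupings and gets different answers. It is not taken from the linked article: the helper names `sequential_sum` and `chunked_sum` and the chunk sizes are hypothetical, chosen only to mimic a reduction that splits its work differently from one kernel configuration to another.

```python
def sequential_sum(values):
    """Add values left to right with an explicit loop.

    The built-in sum() is avoided on purpose: recent Python versions
    use compensated summation for floats, which would hide the effect.
    """
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk):
    """Hypothetical stand-in for a split reduction: sum each chunk,
    then sum the per-chunk partial results."""
    partials = [
        sequential_sum(values[i:i + chunk])
        for i in range(0, len(values), chunk)
    ]
    return sequential_sum(partials)


# Grouping changes the rounded result even for three small numbers.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Same four inputs, two reduction layouts, two different totals.
values = [1e17, 1.0, -1e17, 1.0]
print(chunked_sum(values, 4))  # 1.0
print(chunked_sum(values, 2))  # 0.0
```

Each call is perfectly deterministic on its own; the totals differ only because the grouping differs. A numerical routine whose reduction layout depends on its configuration can therefore change its output without any randomness being involved.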