Links
The author frames tokenizer design as an integer linear program, relaxes it to a continuous LP, and uses cutting planes to close the gap between fractional and integral solutions. They automate cut discovery with Codex, apply cycle constraints on overlapping token edges, and report provably optimal tokenizers on small pretokenized datasets.
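As a rough illustration of the relax-then-cut idea in this summary, the sketch below uses a toy instance that is not taken from the linked article: three hypothetical candidate items that pairwise conflict. It is only meant to show how a continuous relaxation can admit a fractional point that beats every 0/1 solution, and how one odd-cycle inequality removes that point while keeping all integral solutions. The names `conflicts`, `satisfies_pairwise`, and `satisfies_cycle_cut` are made up for this example, and the article's actual formulation may differ.

```python
from fractions import Fraction
from itertools import product

# Toy instance (hypothetical, not from the linked article): three candidate
# items that pairwise conflict, so no two of them may be selected together.
conflicts = [(0, 1), (1, 2), (0, 2)]


def satisfies_pairwise(x):
    """Check the pairwise constraints x_i + x_j <= 1 for every conflict."""
    return all(x[i] + x[j] <= 1 for i, j in conflicts)


def satisfies_cycle_cut(x):
    """Odd-cycle cut for this triangle: x_0 + x_1 + x_2 <= 1."""
    return sum(x) <= 1


# Integral side: brute-force every 0/1 assignment and keep the feasible ones.
integer_points = [x for x in product((0, 1), repeat=3) if satisfies_pairwise(x)]
integer_optimum = max(sum(x) for x in integer_points)

# Relaxed side: (1/2, 1/2, 1/2) satisfies every pairwise constraint.
# Summing the three constraints gives 2 * (x_0 + x_1 + x_2) <= 3, so 3/2 is
# also the best value the relaxation can reach.
half = Fraction(1, 2)
fractional_point = (half, half, half)
fractional_value = sum(fractional_point)

print("integer optimum:", integer_optimum)
print("fractional value:", fractional_value)
print("cut keeps all integer points:",
      all(satisfies_cycle_cut(x) for x in integer_points))
print("cut removes fractional point:",
      not satisfies_cycle_cut(fractional_point))
```

Exact `Fraction` arithmetic is used so the feasibility checks are not affected by floating-point rounding. A real cutting-plane loop would re-solve the relaxation after each added cut; this sketch only checks a single cut by hand.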
This article explains AutoRAG, a framework that aims to improve Retrieval-Augmented Generation (RAG) systems by treating their design as an optimization problem. It stresses evaluating entire pipelines rather than isolated components, and points to effective query reformulation and context expansion as ways to improve answer quality.