Recursive Language Models (RLMs) are introduced as a novel inference strategy that lets a language model recursively interact with an unbounded input context through a REPL environment. The approach aims to mitigate context rot and improve performance on long-context benchmarks; early results are promising and suggest RLMs may also enhance general-purpose inference.
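To make the recursive decomposition concrete, here is a minimal sketch of the idea, assuming a generic `llm(prompt)` completion function; the `rlm_answer` helper, the fixed chunk size, and the map-then-synthesize strategy are illustrative choices rather than the paper's exact implementation, since in the actual RLM setup the root model itself writes REPL code to inspect, slice, and recursively query the stored context.

```python
# Minimal sketch of the RLM idea, assuming a generic `llm(prompt)` completion
# function (e.g., a thin wrapper around any chat API). The helper names, the
# chunk size, and the fixed map-then-synthesize recursion are illustrative;
# in the actual RLM setup the root model decides how to decompose the context
# by writing code in the REPL.

def llm(prompt: str) -> str:
    """Placeholder for a call to an underlying language model."""
    raise NotImplementedError("plug in your model API here")

def chunk(text: str, size: int = 8000):
    """Split an arbitrarily long context into model-sized pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def rlm_answer(question: str, context: str, max_direct: int = 8000) -> str:
    """Answer a question over an unbounded context by recursive decomposition.

    If the context fits, query the model directly; otherwise, run a recursive
    sub-call over each chunk and synthesize the partial answers with one more
    model call.
    """
    if len(context) <= max_direct:
        return llm(f"Context:\n{context}\n\nQuestion: {question}")

    partials = [
        rlm_answer(question, piece)  # recursive call on a slice of the context
        for piece in chunk(context, max_direct)
    ]
    joined = "\n".join(f"- {p}" for p in partials)
    return llm(
        f"Partial answers from different parts of a long document:\n{joined}\n\n"
        f"Combine them into a single answer to: {question}"
    )
```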
M1 introduces a hybrid linear-RNN reasoning model built on the Mamba architecture and designed for scalable test-time computation on complex mathematical problems. Trained via distillation from existing reasoning models followed by reinforcement learning, M1 achieves significant speed and accuracy gains over comparable transformer models and matches the performance of state-of-the-art distilled reasoning models, while relying on memory-efficient inference.
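As a rough illustration of why a linear-RNN backbone helps when scaling test-time computation, the toy comparison below contrasts the constant-size recurrent state of an SSM-style decoder with the linearly growing KV cache of an attention layer; the dimensions and the plain linear update are stand-ins, not Mamba's actual selective state-space recurrence or M1's hybrid layer layout.

```python
import numpy as np

# Toy illustration: a linear-RNN decoder keeps a fixed-size state no matter
# how many reasoning tokens are generated, whereas a transformer layer's KV
# cache grows linearly with sequence length. The matrices and the plain
# linear update below are illustrative stand-ins, not Mamba's selective SSM.

d_state, d_model = 16, 64
A = np.eye(d_state) * 0.9                 # fixed decay (stand-in for SSM dynamics)
B = np.random.randn(d_state, d_model) * 0.01
C = np.random.randn(d_model, d_state) * 0.01

def rnn_decode(tokens: np.ndarray) -> int:
    """Sequential decode with a constant-size recurrent state."""
    h = np.zeros(d_state)
    for x in tokens:                      # one update per generated token
        h = A @ h + B @ x                 # state stays (d_state,) forever
        _ = C @ h                         # per-token output
    return h.size                         # memory: O(d_state), length-independent

def attn_decode_memory(tokens: np.ndarray) -> int:
    """KV-cache memory a single attention layer holds for the same sequence."""
    kv_cache = [(x, x) for x in tokens]   # one (key, value) pair per token
    return sum(k.size + v.size for k, v in kv_cache)  # grows with length

seq = np.random.randn(4096, d_model)      # e.g., a long chain of thought
print(rnn_decode(seq), attn_decode_memory(seq))  # 16 vs 4096 * 2 * 64
```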