M1 is a hybrid linear RNN reasoning model built on the Mamba architecture and designed for scalable test-time computation on complex mathematical problems. Trained with distillation from existing models and reinforcement learning, M1 is reported to deliver significant speed and accuracy improvements over traditional transformer models and to match the performance of state-of-the-art distilled reasoning models, while keeping inference memory-efficient.
XBai o4 is a fourth-generation open-source large model with enhanced complex reasoning capabilities, which are reported to surpass OpenAI-o3-mini in Medium mode. It uses a novel reflective generative form of training to significantly reduce inference costs and improve response quality. The repository includes training and evaluation code, along with setup instructions and benchmarks.