MiniMax-M1 is an open-weight hybrid-attention reasoning model that combines a Mixture-of-Experts architecture with the Lightning Attention mechanism, making it efficient on complex tasks with very long inputs. It performs strongly across benchmarks in mathematics, software engineering, and long-context understanding, and scales test-time compute more efficiently than comparable models. Trained with large-scale reinforcement learning and equipped with function calling, it positions itself as a robust foundation for next-generation AI applications.
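To make the function-calling capability concrete, here is a minimal sketch of invoking the model with a tool declaration, assuming it is served behind an OpenAI-compatible endpoint (for example via vLLM or SGLang); the base URL, API key, model id, and the `get_weather` tool are all placeholders, not part of the announcement:

```python
from openai import OpenAI

# Hypothetical local deployment -- substitute your own endpoint and model id.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# A tool declared in the standard OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not from the source
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="MiniMax-M1",  # placeholder model id
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decided to call the tool, the structured call appears here.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```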
Kimi-Dev-72B is an advanced open-source coding language model designed for software engineering tasks, achieving state-of-the-art performance of 60.4% on the SWE-bench Verified benchmark. It is trained with large-scale reinforcement learning to autonomously patch real repositories, and it receives reward only when the repository's full test suite passes, which steers it toward correct, high-quality solutions. The weights are available on Hugging Face and GitHub, and developers and researchers are encouraged to explore and contribute.
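Since the weights are distributed through Hugging Face, a standard `transformers` loading pattern should apply; this is a minimal sketch assuming the repo id `moonshotai/Kimi-Dev-72B` (check the model page for the exact id and recommended settings):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the announcement; verify on Hugging Face.
model_id = "moonshotai/Kimi-Dev-72B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 72B model needs multiple GPUs
    device_map="auto",           # shard layers across available devices
)

# A toy coding prompt, formatted with the model's chat template.
messages = [{"role": "user", "content": "Write a Python function that reverses a linked list."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```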