This article analyzes the developments in China's open-source AI ecosystem since the "DeepSeek Moment" in early 2025. It highlights the strategic shifts of major companies like Alibaba, Tencent, and ByteDance, as well as the broader collaborative efforts that have emerged, shaping the future of AI in the country.
DeepSeek's AI model, DeepSeekMath-V2, earned a gold-medal score by solving five of six problems at the International Mathematical Olympiad 2025. The model is open-source under the Apache 2.0 license, broadening access to advanced mathematical AI tools.
This article reviews key milestones in Chinese AI throughout 2025, highlighting significant model launches, shifts in the AGI discussion, and developments in the US-China chip war. It emphasizes the impact of DeepSeek and the emergence of open-source models, as well as the international ambitions of companies like Manus.
This article discusses advancements in the DeepSeek model, highlighting reduced attention complexity and innovations in reinforcement learning training, including the team's approach to context management and task/environment creation. It also critiques assumptions surrounding open-source large language models and questions the benchmarks used to evaluate their performance.
Baidu is making its Ernie generative AI model open source, marking a significant shift in China's tech sector and putting pressure on competitors like OpenAI and Anthropic. Experts believe this move could disrupt pricing dynamics in the AI market, as it offers powerful models at lower costs, although skepticism about security and trust in Chinese technology remains.
TNG Technology Consulting GmbH has unveiled R1T2, a new variant of DeepSeek R1-0528 that it reports runs 200% faster while maintaining high reasoning performance. With significant reductions in output token count and inference time, R1T2 is tailored for enterprise applications and is offered as an open-source model under the MIT License.
DeepSeek has launched its Terminus model, an update to the V3.1 family that improves agentic tool use and reduces language mixing errors. The new version enhances performance in tasks requiring tool interaction while maintaining its open-source accessibility under an MIT License, challenging proprietary models in the AI landscape.
The DeepSeek-R1-GGUF repository on Hugging Face hosts large model files in GGUF format for text generation tasks, based on the DeepSeek architecture. It includes multiple quantized versions of the model, all under an MIT license, and is part of a community-driven project by Unsloth AI.
DeepSeek V3.1 has emerged as a powerful open AI model, capable of processing extensive context while integrating chat, reasoning, and coding functions seamlessly. Its open-source approach challenges traditional AI business models by providing high-performance capabilities at significantly lower costs, promoting wider accessibility and innovation in AI development.