5 min read
|
Saved February 14, 2026
|
Do you care about this?
Baidu released ERNIE-4.5-VL-28B-A3B-Thinking, an AI model that the company claims outperforms Google and OpenAI's offerings in visual reasoning while using fewer computing resources. The model features a dynamic image analysis capability that mimics human problem-solving. It is designed for enterprise applications, including document processing and manufacturing quality control.
If you do, here's more
Baidu has launched a new AI model called ERNIE-4.5-VL-28B-A3B-Thinking, claiming it outperforms Google and OpenAI's offerings on vision-related tasks while using far less computational power. Although the model has 28 billion parameters in total, it activates only about 3 billion of them at a time, because a routing architecture sends each input to a small subset of the network. Baidu says this design helps it excel at tasks like document understanding and visual reasoning, which makes it appealing for enterprise applications that require both speed and accuracy.
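To make the "3 billion active out of 28 billion" idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not Baidu's implementation: the expert count, the value of k, the toy experts, and the names `route_top_k` and `moe_forward` are all assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalize so the weights of the selected experts sum to 1.
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and blend their outputs."""
    selected = route_top_k(router_scores, k)
    output = sum(w * experts[i](x) for i, w in selected)
    return output, [i for i, _ in selected]


# Hypothetical toy setup: 8 experts, 2 active per input (not ERNIE's real numbers).
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

rng = random.Random(0)
router_scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

output, active = moe_forward(1.0, experts, router_scores, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
print(f"active experts: {active}, fraction of experts used: {active_fraction:.0%}")
```

In this toy version, six of the eight experts are never executed for a given input, which is where the compute savings of a routed model come from. In a real model the router scores are produced by a learned layer rather than drawn at random.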
One standout feature is its "Thinking with Images" capability, which allows the AI to zoom in and out of images dynamically, mimicking human visual problem-solving. This flexibility enables the model to analyze complex diagrams or detect defects in manufacturing. Baidu claims the model's visual grounding abilities improve how it carries out instructions in industrial scenarios, potentially benefiting robotics and warehouse automation. While the performance claims are impressive, they have not yet been independently verified.
Baidu's decision to release the model under the Apache 2.0 license makes it more accessible for commercial use than competing models with stricter licensing. The model combines advanced training techniques with a mixture-of-experts architecture to optimize efficiency, allowing it to run on a single 80GB GPU. This makes it an attractive option for companies looking to integrate AI into their operations. Overall, Baidu's new model is part of a larger suite of ERNIE models designed to improve multimodal understanding without sacrificing performance in any specific area.
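The single-GPU figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates the memory needed just to hold 28 billion weights at a few common precisions. The precision list and the helper name `weight_memory_gb` are assumptions for illustration; the article does not say which precision Baidu uses, and the estimate ignores activations, caches, and framework overhead.

```python
TOTAL_PARAMS = 28e9    # total parameters, from the article
ACTIVE_PARAMS = 3e9    # parameters active at a time, from the article
GPU_MEMORY_GB = 80     # single-GPU capacity, from the article

# Assumed storage precisions (bytes per parameter); not confirmed by the article.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    needed = weight_memory_gb(TOTAL_PARAMS, nbytes)
    verdict = "fits" if needed <= GPU_MEMORY_GB else "does not fit"
    print(f"{precision}: {needed:.0f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")

# Share of parameters doing work for any one input.
active_share = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active share: {active_share:.1%}")
```

Under these assumptions, 16-bit weights take about 56 GB and leave headroom on an 80GB card, while 32-bit weights would not fit. Note that all 28 billion parameters still have to be resident in memory; routing reduces the compute per input, not the storage footprint.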