3 links
tagged with all of: multimodal + gemini
Links
Google has released updated versions of the Gemini 2.5 Flash and Flash-Lite models that improve quality and efficiency, with significantly fewer output tokens and better instruction following, conciseness, and multimodal capabilities. The updates are aimed at improving performance in complex applications, and new aliases give users easy access to the latest models.
Gemini 2.5 Pro and Flash bring advanced coding, reasoning, and multimodal capabilities to robotics, improving robots' spatial understanding. Developers can use these models and the Live API for applications such as semantic scene understanding, spatial reasoning, and interactive robotics, enabling robots to carry out complex tasks through voice commands and code generation. The article highlights practical examples and the potential of Gemini's embodied reasoning model across robotics applications.
Gemini 2.5 Pro Preview has been released ahead of schedule, featuring enhanced capabilities for coding and building interactive web apps. The update builds on positive feedback about the previous version and improves performance in UI development, code transformation, and multimodal reasoning. The model now leads the WebDev Arena Leaderboard. Developers can access these features through the Gemini API and Google AI Studio.