Google makes a big move: Gemini Robotics-ER 1.5 goes live!
Time: 2025-10-12


Google recently released Gemini Robotics-ER 1.5, its most advanced embodied reasoning model for robots. It is the first model in the Gemini Robotics series to be made broadly available to all developers, and it is designed to serve as a robot's high-level reasoning brain.


Gemini Robotics-ER 1.5 (Gemini Robotics-Embodied Reasoning) is a vision-language model (VLM) that brings Gemini's agentic capabilities to robotics. It is a thinking model that can reason about the physical world, call tools natively, and plan the logical steps needed to complete a task.


01


|What's new in Gemini Robotics-ER 1.5?

This is an upgraded version of Google's AI model built specifically for robots. It is smarter, more flexible, and safer, letting robots understand instructions, understand their surroundings, and do the right thing.


1. A stronger sense of space: point at it, and the robot hits the mark

"Understand" the surrounding environment like a person and can quickly judge: what can be picked up? Which ones are too heavy or unsteady? It can accurately generate 2D coordinate points, for example, if you say, "Point out everything you can take", it can accurately point out one by one. And the response is fast, using a lightweight Gemini Flash model, low latency, no lag.


2. Plans complex tasks, using "hands + brain" like a human

It can complete long tasks that require multi-step operation. Given "rearrange my desk to match this photo", it does not just look; it thinks through how to move things, what to place first and what to place later, and plans step by step. It can also go online and call functions, as sketched below. Need to look something up? It calls Google Search directly. Want to sort garbage? It can invoke third-party programs such as local garbage-sorting rules.
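The tool-calling side can be sketched as follows, again assuming the google-genai SDK. The lookup_recycling_rule function is a hypothetical local tool declared purely for illustration; Google Search grounding is configured the same way via types.Tool(google_search=types.GoogleSearch()).

# Sketch: declare a hypothetical local tool the model may ask the host to run.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

recycling_fn = types.FunctionDeclaration(
    name="lookup_recycling_rule",              # hypothetical local function
    description="Return the local sorting category for a named piece of waste.",
    parameters={
        "type": "object",
        "properties": {"item": {"type": "string"}},
        "required": ["item"],
    },
)

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",
    contents="Sort the objects in the bin photo according to local recycling rules.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[recycling_fn])],
    ),
)

# If the model decides it needs the local rule, it emits a function call
# instead of plain text; the host code runs it and sends the result back.
for part in response.candidates[0].content.parts:
    if part.function_call:
        print("model requested:", part.function_call.name, dict(part.function_call.args))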


3. Flexible control of "thinking time"

Developers can trade speed for accuracy. For complex tasks (such as assembling robots), the AI can "think a little longer" and produce more reliable results; for simple tasks (such as pointing at an object), it responds immediately with no delay. It works the way people do: think harder about important things, react to small things in seconds.
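In API terms this maps to a thinking budget. Below is a hedged sketch, assuming the google-genai SDK's ThinkingConfig applies to the robotics model the same way it does to other recent Gemini models; the budget values are illustrative.

# Sketch: trade latency against reasoning depth via the thinking budget.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def ask(prompt: str, budget: int) -> str:
    """budget=0 answers immediately; larger budgets let the model think longer."""
    response = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",
        contents=prompt,
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=budget),
        ),
    )
    return response.text

quick = ask("Point to the red mug.", budget=0)                              # instant reply
careful = ask("Plan the full assembly sequence for this kit.", budget=4096)  # deeper reasoning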


4. Safer: it refuses what it cannot do

A new "safety filter" has been added to identify unrealistic instructions. For example: "lift a car" - if the robot can't lift it, the AI will directly say "no". "Through a wall" - against the laws of physics, the AI will also refuse. Avoid blind execution and accidents by robots, so that developers can use it with more peace of mind.


02


The "super brain" of robots

Gemini Robotics-ER 1.5 does not just take orders; it genuinely understands complex instructions. Say "clean up the desk" and it is not stumped: it automatically decomposes the task, first identifying what is on the table (books, cups, scraps of paper...) and which items should be put away versus thrown out.


It then drafts an action plan: pick up the cup first, then shelve the books, and finally clear the trash. It calls the appropriate tools to control the robot-arm hardware, launching a dedicated "grasping" AI model and using a VLA (vision-language-action) model that pairs scene understanding with motion control to operate precisely. The whole process runs in one continuous flow, like a person watching, thinking, and doing.
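The orchestration pattern can be sketched like this: the ER model decomposes the task, and a separate low-level controller (standing in for a VLA model) executes each step. The plan_steps prompt, the numbered-list parsing, and execute_step are all illustrative assumptions, not Google's actual interface.

# Illustrative orchestration loop: high-level planning by ER, low-level execution elsewhere.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

def plan_steps(task: str) -> list[str]:
    """Ask the ER model to break a task into short, numbered, single-action steps."""
    reply = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",
        contents=f"Break this task into numbered, single-action steps: {task}",
    ).text
    return [line.split(".", 1)[1].strip() for line in reply.splitlines() if "." in line]

def execute_step(step: str) -> None:
    """Placeholder for the VLA model / arm-controller call."""
    print("executing:", step)

for step in plan_steps("Clean up the desk: cups to the sink, books on the shelf, trash in the bin."):
    execute_step(step)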


To work in the real world, a robot must point precisely. Gemini Robotics-ER 1.5 is particularly good at this: it determines object positions accurately and generates precise 2D coordinates (like pins on a map). Ask the robot to "point to the water cup" and it can indicate the handle or the center of the cup with minimal error. It currently has the highest pointing accuracy of any vision-language model.
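Those coordinates only become useful once they are mapped back onto the camera image. A small helper, assuming the points come back normalized to a 0-1000 range as in Google's pointing examples:

# Sketch: convert normalized [y, x] points to pixel coordinates for a given frame size.
import json

def to_pixels(points_json: str, width: int, height: int) -> list[tuple[str, int, int]]:
    """Map normalized 0-1000 points onto an image of the given size."""
    out = []
    for item in json.loads(points_json):
        y, x = item["point"]
        out.append((item["label"], int(x / 1000 * width), int(y / 1000 * height)))
    return out

sample = '[{"point": [312, 540], "label": "water cup"}]'
print(to_pixels(sample, width=1280, height=720))  # [('water cup', 691, 224)]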


Put simply: with this "brain", robots are no longer machines that execute canned commands, but intelligent assistants that can understand, plan, and operate precisely, one step closer to being "obedient and easy to use".




03


Where the robotics industry is heading: using AI to reach "embodied intelligence"

Every move by the leading companies points to one trend: future robots must have their own "brains", and self-developed AI models are the only path to general intelligence. Only a handful of companies will go far, namely those with full-chain technical capability from hardware to software, the ability to integrate chips, algorithms, data, and other resources, and the patience and strategy for a long-term commitment.


"Brain + cerebellum" is the next generation of robots. Today's robot industry no longer relies solely on "writing dead programs" to control actions. Everyone is turning to using large models to break through the bottlenecks of traditional control, such as: using large language models to understand human instructions; perceive the environment through autonomous driving models; Rely on multimodal models to integrate vision, speech, and action.


The industry's focus has shifted accordingly: from single actions to building an intelligent system of "brain (decision-making) + cerebellum (coordination)".


Who can win? Only a very small number of companies with comprehensive technology and a far-reaching layout will be able to converge these technologies into one standard and truly define "embodied intelligence": general-purpose robots that can understand the world, learn on their own, and act flexibly.


In a word: robots without AI brains will count only as "machines"; those with brains will be "agents".


This competition has just begun.


04


Three main lines for capturing the "robot intelligence" dividend

1. Watch the "AI upgrade" opportunities for robot body makers

Existing hardware plus access to Gemini ER equals a leap in intelligence. Beneficiaries: service robots (home, medical, cleaning); industrial collaborative robots (such as UBTECH, Estun, Jieka); and autonomous-driving companies (which also need physical reasoning capabilities).


2. Position in "AI + robot" middleware and platform companies

ER 1.5 is the "brain", but it still needs a "nervous system" to connect to the hardware. Worth watching: companies in the ROS (Robot Operating System) ecosystem, and providers of robot middleware, simulation platforms, and AI integration services.


3. Bet long-term on "embodied intelligence" application scenarios

Home assistant robots ("find my keys, heat a meal, close the windows"); medical-care robots ("help the elderly sit up, deliver medicine"); warehousing and logistics robots ("sort autonomously, handle abnormal parcels"). The scenarios that most need advanced reasoning are the ones most likely to break out first.

