
Google Starts to Roll Out Gemini Image Markup

By Gregory Zuckerman
Last updated: December 13, 2025

Google is quietly adding an option to mark up images directly in Gemini, letting people annotate a photo to show the AI exactly where to focus and to speed up quick visual callouts. The feature is appearing for some users in the Gemini app on Android and on the desktop web in Chrome, which points to a staged test ahead of a wider launch.

Early testers describe an onboarding prompt the first time they attach a photo, along with tools for both light editing and focused analysis. In other words, you can mark precisely what matters in an image, then ask better questions about it.

Table of Contents
  • What the new Gemini image markup tools can do today
  • Where the Gemini image markup feature is appearing now
  • What the change means for multimodal AI and reliability
  • Practical use cases for Gemini’s new image markup tools
  • Privacy and safety notes for annotated images in Gemini
  • What to watch next as Google expands image markup access

What the new Gemini image markup tools can do today

In a Gemini chat, you can attach an image and add rudimentary annotations, such as highlights or region callouts, to indicate the specific part of the picture in question. Gemini treats those annotations as signals of intent, stripping out some of the guesswork that can creep into multimodal prompts.

The real-world applications are clear: circle a component on a circuit-board photo and ask "what's this part," highlight a particular panel of a chart and ask "explain why the spike happens here," or point at the leftmost person in a group photo and say "describe their outfit." In early tests, the model consistently recognized which region was being referenced even when its answer about the contents was imperfect, a sign that attention steering is working as intended.
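The in-app markup happens entirely in Gemini's interface, but developers can approximate the same region steering with the public Gemini API by drawing the annotation onto the image before sending it. Here is a minimal sketch, assuming the google-generativeai Python SDK and Pillow; the file name, box coordinates and model choice are illustrative placeholders:

```python
# pip install google-generativeai pillow
import google.generativeai as genai
from PIL import Image, ImageDraw

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Draw a highlight box around the region of interest, mimicking
# what the in-app markup tool does for you.
image = Image.open("circuit_board.jpg")  # hypothetical file
draw = ImageDraw.Draw(image)
draw.rectangle((420, 180, 560, 300), outline="red", width=6)  # placeholder coordinates

# Send the annotated image plus a question that references the markup.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    [image, "What is the component inside the red box, and what does it do?"]
)
print(response.text)
```

Burning the annotation into the pixels keeps the approach model-agnostic: any vision model that can see the red box can be steered by it.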

Even better, the same canvas can serve as a quick-edit layer, letting you jot notes or add visual emphasis on an image before sharing it.

It’s no full-fledged editor, but a way to communicate complex instructions, document an issue or highlight some detail the assistant — and your team members — shouldn’t miss.

Where the Gemini image markup feature is appearing now

Availability looks limited for now. The feature is live for some users in the Gemini Android app and in Gemini on the web when accessed via Chrome, according to reports. As with many Google rollouts, this appears to be a server-side A/B test, so updating the app by itself may not grant access.

If you are part of the test, you should see an onboarding prompt the first time you add a photo, along with an icon labeled Markup or an Edit option. There's no announced schedule for a full release, and regional or account-specific limitations may apply as Google ramps capacity.

What the change means for multimodal AI and reliability

Grounding language in spatial references has been shown to improve model reliability. Research from Google and elsewhere has demonstrated that combining textual prompts with direct references to image regions yields more robust object grounding and instruction following. In practice, it's almost always faster and clearer to point than to describe.
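Region references don't have to be drawn; they can also ride along as text. Here is a hedged sketch of the same idea in prompt form, assuming the google-generativeai SDK and Gemini's documented convention of coordinates normalized to a 0-1000 grid; whether a given model honors textual coordinates as faithfully as a drawn box is not guaranteed, and the file name and numbers are placeholders:

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

image = Image.open("group_photo.jpg")  # hypothetical file
prompt = (
    "In this photo, look only at the region [ymin=120, xmin=40, "
    "ymax=880, xmax=330], with coordinates normalized to a 0-1000 grid. "
    "Describe the outfit of the person in that region."
)
response = model.generate_content([image, prompt])
print(response.text)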


The shift is also part of a larger movement toward visual-first workflows. Google has said Lens now performs billions of visual searches a month, and the most recent Gemini models brought huge context windows for combined text and images. Markup is the connective tissue between those capabilities, giving users an easy way to direct the model's attention within a busy scene without writing elaborate textual cues.

Practical use cases for Gemini’s new image markup tools

For everyday tasks, annotation helps with product comparisons, label or menu translations, and flagging a defective part while troubleshooting. Instead of typing "is it the one on the right," you can simply point.

For work, teams can annotate screenshots, mockups or dashboards and ask Gemini to summarize feedback, log action items or flag anomalies. If the tools reach Workspace, they could close review loops between design, QA and support without making everyone switch to a separate editing app.
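Under the same assumptions as the sketches above, a team workflow might batch reviewer-annotated screenshots and ask Gemini to turn the markup into action items; the file names here are hypothetical:

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

# Screenshots already marked up by reviewers (hypothetical file names).
screens = [Image.open(p) for p in ("mockup_review.png", "dashboard_review.png")]

response = model.generate_content(
    [*screens,
     "Each image contains reviewer markup (circles, arrows, notes). "
     "Summarize the feedback and list concrete action items per image."]
)
print(response.text)
```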

Privacy and safety notes for annotated images in Gemini

Images and markup are processed in the cloud and retained according to your account's activity controls, so review those settings if deletion matters to you. Users in sensitive settings should verify organizational rules and data-retention policies before sharing proprietary visuals with an AI assistant.

And while region selection improves targeting, the model can still misidentify people or objects. Verify the results on anything consequential, especially in fields like medicine, law or compliance where accuracy and consent are everything.

What to watch next as Google expands image markup access

Questions remain about how soon Google will broaden access, whether the mobile and web experiences will stay in sync, and how the tools handle multiple images, PDFs and long-lived conversation threads. Deeper integration with Android features such as Circle to Search, or tighter Google Photos hooks, would put markup nearly everywhere on the platform.

If the feature has surfaced for you, it is one of Gemini's most concrete quality-of-life improvements this year, moving multimodal AI from clever demos to useful everyday utility.

By Gregory Zuckerman
Gregory Zuckerman is a veteran investigative journalist and financial writer with decades of experience covering global markets, investment strategies, and the business personalities shaping them. His writing blends deep reporting with narrative storytelling to uncover the hidden forces behind financial trends and innovations. Over the years, Gregory’s work has earned industry recognition for bringing clarity to complex financial topics, and he continues to focus on long-form journalism that explores hedge funds, private equity, and high-stakes investing.