Today Google is launching Gemini 3, its smartest AI model yet, and deploying it immediately across core products, including Search, the Gemini app and its developer platforms. It’s the first time an entirely new generation of Gemini has gone straight into Search at launch. The goal is deeper understanding, broader multimodal reasoning and more sophisticated awareness of user intent, all aimed at combating brittle, surface-level replies and handling tougher tasks that play out over longer time horizons.
What’s New in Gemini 3: Key Upgrades and Changes
Reasoning takes center stage. Google claims Gemini 3 dissects more complicated questions with better nuance and context tracking. In community benchmarking, Gemini 3 Pro climbed to 1501 Elo on the LMArena leaderboard, ahead of both Grok and Gemini 2.5 Pro. Although Elo is a relative measure, it serves as a convenient proxy for model-vs.-model strength across diverse prompts.
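For context, an Elo gap maps to an expected head-to-head win rate through the standard logistic formula (LMArena’s exact rating method may differ in detail; the ratings below, other than 1501, are illustrative):

```python
# Expected win probability implied by an Elo gap (standard logistic formula).
# Only the 1501 figure comes from the leaderboard; the opponent rating is made up.
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

print(round(elo_expected_score(1501, 1451), 3))  # ~0.571, i.e. a ~57% expected win rate
```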

Academic-style performance leaps, too. Google reports 37.5% on Humanity’s Last Exam with no tools, relying only on the model’s own reasoning, and 91.9% on GPQA Diamond, a graduate-level knowledge test designed to resist web-search shortcuts. Google also claims state-of-the-art scores in mathematics and multimodal tasks, along with substantial improvements in video understanding and factual consistency.
Multimodality is in full form here, too. Built to ingest and reason over text, images and video in the same thread, Gemini 3 does a better job interpreting screenshots, summarizing long clips and grounding answers in context. Google DeepMind has been leaning in this direction for years; Gemini 3 is its surest step yet toward a single system that can confidently juggle formats.
Deep Think Raises the Ceiling for Complex Reasoning
Gemini 3 brings a more powerful mode called Deep Think for the most challenging reasoning workloads. With Deep Think enabled, scores rise higher still: 41% on Humanity’s Last Exam and 45.1% on the ARC-AGI-2 challenge, both benchmarks designed to test real problem-solving ability rather than simple pattern recall. Expect slower outputs and higher compute cost when it’s on; this is the mode for complex planning, analysis and multi-step logic.
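Google hasn’t published a developer-facing switch for Deep Think yet, so the snippet below is only a sketch of what mode selection could look like, reusing the thinking-budget control that already exists in the google-genai SDK; the model name is a placeholder, not a confirmed identifier:

```python
# Sketch only: Deep Think has no documented API yet. This reuses the existing
# thinking_config control in the google-genai SDK as a stand-in.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model name
    contents="Plan a three-phase migration from a monolith to microservices.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=8192)  # allow more reasoning tokens
    ),
)
print(response.text)
```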
Google says Deep Think will begin rolling out in the coming weeks, following further safety testing. That staged rollout has precedent in engineering practice: labs like Google DeepMind and OpenAI often gate their most powerful modes while they sanity-check guardrails against hallucination, misuse and unsafe automation.
Where You Can Use Gemini 3 Today Across Products
Search gets an AI Mode that relies on Gemini 3 for richer generative responses and more nuanced reasoning. Available to Google AI Pro and Ultra subscribers, it delivers multi-step problem-solving without sending people off the results page into a separate chat.
A redesigned Gemini app replaces its categorical grid with a cleaner interface, adds a My Stuff section for your personal content and conversations, and deepens the shopping experience with combined product discovery, comparison and summarized recommendations. Gemini Agent, the upgraded system for multi-step tasks, can now help sort your inbox, manage appointments and handle other routine workflows, with more transparent status updates and checkpoints.

On the development front, Gemini 3 is now available in AI Studio, Vertex AI and the Gemini CLI, alongside a new agentic development platform called Antigravity. That stack lets teams prototype reasoning-heavy agents, wire them to tools and data, and deploy to production with observability and policy controls.
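Wiring a model call to a tool is the core of that agent loop. A minimal sketch with the google-genai Python SDK might look like the following; the model name and helper function are placeholders, not confirmed identifiers:

```python
# Minimal sketch of tool wiring with the google-genai SDK.
# The model name and lookup_order_status helper are hypothetical.
from google import genai
from google.genai import types

def lookup_order_status(order_id: str) -> str:
    """Hypothetical internal helper the agent can call."""
    return f"Order {order_id} shipped yesterday."

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model name
    contents="Where is order 8142?",
    config=types.GenerateContentConfig(tools=[lookup_order_status]),
)
print(response.text)  # the SDK handles the function-call round trip automatically
```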
How to Try Gemini 3 and Access New Capabilities
For hands-on time right away, start with Search’s AI Mode through a Google AI Pro or Ultra subscription. It brings Gemini 3 to a class of everyday queries and is particularly good for “think out loud” personal tasks like planning an itinerary or troubleshooting a problem that involves intermediate reasoning.
Launch the Gemini app on mobile or web; for most users globally the default model is now Gemini 3. Try the multimodal upgrades in practice with mixed-media prompts: paste a screenshot, add a brief description and ask for a summary or next steps.
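The same mixed-media pattern works through the API. A minimal sketch, again assuming the google-genai SDK and a placeholder model name, sends a screenshot alongside a text instruction:

```python
# Sketch: one request combining a screenshot and a text instruction.
# Model name and file path are placeholders.
from google import genai
from google.genai import types

client = genai.Client()

with open("screenshot.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model name
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Summarize what this settings screen shows and suggest next steps.",
    ],
)
print(response.text)
```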
Developers can move from AI Studio for fast experiments to Vertex AI for enterprise-scale deployment as needed. The Gemini CLI supports local iteration and evaluation, while Antigravity scaffolds agentic flows like research, triage and follow-ups. Enterprise customers get Gemini 3 through Vertex AI and Gemini Enterprise, which add governance controls and usage analytics.
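Moving that same code to Vertex AI is mostly a client-construction change in the google-genai SDK; the project ID, region and model name below are placeholders:

```python
# Sketch: pointing the same SDK at Vertex AI instead of an AI Studio API key.
# Project ID, region and model name are placeholders.
from google import genai

client = genai.Client(
    vertexai=True,
    project="my-gcp-project",   # placeholder project ID
    location="us-central1",     # placeholder region
)

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model name
    contents="Draft a triage checklist for inbound support tickets.",
)
print(response.text)
```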
Deep Think, the top reasoning mode, will become available after further safety checks. Depending on your region and tier, look for an explicit toggle or setting, along with usage guidance and quotas.
Why Gemini 3 and Deep Think Matter for Users Now
The pitch for Gemini 3 is not just higher benchmark scores; it’s fewer dead ends. Stronger reasoning and more effective multimodal grounding mean less time spent cajoling the model and more time getting answers. For businesses, the intersection of agentic tooling and Search integration hints at practical wins: faster support triage, smarter internal search, automations that can actually close the loop.
The competition is fierce: community leaderboards like LMArena are still pressure-testing the claims, and rival labs haven’t stopped iterating. But with Gemini 3 woven into Search, apps and cloud tooling from day one, Google’s bet is clear: make advanced reasoning available where people already work, then reserve Deep Think for the hardest tasks when they need it.
