Gemini 3, our most intelligent model, is now available for developers to build with via the Gemini API. To support its state-of-the-art reasoning, autonomous coding, multimodal understanding, and powerful agentic capabilities, we have rolled out several updates to the Gemini API. These changes are designed to give you more control over how the model reasons, how it processes media, and how it interacts with the outside world.
Here is what's new in the Gemini API for Gemini 3:
- Simplified parameters for thinking control: Starting with Gemini 3, we are introducing a new parameter called thinking_level to control the maximum depth of the model's thinking process before it produces a response. Gemini 3 treats these levels as relative guidelines for reasoning rather than strict token guarantees. Set it to "high" for complex tasks that require optimal thinking (e.g. strategic business analysis, or scanning code for vulnerabilities), or to "low" for latency- and cost-sensitive applications such as structured data extraction and summarization (see the first sketch after this list). Read more here.
- Granular control over multimodal vision processing: The media_resolution parameter lets you configure how many tokens are used for image, video, and document inputs, allowing you to balance visual fidelity against token usage. The resolution can be set to media_resolution_low, media_resolution_medium, or media_resolution_high, either per individual media part or globally (see the second sketch after this list). If unspecified, the model uses optimal defaults based on the media type. Higher resolutions improve the model's ability to read fine text or identify small details, but increase token usage and latency.
- Thought signatures to improve function calling and image generation performance: Starting with Gemini 3, we are enforcing the return of "Thought Signatures". These are encrypted representations of the model's internal thought process. By passing these signatures back to the model in subsequent API calls, you ensure that Gemini 3 maintains its chain of reasoning across a conversation (see the third sketch after this list). This is critical for complex, multi-step agentic workflows where preserving the "why" behind a decision is just as important as the decision itself. If you use the official SDKs and standard chat history, thought signatures are handled automatically for you. If you call the API directly, validation works as follows:
- Function calling has strict validation on the "current turn". Missing signatures will result in a 400 error. To understand how signatures appear in various function calling scenarios, read here.
- For text/chat generation, validation is not strictly enforced, but omitting signatures will degrade the model's reasoning and answer quality.
- Image generation and editing have strict validation for all model parts: every part must include its thoughtSignature, and missing signatures will result in a 400 error.
- Grounding and URL Context with Structured Outputs: You can now combine Gemini hosted tools, specifically Grounding with Google Search and URL context, with structured outputs (see the final sketch after this list). This is especially powerful for building agents that need to fetch live information from the web or from specific webpages and extract that data into a precise JSON format for downstream tasks.
- Updates to Grounding with Google Search pricing: To better support dynamic agentic workflows, we are transitioning our pricing model from a flat rate (US$35/1k prompts) to a more granular, usage-based rate of US$14 per 1,000 search queries.
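To make the thinking control concrete, here is a minimal sketch assuming a recent google-genai Python SDK where thinking_level is exposed on ThinkingConfig; the model ID and prompt are illustrative, so check the Gemini 3 documentation for the exact names available to you.

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# "low" trades reasoning depth for lower latency and cost; use "high"
# for complex tasks that benefit from deeper thinking.
response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Extract the invoice number, date, and total from: ...",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="low"),
    ),
)
print(response.text)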
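For media resolution, a sketch of the global setting, again assuming the google-genai Python SDK and its MediaResolution enum; invoice.png is a placeholder file, and per-part overrides follow the same naming.

```python
from google import genai
from google.genai import types

client = genai.Client()

with open("invoice.png", "rb") as f:  # placeholder image
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Read the fine print at the bottom of this invoice.",
    ],
    config=types.GenerateContentConfig(
        # Global setting; the resolution can also be set per media part.
        media_resolution=types.MediaResolution.MEDIA_RESOLUTION_HIGH,
    ),
)
print(response.text)
```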
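The thought-signature requirement mostly matters when you manage conversation history yourself. The sketch below, assuming the google-genai Python SDK and a hypothetical get_weather function, shows the key practice: append the model's returned content to your history unmodified, so the thought_signature attached to the function-call part travels back with the next request.

```python
from google import genai
from google.genai import types

client = genai.Client()

weather_tool = types.Tool(function_declarations=[
    types.FunctionDeclaration(
        name="get_weather",
        description="Get the current weather for a city.",
        parameters=types.Schema(
            type=types.Type.OBJECT,
            properties={"city": types.Schema(type=types.Type.STRING)},
            required=["city"],
        ),
    )
])
config = types.GenerateContentConfig(tools=[weather_tool])

history = [types.Content(role="user", parts=[types.Part(text="Weather in Paris?")])]
response = client.models.generate_content(
    model="gemini-3-pro-preview", contents=history, config=config
)

# Append the model turn exactly as returned: its function-call part carries
# the thought_signature that must be sent back on the next turn.
history.append(response.candidates[0].content)

call = response.function_calls[0]
history.append(types.Content(
    role="user",
    parts=[types.Part.from_function_response(
        name=call.name,
        response={"temp_c": 18, "condition": "cloudy"},  # stubbed tool result
    )],
))

final = client.models.generate_content(
    model="gemini-3-pro-preview", contents=history, config=config
)
print(final.text)
```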
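Finally, a sketch of combining Grounding with Google Search and structured outputs, assuming the google-genai Python SDK's Pydantic-based response schemas; the Headline model and the prompt are illustrative.

```python
from pydantic import BaseModel
from google import genai
from google.genai import types

class Headline(BaseModel):
    title: str
    source: str

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Find three of today's top AI headlines.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
        # URL context works the same way: types.Tool(url_context=types.UrlContext())
        response_mime_type="application/json",
        response_schema=list[Headline],
    ),
)
print(response.text)  # JSON array matching the Headline schema
```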
Best practices for using Gemini 3 Pro through our APIs
We have seen wide excitement for Gemini 3 Pro, especially for vibe coding, zero-shot generation, mathematical problem solving, complex multimodal understanding, and a variety of other use cases. The best practices below will help you get the best results while pushing the boundaries of Gemini 3. More details here.
- Temperature: We strongly recommend keeping the temperature parameter at its default value of 1.0.
- Consistency & Defined Parameters: Maintain a uniform structure throughout your prompts (e.g., standardized XML tags) and explicitly define ambiguous terms.
- Output Verbosity: By default, Gemini 3 is less verbose and prefers providing direct, efficient answers. If you require a more conversational or "chatty" response, you must explicitly ask for it.
- Multimodal Coherence: Text, images, audio, and video should all be treated as equal-class inputs. Instructions should reference specific modalities clearly to ensure the model synthesizes across them rather than analyzing them in isolation.
- Constraint Placement: Place behavioral constraints and role definitions in the System Instruction or at the very top of the prompt to ensure they anchor the model's reasoning process.
- Long Context Structure: When working with large contexts (books, codebases, long videos), place your specific instructions at the end of the prompt, after the data context (see the sketch after this list).
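As a sketch of the last two practices, again assuming the google-genai Python SDK: the role definition and behavioral constraints sit in system_instruction, and with a large context the specific question comes after the data (contract.txt is a placeholder document).

```python
from google import genai
from google.genai import types

client = genai.Client()

with open("contract.txt") as f:  # placeholder long document
    long_document = f.read()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        long_document,  # data context first
        "List every termination clause in the contract above.",  # instruction last
    ],
    config=types.GenerateContentConfig(
        # Behavioral constraints and role definitions anchor reasoning here.
        system_instruction="You are a contracts analyst. Cite clause numbers.",
    ),
)
print(response.text)
```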
Gemini 3 Pro is our most advanced model for agentic coding. To help developers get the most out of its capabilities, we've worked with our research team to create a System Instructions template for the model that improved performance on several agentic benchmarks.
To start building with these new features, check out the Gemini 3 documentation and read the Developer Guide for technical implementation details.