The Gemini API offers developers a streamlined way to build innovative applications with cutting-edge generative AI models. Google AI Studio makes it easy to test the API's capabilities, allowing for rapid prototyping and experimentation with text, image, and even video prompts. When developers want to test and build at scale, they can use the full set of capabilities available through the Gemini API.



New models available through the API

Gemini 2.5 Flash Preview - We’ve added a new 2.5 Flash preview (gemini-2.5-flash-preview-05-20) which is better than the previous preview at reasoning, code, and long context. This version of 2.5 Flash is currently #2 on the LMArena leaderboard, behind only 2.5 Pro. We’ve also improved Flash cost-efficiency with this latest update, reducing the number of tokens needed for the same performance and resulting in 22% efficiency gains on our evals. Our goal is to keep improving based on your feedback, and to make both 2.5 Flash and 2.5 Pro generally available soon.

Gemini 2.5 Pro and Flash text-to-speech (TTS) - We also announced 2.5 Pro and Flash previews for text-to-speech (TTS) that support native audio output for both single and multiple speakers, across 24 languages. With these models, you can control TTS expression and style, creating rich audio output. With multispeaker, you can generate conversations with multiple distinct voices for dynamic interactions.
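To make the multi-speaker setup more concrete, here is a minimal sketch that builds a JSON request body for a two-speaker conversation using only the Python standard library. The field names (`responseModalities`, `speechConfig`, `multiSpeakerVoiceConfig`, `speakerVoiceConfigs`, `prebuiltVoiceConfig`) are assumed to match the public REST reference at the time of writing and should be checked against the current docs. The helper `build_tts_request`, the speaker names, and the voice names are illustrative assumptions. The sketch makes no network call.

```python
import json


def build_tts_request(script: str, voices: dict[str, str]) -> dict:
    """Build a generateContent-style body asking for multi-speaker audio.

    `voices` maps each speaker name used in `script` to a prebuilt voice name.
    Field names are assumptions based on the public REST reference.
    """
    if not voices:
        raise ValueError("at least one speaker is required")
    speaker_configs = [
        {
            "speaker": speaker,
            "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}},
        }
        for speaker, voice in voices.items()
    ]
    return {
        "contents": [{"parts": [{"text": script}]}],
        "generationConfig": {
            # Ask for audio rather than text in the response.
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "multiSpeakerVoiceConfig": {"speakerVoiceConfigs": speaker_configs}
            },
        },
    }


# Hypothetical script and voice names, for illustration only.
script = (
    "TTS the following conversation:\n"
    "Joe: How's it going today?\n"
    "Jane: Not too bad, how about you?"
)
body = build_tts_request(script, {"Joe": "Kore", "Jane": "Puck"})
print(json.dumps(body["generationConfig"], indent=2))
```

In practice a body like this would be sent to one of the TTS preview models. Expression and style are steered through natural-language direction in the prompt text itself.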

Gemini 2.5 Flash native audio dialog - In preview, this model is available via the Live API to generate natural-sounding voices for conversation, in over 30 distinct voices and 24+ languages. We’ve also added proactive audio, which lets the model distinguish between the speaker and background conversations so that it knows when to respond. In addition, the model responds appropriately to a user's emotional expression and tone. A separate thinking model enables more complex queries. Together, these make it possible for you to build conversational AI agents and experiences that feel more intuitive and natural, such as enhancing call center interactions, developing dynamic personas, crafting unique voice characters, and more.

Lyria RealTime - Live music generation is now available in the Gemini API and Google AI Studio to create a continuous stream of instrumental music using text prompts. With Lyria RealTime, we use WebSockets to establish a persistent, real-time communication channel. The model continuously produces music in small, flowing chunks and adapts based on inputs. Imagine adding a responsive soundtrack to your app or designing a new type of musical instrument! Try out Lyria RealTime with the PromptDJ-MIDI app in Google AI Studio.

Gemini 2.5 Pro Deep Think - We are also testing an experimental reasoning mode for 2.5 Pro. We’ve seen incredible performance with these Deep Think capabilities on highly complex math and coding prompts. We look forward to making it broadly available for you to experiment with soon.

Gemma 3n - Gemma 3n is a generative AI open model optimized for use in everyday devices, such as phones, laptops, and tablets. It can handle text, audio and vision inputs. This model includes innovations in parameter-efficient processing, including Per-Layer Embedding (PLE) parameter caching and a MatFormer model architecture that provides the flexibility to reduce compute and memory requirements.



New functionality in the API

Thought summaries

To help developers understand and debug model responses, we’ve added thought summaries for 2.5 Pro and Flash in the Gemini API. We take the model’s raw thoughts and synthesize them into a helpful summary with headers, relevant details, and tool calls. The raw chain-of-thought display in Google AI Studio has also been updated with the new thought summaries.
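As a rough sketch of how thought summaries might be consumed when debugging, the snippet below separates summary text from answer text in a response that has already been parsed into a Python dict. It assumes that summary parts carry a boolean `thought` field alongside `text`, and that summaries are returned only when the request asks for them. Both points should be confirmed against the current API reference. The helper `split_thoughts` and the sample response are hypothetical.

```python
def split_thoughts(response: dict) -> tuple[str, str]:
    """Return (thought_summary, answer) from a generateContent-style response.

    Assumes thought-summary parts are flagged with "thought": True.
    Only the first candidate is inspected.
    """
    thoughts, answer = [], []
    for candidate in response.get("candidates", [])[:1]:
        for part in candidate.get("content", {}).get("parts", []):
            text = part.get("text")
            if text is None:
                continue  # skip non-text parts such as function calls
            (thoughts if part.get("thought") else answer).append(text)
    return "".join(thoughts), "".join(answer)


# Hand-written sample for illustration, not real model output.
sample = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"text": "Checking divisibility of 91 by small primes.", "thought": True},
                    {"text": "91 is not prime: 91 = 7 x 13."},
                ]
            }
        }
    ]
}

summary, answer = split_thoughts(sample)
print("Summary:", summary)
print("Answer:", answer)
```

Keeping the two streams separate makes it easy to log summaries for debugging while showing only the answer to end users.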



Thinking budgets

We launched 2.5 Flash with thinking budgets to give developers control over how much the model thinks, so they can balance performance, latency, and cost for the apps they are building. We will be extending this capability to 2.5 Pro soon.
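To illustrate how a thinking budget could be attached to a request, the sketch below prepares (but does not send) an HTTP request using only the Python standard library. The endpoint path, the `x-goog-api-key` header, and the `generationConfig.thinkingConfig.thinkingBudget` field are assumed to match the public REST reference and should be verified there. The helper `build_request` and the placeholder API key are hypothetical.

```python
import json
import urllib.request

# Assumed REST root; check the current API reference before relying on it.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(
    prompt: str,
    thinking_budget: int,
    model: str = "gemini-2.5-flash-preview-05-20",
    api_key: str = "YOUR_API_KEY",  # placeholder, not a real key
) -> urllib.request.Request:
    """Prepare a generateContent request that caps thinking tokens."""
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be non-negative")
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        },
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_request("Summarize the trade-offs of caching.", thinking_budget=1024)
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it. A smaller budget trades reasoning depth for lower latency and cost, while a larger one gives the model more room on harder prompts.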