Start building with Gemini 2.0 Flash and Flash-Lite

FEB. 25, 2025

Logan Kilpatrick Group Product Manager

Shrestha Basu Mallick Product Google DeepMind

Since the launch of the Gemini 2.0 Flash model family, developers are discovering new use cases for this highly efficient family of models. Gemini 2.0 Flash offers stronger performance over 1.5 Flash and 1.5 Pro, plus simplified pricing that makes our 1 million token context window more affordable.

Today, Gemini 2.0 Flash-Lite is now generally available in the Gemini API for production use in Google AI Studio and for enterprise customers on Vertex AI. 2.0 Flash-Lite offers improved performance over 1.5 Flash across reasoning, multimodal, math and factuality benchmarks. For projects that require long context windows, 2.0 Flash-Lite is an even more cost-effective solution, with simplified pricing for prompts more than 128K tokens.

Developers are already leveraging the speed, efficiency, and cost-effectiveness of the 2.0 Flash family to build incredible applications. Here are a few examples:

1. Voice AI

Building effective conversational AI, particularly voice assistants, requires both speed and accuracy. A fast Time-to-First-Token (TTFT) is essential for creating a natural, responsive feel, alongside the ability to handle complex instructions and interact with other systems via function calling.

Daily is leveraging Gemini 2.0 Flash-Lite to help developers create cutting-edge voice AI experiences. Using their open-source, vendor agnostic Pipecat framework for voice and multimodal conversational agents, Daily has created a system instruction code demo to reliably detect voicemail systems and tailor messages accordingly.

Gemini 2.0 Flash-Lite, with the above system instruction, performs significantly better than current specialized commercial models for detecting voicemail.

2. Data analytics

Dawn is revolutionizing how engineering teams monitor their AI products in production by providing deep, meaningful insights powered by Gemini 2.0 Flash. Dawn's "semantic monitoring" pipeline allows engineering teams to instantly search massive streams of user interactions to find any behavior they’re looking for—like user frustration, conversation length, and user feedback—and continuously track them as ongoing issues or topics to identify anomalies and hidden problems in production.

With Gemini 2.0 Flash's simplified pricing, reliable structured outputs, and extended context capabilities, Dawn was able significantly reduce search times (from hours to just under a minute) by switching models, cut costs by more than 90%, and see increased reliability across evals and production monitoring.

Gemini 2.0 Flash makes Dawn’s semantic monitoring faster, more reliable, and cost effective.

3. Video editing

Mosaic is transforming complex, time-consuming video editing tasks with a new, agentic paradigm that uses Gemini 2.0 Flash. Their solution incorporates multimodal editing agents that use Gemini 2.0 Flash’s long-context capabilities to accelerate mundane video editing tasks from hours to seconds so you can do things like clip YouTube Shorts from any part of a long form video with just a prompt.

The new simplified pricing for Gemini 2.0 Flash of $0.10 per 1 million input tokens in Google AI Studio makes huge context windows 33% more affordable, opening up new possibilities for AI-driven video editing workflows.

Using Gemini 2.0 Flash, Mosaic’s agentic workflow cuts and edits a YouTube Short from a recent episode of Release Notes.

Start building with Gemini 2.0 Flash and 2.0 Flash-Lite

We’re excited by what the Gemini 2.0 Flash family of models is enabling for developers like Daily.co, Mosaic, and Dawn. Whether you're working on voice assistants, video editing tools, or something completely new, we hope the Gemini 2.0 Flash family provides the performance and affordability you need. Start building today in Google AI Studio.