Following the exciting launches of Gemma 3 and Gemma 3 QAT, our family of state-of-the-art open models capable of running on a single cloud or desktop accelerator, we're pushing our vision for accessible AI even further. Gemma 3 delivered powerful capabilities for developers, and we're now extending that vision to highly capable, real-time AI operating directly on the devices you use every day – your phones, tablets, and laptops.
To power the next generation of on-device AI and support a diverse range of applications, including advancing the capabilities of Gemini Nano, we engineered a new, cutting-edge architecture. This next-generation foundation was created in close collaboration with mobile hardware leaders like Qualcomm Technologies, MediaTek, and Samsung System LSI, and is optimized for lightning-fast, multimodal AI, enabling truly personal and private experiences directly on your device.
Gemma 3n is our first open model built on this groundbreaking, shared architecture, allowing developers to begin experimenting with this technology today in an early preview. The same advanced architecture also powers the next generation of Gemini Nano, which brings these capabilities to a broad range of features in Google apps and our on-device ecosystem, and will become available later this year. Gemma 3n enables you to start building on this foundation that will come to major platforms such as Android and Chrome.
Gemma 3n leverages a Google DeepMind innovation called Per-Layer Embeddings (PLE) that delivers a significant reduction in RAM usage. While the raw parameter counts are 5B and 8B, this innovation allows you to run larger models on mobile devices (or live-stream them from the cloud) with a memory overhead comparable to a 2B or 4B model, meaning the models can operate with a dynamic memory footprint of just 2GB and 3GB respectively. Learn more in our documentation.
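To make the memory claim concrete, here is a back-of-the-envelope sketch of the idea: if the per-layer embedding parameters can be streamed in on demand rather than held resident, the weights that must stay in memory shrink accordingly. The quantization level and the share of parameters assigned to per-layer embeddings below are illustrative assumptions, not published figures.

```python
# Illustrative arithmetic only: BYTES_PER_PARAM and the PLE parameter split
# are assumptions for the sake of the sketch, not official Gemma 3n numbers.

BYTES_PER_PARAM = 0.5  # assume 4-bit quantized weights


def resident_footprint_gb(total_params_billions: float,
                          ple_params_billions: float) -> float:
    """Estimate the RAM (in GB) needed for the weights that must stay
    resident, assuming per-layer embedding parameters are streamed in
    on demand instead of being held in memory."""
    resident_billions = total_params_billions - ple_params_billions
    # 1e9 params * bytes-per-param / 1e9 bytes-per-GB cancels out:
    return resident_billions * BYTES_PER_PARAM


# e.g. a hypothetical 5B-parameter model with 3B of its parameters in
# per-layer embeddings would need roughly the resident weight memory
# of a 2B model:
print(resident_footprint_gb(5.0, 3.0))  # 1.0 GB under these assumptions
```

The point of the sketch is only that resident memory tracks the non-PLE parameter count, which is why a 5B or 8B model can behave, memory-wise, like a much smaller one.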
By exploring Gemma 3n, developers can get an early preview of the open model’s core capabilities and mobile-first architectural innovations that will be available on Android and Chrome with Gemini Nano.
In this post, we'll explore Gemma 3n's new capabilities, our approach to responsible development, and how you can access the preview today.
Gemma 3n is engineered for fast, low-footprint AI experiences that run locally.
Gemma 3n will empower a new wave of intelligent, on-the-go applications by enabling developers to:
1. Power deeper understanding and contextual text generation using combined audio, image, video, and text inputs—all processed privately on-device.
2. Develop advanced audio-centric applications, including real-time speech transcription, translation, and rich voice-driven interactions.
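The real on-device APIs will have their own interfaces, so the sketch below only illustrates, with stand-in classes, how a combined audio/image/text request might be shaped for private local inference. `OnDeviceModel` and `MultimodalPrompt` are hypothetical names, not part of any shipped SDK.

```python
# A minimal sketch of composing a multimodal, on-device request.
# Both classes are hypothetical stand-ins; real runtimes (e.g. the
# Google AI Edge stack) expose their own, different interfaces.
from dataclasses import dataclass, field


@dataclass
class MultimodalPrompt:
    text: str
    image_paths: list = field(default_factory=list)
    audio_paths: list = field(default_factory=list)


class OnDeviceModel:
    """Stand-in for a local inference runtime. A real implementation
    would run the model on-device, so no inputs leave the device."""

    def generate(self, prompt: MultimodalPrompt) -> str:
        # Echo the request shape instead of running real inference.
        parts = [f"text:{prompt.text!r}"]
        parts += [f"image:{p}" for p in prompt.image_paths]
        parts += [f"audio:{p}" for p in prompt.audio_paths]
        return " | ".join(parts)


model = OnDeviceModel()
reply = model.generate(MultimodalPrompt(
    text="Transcribe the audio and describe the photo.",
    image_paths=["photo.jpg"],
    audio_paths=["clip.wav"],
))
print(reply)
```

The design point is simply that text, image, and audio inputs travel together in one request to a model that runs entirely on the device.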
Here’s a video overview of the types of experiences you can build.
Our commitment to responsible AI development is paramount. Gemma 3n, like all Gemma models, underwent rigorous safety evaluations, data governance, and fine-tuning alignment with our safety policies. We approach open models with careful risk assessment, continually refining our practices as the AI landscape evolves.
We're excited to get Gemma 3n into your hands through a preview, with initial access available starting today.
Gemma 3n marks the next step in democratizing access to cutting-edge, efficient AI. We’re incredibly excited to see what you’ll build as we make this technology progressively available, starting with today's preview.
Explore this announcement and all Google I/O 2025 updates on io.google starting May 22.