Today, we are giving developers access to the 2 million token context window for Gemini 1.5 Pro, code execution capabilities in the Gemini API, and Gemma 2 in Google AI Studio.
At I/O, we announced our longest-ever context window, 2 million tokens in Gemini 1.5 Pro, available behind a waitlist. Today, we're opening up access to the 2 million token context window on Gemini 1.5 Pro for all developers.
As the context window grows, so does the potential input cost. To help developers cut costs on tasks that send the same tokens across multiple prompts, we've launched context caching in the Gemini API for both Gemini 1.5 Pro and 1.5 Flash.
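The savings come from billing cached tokens at a reduced rate instead of re-billing the full input rate on every request. As a rough sketch of that arithmetic, using hypothetical placeholder prices (not real Gemini API rates, and ignoring cache storage fees):

```python
# Rough cost comparison for context caching.
# All per-token prices below are HYPOTHETICAL placeholders,
# not actual Gemini API rates.

def prompt_cost(shared_tokens: int, requests: int,
                input_price: float, cached_price: float) -> tuple[float, float]:
    """Cost of sending the same shared context with every request,
    without caching vs. with caching (storage fees ignored)."""
    without_cache = shared_tokens * requests * input_price
    # With caching: pay full input price once, then the cached rate
    # for the remaining requests.
    with_cache = (shared_tokens * input_price
                  + shared_tokens * (requests - 1) * cached_price)
    return without_cache, with_cache

# Example: a 500k-token document queried 20 times,
# with cached tokens billed at a quarter of the input price.
full, cached = prompt_cost(500_000, 20, input_price=1e-6, cached_price=0.25e-6)
print(f"without caching: ${full:.2f}, with caching: ${cached:.2f}")
```

The more requests that reuse the same cached context, the closer the effective input cost gets to the cached rate.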
LLMs have historically struggled with math and data-reasoning problems; generating and executing code that works through such problems step by step improves accuracy. To unlock these capabilities for developers, we have enabled code execution for both Gemini 1.5 Pro and 1.5 Flash. Once enabled, the model can dynamically generate and run Python code and learn iteratively from the results until it reaches the desired final output. The execution sandbox is not connected to the internet, comes standard with a few numerical libraries, and developers are billed based on the model's output tokens.
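For instance, asked "what is the sum of the first 50 primes?", the model can write and run a small program in the sandbox rather than guessing at the arithmetic. A sketch of the kind of Python it might generate (the exact code varies from run to run):

```python
# The kind of Python the model might generate and execute in the
# sandbox to answer "what is the sum of the first 50 primes?".

def is_prime(n: int) -> bool:
    """Trial division up to sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

# Collect the first 50 primes, then sum them.
primes = []
candidate = 2
while len(primes) < 50:
    if is_prime(candidate):
        primes.append(candidate)
    candidate += 1

print(sum(primes))  # → 5117
```

Because the model sees the printed result, it can fold the exact value into its final answer instead of hallucinating one.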
This is our first step forward with code execution as a model capability, and it's available today via the Gemini API and in Google AI Studio under “advanced settings”.
We want to make AI accessible to all developers, whether you're integrating our Gemini models via an API key or using our open models like Gemma 2. To help developers get hands-on with Gemma 2, we're making the model available in Google AI Studio for experimentation.
Gemini 1.5 Flash was built to address developers' top requests for speed and affordability, and we continue to be excited by how developers are innovating with Gemini 1.5 Flash and using the model in production.
As we announced last month, we're working hard to make tuning for Gemini 1.5 Flash available to all developers, enabling new use cases, greater production robustness, and higher reliability. Text tuning in 1.5 Flash is now ready for red-teaming and will roll out gradually to developers starting today. All developers will be able to access Gemini 1.5 Flash tuning via the Gemini API and in Google AI Studio by mid-July.
We are excited to see how you use these new features. Join the conversation on our developer forum. If you're an enterprise developer, see how we're making Vertex AI the most enterprise-ready genAI platform.