State-of-the-art text embedding via the Gemini API

MAR 07, 2025
Logan Kilpatrick Senior Product Manager Gemini API and Google AI Studio
Zach Gleicher Product Manager Google DeepMind
Parashar Shah Product Manager Google Cloud

[Image created by Google with Gemini 2.0 Flash native image generation]


Today, we’re making a new experimental Gemini Embedding text model (gemini-embedding-exp-03-07)1 available in the Gemini API.

Trained on the Gemini model itself, this embedding model has inherited Gemini’s understanding of language and nuanced context, making it applicable for a wide range of uses. This new embedding model surpasses our previous state-of-the-art model (text-embedding-004), achieves the top rank on the Massive Text Embedding Benchmark (MTEB) Multilingual leaderboard, and comes with new features like a longer input token limit!


Our most capable text embedding model yet

We've trained our model to be remarkably general, delivering exceptional performance across diverse domains, including finance, science, legal, search, and more. It works effectively out-of-the-box, eliminating the need for extensive fine-tuning for specific tasks.

The MTEB (Multilingual) leaderboard ranks text embedding models across diverse tasks such as retrieval and classification to provide a comprehensive benchmark for model comparison. Our Gemini Embedding model achieves a mean (task) score of 68.32, a margin of +5.81 over the next-best model.

MTEB Leaderboard text model performance ranking
Our new Gemini text embedding model (gemini-embedding-exp-03-07) achieves high scores on the MTEB (Multilingual) leaderboard.

Why embeddings?

From building intelligent retrieval augmented generation (RAG) and recommendation systems to text classification, the ability for LLMs to understand the meaning behind text is crucial. Embeddings are often critical for building more efficient systems, reducing cost and latency while also generally providing better results than keyword matching systems. Embeddings capture semantic meaning and context through numerical representations of data. Data with similar semantic meaning have embeddings that are closer together. Embeddings enable a wide range of applications, including:

  • Efficient Retrieval: Find relevant documents within large databases, like legal document retrieval or enterprise search, by comparing the embeddings of queries and documents.

  • Retrieval-Augmented Generation (RAG): Enhance the quality and relevance of generated text by retrieving and incorporating contextually relevant information into the context of a model.

  • Clustering and Categorization: Group similar texts together, identifying trends and topics within your data.

  • Classification: Automatically categorize text based on its content, such as sentiment analysis or spam detection.

  • Text Similarity: Identify duplicate content, enabling tasks like web page deduplication or plagiarism detection.
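The idea that semantically similar data sits closer together can be made concrete with cosine similarity. A minimal sketch (toy 4-dimensional vectors stand in for real model output, which has 3K dimensions; the example texts and values are illustrative assumptions, not actual model embeddings):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means identical direction, ~0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "cat" and "kitten" point in similar directions,
# while "invoice" points elsewhere.
emb_cat = [0.80, 0.10, 0.05, 0.05]
emb_kitten = [0.75, 0.15, 0.05, 0.05]
emb_invoice = [0.05, 0.05, 0.10, 0.80]

print(cosine_similarity(emb_cat, emb_kitten) > cosine_similarity(emb_cat, emb_invoice))  # True
```

The same comparison underpins the retrieval, clustering, and similarity use cases above: each reduces to measuring distances between embedding vectors.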

You can learn more about embeddings and common AI use cases in the Gemini API docs.


Get started with Gemini Embedding

Developers can now access our new, experimental Gemini Embedding model through the Gemini API. It’s compatible with the existing embed_content endpoint.

from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")

# Embed a single piece of text with the experimental model.
result = client.models.embed_content(
    model="gemini-embedding-exp-03-07",
    contents="How does alphafold work?",
)

print(result.embeddings)
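Once documents are embedded, retrieval reduces to a nearest-neighbor search over their vectors. A minimal offline sketch (the stand-in vectors and document titles are made up so the example runs without an API key; in practice each document and query would be embedded with embed_content as above):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stand-in document embeddings (real ones come from embed_content).
documents = {
    "AlphaFold predicts protein structures": [0.90, 0.10, 0.00],
    "Quarterly revenue grew 12%":            [0.10, 0.90, 0.10],
    "Court upholds patent ruling":           [0.00, 0.20, 0.90],
}

# Stand-in embedding for the query "How does alphafold work?".
query_embedding = [0.85, 0.15, 0.05]

# Rank documents by similarity to the query; the best match comes first.
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query_embedding, documents[d]),
                reverse=True)
print(ranked[0])
```

This is the core loop behind the RAG and enterprise-search use cases described earlier; at scale, the linear scan is typically replaced by an approximate nearest-neighbor index.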

In addition to improved quality across all dimensions, Gemini Embedding also features:

  • Input limit of 8K tokens. We’ve increased the context length over previous models, allowing you to embed large chunks of text, code, or other data.

  • Output of 3K dimensions. High-dimensional embeddings with almost 4x more dimensions than previous embedding models.

  • Matryoshka Representation Learning (MRL): MRL allows you to truncate the original 3K-dimensional embedding to smaller sizes, scaling down to meet your desired storage cost.

  • Expanded language support. We’ve doubled the number of languages supported to over 100.

  • Unified model. This model surpasses the quality of our previous task-specific multilingual, English-only, and code-specific models.
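The MRL property means a prefix of the embedding vector remains a useful embedding on its own. A minimal sketch of scaling down a stored vector (the 6-value list stands in for a real 3K-dimensional embedding; renormalizing keeps cosine scores comparable after truncation):

```python
import math

def truncate_embedding(embedding, dims):
    # Keep the first `dims` values of an MRL-trained embedding, then
    # renormalize to unit length so cosine similarity still behaves well.
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, 0.5, 0.5, 0.5, 0.1, 0.1]  # stand-in for a 3K-dim vector
small = truncate_embedding(full, 4)
print(len(small))  # 4
```

Storing the truncated vectors trades a small amount of quality for proportionally lower storage and faster similarity search.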


While currently in an experimental phase with limited capacity, this release gives you an early opportunity to explore Gemini Embedding capabilities. As with all experimental models, it's subject to change, and we're working towards a stable, generally available release in the months to come. We’d love to hear your feedback on the embeddings feedback form.



1 On Vertex AI, the same model is served through the endpoint “text-embedding-large-exp-03-07.” For general availability, naming will be consistent.