Building agents with Google Gemini and open source frameworks

MAY 20, 2025
Shrestha Basu Mallick, Group Product Manager, Gemini API
Philipp Schmid, Developer Relations Engineer

The world of AI is buzzing with the potential of AI agents: entities that users can direct to perceive their environment, make decisions, and take actions to achieve specific goals. Google's Gemini models, with their advanced reasoning, multimodality, and function calling capabilities, provide a powerful foundation for building AI agents. Coupled with a vibrant ecosystem of open-source frameworks, developers now have the toolkit to create sophisticated agentic applications.

This post shows how to build AI agents with Google Gemini models using popular open-source frameworks, including LangGraph, CrewAI, LlamaIndex, and Composio. We touch on each framework's strengths and the scenarios it suits best.


Why Google Gemini models for your agents?

Gemini models, including the latest Gemini 2.5, offer several advantages for agent development:

  • Advanced Reasoning & Planning: Gemini models excel at logical reasoning and can break down complex tasks into manageable steps, crucial for agentic workflows.

  • Function Calling: Gemini models' native function calling allows agents to interact seamlessly with external tools, APIs, and data sources, enabling them to perform real-world actions.

  • Multimodality: The ability to process and understand various data types (text, images, audio, video, code) opens up new possibilities for agents that can interact with the world in richer ways.

  • Large Context Window: Models like Gemini 2.5 can process up to 1 million tokens (2 million coming soon), allowing agents to maintain context over extended interactions and complex tasks.
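
To make the function calling idea above concrete, here is a minimal sketch of the dispatch step an agent performs once a model has returned a structured function call. It is not the Gemini API: `fake_model`, `get_weather`, and the `{"name": ..., "args": ...}` shape are hypothetical stand-ins chosen for illustration, and a real agent would receive the call from the model and send the tool result back to it.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> Dict[str, Any]:
    """Hypothetical tool; a real agent would call an external API here."""
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names (as the model would emit them) to Python callables.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_weather": get_weather}


def fake_model(prompt: str) -> Dict[str, Any]:
    """Stand-in for a model response that contains a structured function call."""
    return {"name": "get_weather", "args": {"city": "Paris"}}


def run_function_call(call: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the requested tool and invoke it with the supplied arguments."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["args"])


if __name__ == "__main__":
    call = fake_model("What's the weather in Paris?")
    print(json.dumps(run_function_call(call)))
```

Keeping the tools in an explicit registry means the agent only ever executes functions you have chosen to expose, whatever name the model asks for.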


Agentic Open Source Frameworks: A Quick Overview

The choice of framework often depends on the specific requirements of your agent or use cases. Below are some popular options, each offering different strengths and approaches to agent development.

LangGraph

LangGraph, an extension of LangChain, allows you to build stateful, multi-actor applications by representing workflows as graphs. Each node in the graph is a step (e.g., an LLM call or a tool execution), and edges define the flow of control. LangGraph is excellent for complex, stateful workflows where visibility and control over the agent's reasoning process are critical. When using Google Gemini models with LangGraph, you can benefit from their advanced reasoning and function calling at each step, enabling iterative reflection and tool use. Get started with LangChain or LangGraph.
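
The graph idea can be sketched in a few lines of plain Python. This is not the LangGraph API, only an illustration of nodes as steps, edges as control flow, and a state object passed between them. The node names (`plan`, `act`, `reflect`) and the stopping rule are assumptions made for the example, and each node stands in for what would be a model call or tool execution.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]


def plan(state: State) -> State:
    # Stand-in for a model call that decides what to do.
    return {**state, "plan": f"look up: {state['question']}"}


def act(state: State) -> State:
    # Stand-in for a tool execution; counts how often it has run.
    attempts = int(state.get("attempts", 0)) + 1
    return {**state, "attempts": attempts, "answer": f"result after {attempts} attempt(s)"}


def reflect(state: State) -> State:
    # Stand-in for a reflection step; here it simply accepts the second attempt.
    return {**state, "done": int(state["attempts"]) >= 2}


NODES: Dict[str, Callable[[State], State]] = {"plan": plan, "act": act, "reflect": reflect}


def next_node(current: str, state: State) -> Optional[str]:
    """Edges: plan -> act -> reflect, then loop back to act until done."""
    if current == "plan":
        return "act"
    if current == "act":
        return "reflect"
    if current == "reflect":
        return None if state["done"] else "act"
    raise ValueError(f"unknown node: {current}")


def run_graph(state: State, start: str = "plan", max_steps: int = 10) -> State:
    """Walk the graph from `start`, threading the state through each node."""
    node: Optional[str] = start
    steps = 0
    while node is not None and steps < max_steps:
        state = NODES[node](state)
        node = next_node(node, state)
        steps += 1
    return state


if __name__ == "__main__":
    print(run_graph({"question": "What is the capital of France?"}))
```

Because the state is explicit and every transition goes through `next_node`, each step of the agent's loop can be inspected or logged, which is the kind of visibility the graph model is meant to provide.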

CrewAI

CrewAI is designed for orchestrating autonomous AI agents that collaborate to achieve complex goals. It simplifies the development of multi-agent systems by allowing you to define agents with specific roles, goals, and backstories, and then assign tasks to them. CrewAI integrates with Google Gemini models. By powering your CrewAI agents with Gemini models, you can use their strong reasoning and language understanding for each agent's specialized role, enabling more effective collaboration and task execution. Get started with CrewAI.
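
As a rough illustration of the role-based pattern described here, the sketch below models agents with a role, goal, and backstory, and runs tasks in sequence so each agent builds on the previous output. It is plain Python, not the CrewAI API: the `Agent`, `Task`, and `run_crew` names are hypothetical, and the `work` callable stands in for a model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    role: str
    goal: str
    backstory: str
    work: Callable[[str], str]  # stand-in for a model call made in this role


@dataclass
class Task:
    description: str
    agent: Agent


def run_crew(tasks: List[Task]) -> str:
    """Run tasks in order, passing each agent's output to the next as context."""
    context = ""
    for task in tasks:
        prompt = f"{task.description}\n{context}".strip()
        context = task.agent.work(prompt)
    return context


if __name__ == "__main__":
    researcher = Agent(
        role="Researcher",
        goal="Collect relevant facts",
        backstory="A careful analyst",
        work=lambda prompt: f"notes: {prompt}",
    )
    writer = Agent(
        role="Writer",
        goal="Produce a short summary",
        backstory="A concise technical writer",
        work=lambda prompt: prompt.upper(),
    )
    print(run_crew([Task("find facts", researcher), Task("write summary", writer)]))
```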

LlamaIndex

LlamaIndex is a framework designed for building knowledge agents using LLMs connected to your data. It excels at data ingestion, indexing, and providing retrieval capabilities, letting developers create multi-agent workflows that can automate different types of knowledge work. LlamaIndex offers direct integrations with Gemini models, allowing you to use Gemini for embedding generation, advanced retrieval strategies, and synthesizing responses based on your private data. This is crucial for creating agents that can reason over and answer questions about information not present in the LLM's general training data. LlamaIndex supports both text-only and multimodal Gemini models, enabling RAG over text and images. Get started with LlamaIndex.
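
The retrieval step behind this kind of knowledge agent can be sketched without any framework. The example below is an illustration only and does not use the LlamaIndex API: `embed` is a toy bag-of-words stand-in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names. It ranks documents by cosine similarity to the query and places the best matches in the prompt that would be sent to the model.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(docs, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: List[str], k: int = 1) -> str:
    """Assemble a prompt that grounds the answer in the retrieved context."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    documents = [
        "Invoices are due within 30 days",
        "The office closes at 6 pm",
        "Refunds take 5 business days",
    ]
    print(build_prompt("when are invoices due", documents))
```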

Composio

Composio is a framework focused on simplifying the integration of external tools and APIs into AI agents. It provides a managed layer for authentication and execution of a wide range of pre-built tools, effectively acting as a universal connector for your agents. This allows developers to quickly give their agents capabilities to interact with services like GitHub, Slack, Google Workspace, Notion, and many others, without needing to manage individual API authentications or build custom tool wrappers. Composio with Google Gemini models leverages Gemini's function calling capabilities to intelligently select and utilize these tools, enabling your agents to perform a vast array of real-world tasks. Get started with Composio.
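
The "managed layer" idea can be illustrated with a small connector that stores credentials per service and injects them when a tool runs, so the agent itself never handles tokens. This is a sketch under stated assumptions, not the Composio API: the `Connector` class, the `service.action` naming scheme, and the `chat.post` tool are all hypothetical.

```python
from typing import Any, Callable, Dict


class Connector:
    """Hypothetical managed layer: holds credentials and runs tools on the agent's behalf."""

    def __init__(self) -> None:
        self._credentials: Dict[str, str] = {}
        self._tools: Dict[str, Callable[..., Any]] = {}

    def connect(self, service: str, token: str) -> None:
        """Store the credential for a service once, outside the agent loop."""
        self._credentials[service] = token

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        """Register a tool as 'service.action'; fn receives the token as its first argument."""
        self._tools[name] = fn

    def execute(self, name: str, **args: Any) -> Any:
        """Run a tool by name, injecting the stored credential for its service."""
        service = name.split(".", 1)[0]
        if service not in self._credentials:
            raise PermissionError(f"service not connected: {service}")
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](self._credentials[service], **args)


def post_message(token: str, channel: str, text: str) -> Dict[str, Any]:
    # Stand-in for an authenticated API request to a chat service.
    return {"authorized": bool(token), "channel": channel, "text": text}


if __name__ == "__main__":
    connector = Connector()
    connector.connect("chat", "example-token")
    connector.register("chat.post", post_message)
    # The model's function call would supply only the tool name and arguments.
    print(connector.execute("chat.post", channel="#dev", text="Build passed"))
```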


Best practices and next steps

Ready to start building AI agents with Google Gemini models today? Here's how:

  • Define Purpose & Scope: Start with a well-defined goal and the tasks your agent needs to perform.

  • Iterate and Refine Continuously: Agent development is iterative. Start simple, test often, and refine prompts, tools, and logic.

  • Explore Advanced Agentic Patterns: Investigate agentic patterns such as self-correction, dynamic planning, and memory to build more robust agents, using our advanced agent design resources.

  • Master Prompt Engineering: Effective prompts are key to unlocking Gemini's agentic capabilities. Take a look at our prompting best practices.

  • Learn & Integrate: Dive into function calling and the comprehensive end-to-end example of how to build agents with Google Gemini models.


Explore this announcement and all Google I/O 2025 updates on io.google starting May 22.