Data Science Agent in Colab: The future of data analysis with Gemini

MAR 03, 2025
Jane Fine Senior Product Manager
Mahi Kolla Associate Product Manager
Ilai Soloducho Senior Technical Program Manager

Google Colab is a free, cloud-hosted Jupyter Notebook environment where you can write and run Python code directly in your browser. It provides free access to Google Cloud GPUs and TPUs for running AI models, and it simplifies project collaboration.

In December, we shared how the Data Science Agent in Colab uses Gemini to create notebooks for trusted testers, removing tedious setup tasks like importing libraries, loading data, and writing boilerplate code. Trusted testers are enthusiastic about the Data Science Agent, reporting that they can streamline workflows and uncover insights faster than before.

Today, we’re excited to bring Data Science Agent to Colab users ages 18+ in select countries and languages. This expands on our university partnerships, which help research labs save time on data processing and analysis by generating complete, working Colab notebooks from simple natural language descriptions.


Here's how the Data Science Agent works:

1. Start fresh: Open a blank Colab notebook.

2. Add your data: Upload your data file.

3. Describe your goals: In the Gemini side panel, describe the kind of analysis or prototype you want to build (e.g., “Visualize trends”, “Build and optimize a prediction model”, “Fill in missing values”, “Select the best statistical technique”).

4. Watch the Data Science Agent get to work: Sit back and watch as the necessary code, library imports, and analysis are generated in a working Colab notebook.

Data Science Agent automating analysis, from understanding the data to delivering insights in a working Colab notebook (Sequences shortened. Results for illustrative purposes. Data Science Agent may make mistakes.)
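To make the steps above more concrete, here is a rough sketch of the kind of setup and cleaning code that a prompt like “Fill in missing values” is meant to save you from writing by hand. It is not the agent's actual output. The inline CSV, its column names, and the `load_and_clean` helper are all hypothetical, and median filling is just one simple choice among many.

```python
import io

import pandas as pd

# Hypothetical stand-in for an uploaded data file; the values are made up.
RAW_CSV = """day,temperature,sales
1,21.5,100
2,,110
3,23.0,
4,24.5,135
5,22.0,120
"""


def load_and_clean(csv_text: str) -> pd.DataFrame:
    """Load CSV text and fill missing numeric values with each column's median."""
    df = pd.read_csv(io.StringIO(csv_text))
    numeric_cols = df.select_dtypes(include="number").columns
    # Median filling is an assumption made for this sketch, not a recommendation.
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    return df


df = load_and_clean(RAW_CSV)
print(df)
print(df.describe())
```

In a real notebook you would read your uploaded file with `pd.read_csv` instead of the inline string, then review whether the chosen fill strategy suits your data.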

Data Science Agent benefits

  • Fully functional Colab notebooks: Not just code snippets, but complete, executable notebooks.

  • Modifiable solutions: Easily customize and extend the generated code to fit your specific needs.

  • Sharable results: Collaborate with teammates using standard Colab sharing features.

  • Time savings: Focus on deriving insights from your data instead of wrestling with setup and boilerplate code.

Our Data Science Agent has also landed in 4th place on DABStep: Data Agent Benchmark for Multi-step Reasoning on Hugging Face, ahead of ReAct agents based on GPT 4.0, Deepseek, Claude 3.5 Haiku, and Llama 3.3 70B.


Get started with Data Science Agent

Give it a try by simply uploading some data and outlining your data analysis objectives from the Gemini side panel. You can explore datasets on Kaggle or Data Commons, but here are some sample data and prompts to try:

  • Iris Species: try asking “Calculate and visualize the Pearson, Spearman, and Kendall correlations in this data”
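For reference, the correlation part of that prompt can be sketched by hand with pandas, whose `DataFrame.corr` accepts `"pearson"`, `"spearman"`, and `"kendall"` as methods (the Kendall option relies on SciPy, which Colab ships with). This is a minimal sketch, not what the agent generates: the handful of rows, the column names, and the `correlation_matrices` helper are illustrative assumptions, and the visualization step is left out.

```python
import pandas as pd

# A few illustrative rows in the Iris column layout (assumed names and values).
# In Colab you would load the full uploaded file instead, e.g. with pd.read_csv.
iris = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    }
)


def correlation_matrices(df: pd.DataFrame) -> dict:
    """Return one correlation matrix per method for the numeric columns."""
    numeric = df.select_dtypes(include="number")
    methods = ("pearson", "spearman", "kendall")
    return {method: numeric.corr(method=method) for method in methods}


matrices = correlation_matrices(iris)
for name, matrix in matrices.items():
    print(f"{name} correlation:")
    print(matrix.round(2))
```

Each matrix can then be passed to a plotting library of your choice, for example as a heatmap, to cover the “visualize” half of the prompt.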


We hope this transforms your data analysis workflow. We can’t wait to hear what you think. Please join our Google Labs Discord community and connect with us in the #data-science-agent channel.