Unlock deeper insights with the new Python client library for Data Commons

JUNE 26, 2025
Kara Moscoe Technical Writer

Data is the bedrock of progress across nearly every field. It serves as the raw material from which profound insights are forged, enabling us to precisely measure current realities, identify critical trends, and possibly predict future outcomes.

At Google, our mission with Data Commons is to organize the world's publicly available statistical data, making it more accessible and useful for everyone. Data Commons is an open-source knowledge graph that unifies public data from a wide range of sources, simplifying access and comprehension for developers, researchers, and data analysts alike. In addition to powering the datacommons.org website, Data Commons is used by Google Search to answer queries such as "What is the population of San Francisco?", where the graph at the top of the results is generated by Data Commons.

Today, we're announcing the general availability of the new Python client library for Data Commons, based on the V2 REST API. The new library substantially expands what data developers can do with Data Commons from Python.


Real-world impact: partnering with ONE.org

This milestone was shaped by the vision and substantial contributions of our partner The ONE Campaign, a global organization working to create the investments needed for economic opportunities and healthier lives in Africa. We built Data Commons as an open-source platform precisely to encourage community contributions and enable innovative uses, and this partnership exemplifies that goal. ONE advocated for the client library, proposed its design, and wrote the code, making Data Commons' insights available to data scientists and analysts who want to use the rich ecosystem of Python analytical tools and libraries.


Support for custom Data Commons instances

The Data Commons platform also allows organizations, like the United Nations or ONE, to host their own Data Commons instances. These custom instances enable the seamless integration of proprietary datasets with the foundational Data Commons knowledge graph. Organizations leverage the Data Commons data framework and tools while maintaining full control over their data and resources.

One of the most impactful additions in the V2 library is robust support for custom instances. This means you can now use the Python library to programmatically query any public or private instance—whether hosted locally, within your organization or on the Google Cloud Platform.
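
As a minimal sketch of how the choice of instance might be expressed in code, the helper below builds constructor arguments for the three hosting cases. It assumes the `DataCommonsClient` constructor accepts `api_key`, `dc_instance`, and `url` keyword arguments; confirm these against the reference documentation. The helper names `client_settings` and `make_client`, the `DC_API_KEY` environment variable, and the example hostnames are illustrative and not part of the library.

```python
import os
from typing import Optional


def client_settings(instance_host: Optional[str] = None,
                    instance_url: Optional[str] = None) -> dict:
    """Return keyword arguments for a Data Commons client (hypothetical helper).

    Assumes the client constructor accepts `api_key`, `dc_instance`, and `url`.
    """
    if instance_host and instance_url:
        raise ValueError("Pass either a host or a full URL, not both.")
    if instance_url:
        # Fully resolved API URL, for example a locally hosted instance.
        return {"url": instance_url}
    if instance_host:
        # Custom instance identified by its hostname.
        return {"dc_instance": instance_host}
    # Default: the base Data Commons, with the API key read from the environment.
    return {"api_key": os.environ.get("DC_API_KEY")}


def make_client(**settings):
    """Create a client from the settings above."""
    # Imported lazily so this sketch loads even where the package is missing.
    from datacommons_client.client import DataCommonsClient
    return DataCommonsClient(**settings)
```

For instance, `make_client(**client_settings(instance_host="datacommons.example.org"))` would target a custom instance at that placeholder hostname, while `client_settings()` with no arguments falls back to the base Data Commons and an API key taken from the environment.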


Powerful new features

The Python library makes it very easy to perform common queries against Data Commons data, such as:

  • Exploring the structure of the knowledge graph

  • Retrieving data for any of the 200,000+ statistical variables from over 200 datasets in domains such as demographics, economy, education, energy, environment, health, and housing

  • Easily mapping entities from other datasets to entities in Data Commons
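
The three tasks above can be sketched as one call each. This is an illustration under stated assumptions rather than a reference: `observations_dataframe` and its arguments follow the example later in this post, while `node.fetch_property_labels` and `resolve.fetch_dcids_by_name` are endpoint method names that should be verified against the reference documentation. The function name `run_common_queries` and the place name "Kenya" are hypothetical.

```python
def run_common_queries(client, variable="sdg/SI_POV_DAY1", place_name="Kenya"):
    """Run one illustrative query per task and return the raw results.

    `client` is expected to be a DataCommonsClient. The endpoint method names
    used here are assumptions to check against the reference documentation.
    """
    # 1. Explore the graph: list the property labels attached to the Earth node.
    properties = client.node.fetch_property_labels(node_dcids=["Earth"])

    # 2. Retrieve data: latest value of one statistical variable per continent.
    observations = client.observations_dataframe(
        variable_dcids=variable,
        date="latest",
        parent_entity="Earth",
        entity_type="Continent",
    )

    # 3. Map entities: resolve a place name from another dataset to a DCID.
    matches = client.resolve.fetch_dcids_by_name(
        names=[place_name], entity_type="Country"
    )
    return properties, observations, matches
```

Keeping the client as a parameter means the same function can be pointed at the base Data Commons or at a custom instance without changes.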


V2 of the client library offers many technical improvements over the V1 library, including:

  • Pandas dataframe APIs are supported as an integral module, with a single installation package, allowing seamless use with other API endpoints in the same client

  • Several new convenience methods for common data queries

  • API key management and other stateful operations built into the client class

  • Integration with the Pydantic library for improved type safety, validation and serialization

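  • Support for multiple response formats, including JSON and Python dictionaries and lists

For example, the following snippet uses the dataframe API to plot the proportion of the population below the international poverty line for each continent:

    import matplotlib.pyplot as plt
    from datacommons_client.client import DataCommonsClient

    # Assumes an API key for datacommons.org; replace the placeholder with your own.
    client = DataCommonsClient(api_key="YOUR_API_KEY")

    variable = "sdg/SI_POV_DAY1"
    variable_name = "Proportion of population below international poverty line"

    # Fetch all observations for each continent, then pivot to one column per continent.
    df = client.observations_dataframe(
        variable_dcids=variable, date="all", parent_entity="Earth", entity_type="Continent"
    )
    df = df.pivot(index="date", columns="entity_name", values="value")

    ax = df.plot(kind="line")
    ax.set_xlabel("Year")
    ax.set_ylabel("%")
    ax.set_title(variable_name)
    ax.legend()
    plt.show()

[Figure: Graph showing proportion of population below international poverty line across continental regions]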
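
The dataframe call above is one of several ways to consume results. As a rough sketch of the other response formats mentioned in the list of improvements, the function below requests the same variable through the observation endpoint and returns it both as Python dictionaries and as a JSON string. The method names `observation.fetch_observations_by_entity_type`, `to_dict`, and `to_json` are assumptions to confirm in the reference documentation, and `latest_by_continent` is a hypothetical helper.

```python
def latest_by_continent(client, variable="sdg/SI_POV_DAY1"):
    """Fetch the latest observations per continent in two formats.

    `client` is expected to be a DataCommonsClient. The endpoint and
    response method names are assumptions, not confirmed signatures.
    """
    response = client.observation.fetch_observations_by_entity_type(
        date="latest",
        parent_entity="Earth",
        entity_type="Continent",
        variable_dcids=variable,
    )
    # Nested dictionaries and lists, convenient for further processing in Python.
    as_dict = response.to_dict()
    # A JSON string, convenient for writing to disk or passing to other tools.
    as_json = response.to_json()
    return as_dict, as_json
```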

Getting started

To get started with the Data Commons Python library, you can install the package directly from PyPI. We've also provided comprehensive resources to help you dive in, including reference documentation and online tutorials available as Google Colab notebooks.
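
As a small sketch of verifying an installation, the snippet below uses only the Python standard library to report which version of the package is present. The PyPI package name `datacommons-client` and its optional `Pandas` extra (as in `pip install "datacommons-client[Pandas]"`) are stated here as assumptions to check against the reference documentation, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "datacommons-client") -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        return None
```

Calling `installed_version()` after installation should return a version string; a result of `None` suggests the package was installed into a different environment than the one running the code.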

For those currently using the V1 Python API, we strongly recommend upgrading to the new V2 Python library. The V1 API is scheduled for deprecation, and adopting the new library ensures you'll have access to the latest features and continued support.


Open community

This library is a testament to the power of open-source collaboration. The open-source code is available on GitHub, and we welcome contributions from the community under the Google Contributor License Agreement.