Introducing TxGemma: Open models to improve therapeutics development

March 25, 2025
Shekoofeh Azizi, Staff Research Scientist

Developing a new therapeutic is risky, notoriously slow, and can cost billions of dollars. 90% of drug candidates fail beyond phase 1 trials. Today, we're excited to release TxGemma, a collection of open models designed to improve the efficiency of therapeutic development by leveraging the power of large language models.

Building on Google DeepMind's Gemma, a family of lightweight, state-of-the-art open models, TxGemma is specifically trained to understand and predict the properties of therapeutic entities throughout the entire discovery process, from identifying promising targets to helping predict clinical trial outcomes. This can potentially shorten the time from lab to bedside, and reduce the costs associated with traditional methods.


From Tx-LLM to TxGemma

Last October, we introduced Tx-LLM, a language model trained for a variety of therapeutic tasks related to drug development. Following strong interest in using and fine-tuning this model for therapeutic applications, we have developed its open successor at a practical scale: TxGemma, which we are releasing today for developers to adapt to their own therapeutic data and tasks.

TxGemma models, fine-tuned from Gemma 2 using 7 million training examples, are open models designed for prediction and conversational therapeutic data analysis. These models are available in three sizes: 2B, 9B and 27B. Each size includes a ‘predict’ version, specifically tailored for narrow tasks drawn from the Therapeutic Data Commons, for example predicting whether a molecule is toxic.

These tasks encompass:

  • classification (e.g., will this molecule cross the blood-brain barrier?)

  • regression (e.g., predicting a drug's binding affinity)

  • and generation (e.g., given the product of some reaction, generate the reactant set)

The largest TxGemma model (the 27B predict version) delivers strong performance. It not only outperforms or roughly matches our previous state-of-the-art generalist model (Tx-LLM) on almost every task, but also rivals or beats many models designed for single tasks. Specifically, it outperforms or matches our previous model on 64 of 66 tasks (beating it on 45), and does the same against specialized models on 50 of the tasks (beating them on 26). See the TxGemma paper for detailed results.
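
To make the predict workflow concrete, here is a minimal inference sketch using Hugging Face transformers. The model identifier and the prompt wording are illustrative assumptions, not the official setup; the exact Therapeutic Data Commons prompt templates are provided with the model release.

```python
# Minimal sketch: querying a TxGemma 'predict' model via Hugging Face
# transformers. The model ID and prompt text are assumptions; consult the
# model card for the official TDC prompt templates.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/txgemma-2b-predict"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative classification-style prompt (placeholder wording):
prompt = (
    "Instructions: Answer the following question about drug properties.\n"
    "Question: Given a drug SMILES string, predict whether it crosses the "
    "blood-brain barrier.\n"
    "Drug SMILES: CC(=O)Oc1ccccc1C(=O)O\n"  # aspirin
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```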


Conversational AI for deeper insights

TxGemma also includes 9B and 27B ‘chat’ versions. These models have general instruction tuning data added to their training, enabling them to explain their reasoning, answer complex questions, and engage in multi-turn discussions. For example, a researcher could ask TxGemma-Chat why it predicted a particular molecule to be toxic and receive an explanation based on the molecule's structure. This conversational capability comes at a small cost to the raw performance on therapeutic tasks compared to TxGemma-Predict.
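
As a sketch of that chat workflow, the snippet below sends a toxicity question to a chat variant, assuming the model follows the standard Gemma chat template in transformers; the model ID is an assumption.

```python
# Minimal sketch of a conversational query to a TxGemma 'chat' model.
# Assumes the standard Gemma chat template; the model ID is an assumption.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/txgemma-9b-chat"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user",
     "content": "Is this molecule likely to be toxic, and why? "
                "SMILES: CC(=O)Oc1ccccc1C(=O)O"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```

A follow-up turn (for example, asking which substructure drove the prediction) can be appended to `messages` to continue the multi-turn discussion.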


Extending TxGemma's capabilities through fine-tuning

As part of the release, we’re including a fine-tuning example Colab notebook that demonstrates how developers can adapt TxGemma to their own therapeutic data and tasks. This notebook uses the TrialBench dataset to show how to fine-tune TxGemma for predicting adverse events in clinical trials. Fine-tuning lets researchers leverage their proprietary data to create models tailored to their unique research needs, potentially leading to even more accurate predictions that help researchers assess how safe or effective a potential new therapy might be.
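
For orientation before opening that notebook, here is a minimal LoRA fine-tuning sketch under stated assumptions: the training file, its single "text" column of pre-formatted prompt/answer pairs, and the adapter hyperparameters are all hypothetical, not the released Colab's setup.

```python
# Minimal LoRA fine-tuning sketch (illustrative, not the released Colab).
# Assumes a JSONL file whose records each have a "text" field containing an
# already-formatted prompt/answer pair.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_id = "google/txgemma-2b-predict"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Attach low-rank adapters so only a small fraction of weights is trained.
peft_config = LoraConfig(r=8, lora_alpha=16,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)

# Hypothetical training file of adverse-event examples.
dataset = load_dataset("json", data_files="trial_adverse_events.jsonl")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="txgemma-trialbench-lora",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```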


Orchestrating workflows for advanced therapeutic discovery with Agentic-Tx

Beyond single-step predictions, we’re demonstrating how TxGemma can be integrated into agentic systems to tackle more complex research problems. Standard language models often struggle with tasks requiring up-to-date external knowledge or multi-step reasoning. To address this, we've developed Agentic-Tx, a therapeutics-focused agentic system powered by Gemini 2.0 Pro. Agentic-Tx is equipped with 18 tools, including:

  • TxGemma as a tool for multi-step reasoning

  • General search tools from PubMed, Wikipedia and the web

  • Specific molecular tools

  • Gene and protein tools

Agentic-Tx achieves state-of-the-art results on reasoning-intensive chemistry and biology tasks from benchmarks including Humanity's Last Exam and ChemBench. We are including a Colab notebook with our release to demonstrate how Agentic-Tx can be used to orchestrate complex workflows and answer multi-step research questions.
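
To illustrate the orchestration pattern, rather than the Agentic-Tx implementation itself, here is a minimal sketch of an agent loop in which an orchestrator LLM chooses among registered tools, with TxGemma wrapped as one tool. All function names and the tool-call format are hypothetical.

```python
# Hypothetical tool-use sketch: an orchestrator LLM emits either a tool call
# or a final answer, and TxGemma is one callable tool in the registry.
from typing import Callable

def txgemma_predict(smiles: str) -> str:
    """Hypothetical wrapper: format a property-prediction prompt and query a
    TxGemma 'predict' model (see the inference sketch earlier in this post)."""
    return f"prediction for {smiles}"

def pubmed_search(query: str) -> str:
    """Hypothetical literature-search tool."""
    return f"top PubMed abstracts for: {query}"

TOOLS: dict[str, Callable[[str], str]] = {
    "txgemma_predict": txgemma_predict,
    "pubmed_search": pubmed_search,
}

def run_agent(question: str, orchestrator: Callable[[str], str]) -> str:
    """Minimal agent loop. The orchestrator (e.g., a Gemini call) returns
    either 'TOOL:<name>|<arg>' to request a tool, or 'ANSWER:<text>' to stop;
    this protocol is an illustrative assumption."""
    transcript = question
    for _ in range(10):  # cap the number of reasoning steps
        step = orchestrator(transcript)  # one orchestrator-LLM call
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        if step.startswith("TOOL:"):
            name, arg = step.removeprefix("TOOL:").split("|", 1)
            result = TOOLS[name.strip()](arg.strip())
            # Feed the tool observation back for the next reasoning step.
            transcript += f"\n{step}\nOBSERVATION: {result}"
    return "no answer within step budget"
```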

Get started with TxGemma

You can access TxGemma on both Vertex AI Model Garden and Hugging Face today. We encourage you to explore the models, try out the inference, fine-tuning, and agent Colab notebooks, and share your feedback! As an open model, TxGemma is designed to be further improved: researchers can fine-tune it with their own data for specific therapeutic development use cases. We're excited to see how the community will use TxGemma to accelerate therapeutic discovery.


Acknowledgements

Key contributors to this project include: Eric Wang, Samuel Schmidgall, Fan Zhang, Paul F. Jaeger, Rory Pilgrim and Tiffany Chen. We also thank Shravya Shetty, Dale Webster, Avinatan Hassidim, Yossi Matias, Yun Liu, Rachelle Sico, Phoebe Kirk, Fereshteh Mahvar, Can "John" Kirmizi, Fayaz Jamil, Tim Thelin, Glenn Cameron, Victor Cotruta, David Fleet, Jon Shlens, Omar Sanseviero, Joe Fernandez, and Joëlle Barral for their feedback and support throughout this project.