Unlocking Multi-Spectral Data with Gemini

SEPTEMBER 4, 2025

Ganesh Mallya, Software Engineer
Anelia Angelova, Research Scientist

As developers, we’re used to working with images. We build apps that recognize faces, categorize products, and generate art. But most of the time, we’re living in an RGB world—Red, Green, and Blue. It’s how our eyes and cameras see.

But what if you could give your application superhuman vision? What if it could see in wavelengths invisible to the human eye to understand the world in a fundamentally new way?

That’s the power of multi-spectral imagery, and thanks to the native multimodal capabilities of Google's Gemini models, it's more accessible than ever. You don't need a custom-trained, specialized model anymore. You can start analyzing complex satellite data right out of the box.


What is Multi-Spectral Imagery, Anyway?

Think of a standard digital photo. Each pixel has three values: R, G, and B. A multi-spectral sensor is like a super-powered camera. Instead of just three bands, it captures data across many different bands of the electromagnetic spectrum, including those we can't see, like Near-Infrared (NIR) and Short-Wave Infrared (SWIR).


Why is this a game-changer?

  • Vegetation Health: Healthy plants reflect a lot of NIR light. By looking at the NIR band (see the NDVI sketch after this list), you can assess crop health or monitor deforestation far more accurately than with a standard RGB photo.

  • Water Detection: Water absorbs infrared light, making it easy to distinguish from land, map floodplains, or even analyze water quality.

  • Burn Scars: SWIR bands are excellent at piercing through smoke and identifying recently burned areas after a wildfire.

  • Material Identification: Different minerals and man-made materials have unique spectral "fingerprints," allowing you to identify them from space.

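To make the vegetation example concrete, here is a minimal sketch of the Normalized Difference Vegetation Index (NDVI), a standard index that contrasts the NIR and red bands. It assumes the bands are already loaded as NumPy arrays:

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index.

    Healthy vegetation reflects NIR strongly and absorbs red light,
    so NDVI is high (close to 1) over dense crops and forest.
    """
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    # A small epsilon guards against division by zero over dark pixels.
    return (nir - red) / (nir + red + 1e-10)
```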

Historically, using this data required specialized tools, complex data processing pipelines, and custom machine learning models. Gemini changes the game by letting you leverage its powerful reasoning engine on this rich data with a surprisingly simple technique.


Mapping Invisible Light to Visible Colors

Gemini, like other large multimodal models, is pre-trained on a vast dataset of standard images (RGB) and text. It understands what a "red car" or "green forest" is. The key to making it understand multi-spectral data is to map the invisible bands we care about into the R, G, and B channels that Gemini already understands.

We create a "false-color composite" image. We're not trying to make it look natural; we're encoding scientific data into a format the model can process.

Here’s the simple, three-step process:

  1. Select Your Bands: Choose three spectral bands that are important for your specific problem.

  2. Normalize and Map: Scale the data from each band to a standard 0-255 integer range and assign them to the Red, Green, and Blue channels of a new image.

  3. Prompt with Context: Pass this newly created image to Gemini and, critically, tell it in the prompt what the colors represent.


This last step is the magic. You are essentially teaching the model, in real-time, how to interpret your custom image.
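As an illustration of steps 1 and 2, here is a minimal sketch that builds a false-color composite with NumPy and Pillow. The band choices and percentile stretch are assumptions for the example; in practice you would read real Sentinel-2 bands, e.g. with a library like rasterio:

```python
import numpy as np
from PIL import Image

def to_uint8(band: np.ndarray) -> np.ndarray:
    """Step 2: rescale one band to 0-255, clipping at the 2nd/98th
    percentiles so a few extreme pixels don't wash out the stretch."""
    lo, hi = np.percentile(band, (2, 98))
    scaled = np.clip((band - lo) / (hi - lo + 1e-10), 0.0, 1.0)
    return (scaled * 255).astype(np.uint8)

def false_color_composite(r_band, g_band, b_band) -> Image.Image:
    """Map three chosen bands into the R, G, B channels of a new image."""
    rgb = np.dstack([to_uint8(r_band), to_uint8(g_band), to_uint8(b_band)])
    return Image.fromarray(rgb)

# Step 1 happened upstream: here SWIR -> R, NIR -> G, visible red -> B.
# Random arrays stand in for real Sentinel-2 bands (e.g. B12, B8, B4).
swir, nir, red = (np.random.rand(256, 256) * 10000 for _ in range(3))
false_color_composite(swir, nir, red).save("false_color.png")
```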


The power of Gemini

This approach is a game-changer for developers, dramatically lowering the barrier to entry for analyzing complex satellite data. It enables the rapid prototyping of new applications in hours, not weeks, without requiring deep expertise in remote sensing. Thanks to Gemini's powerful in-context learning, developers can dynamically instruct the model on how to interpret different spectral data for various tasks—from agricultural monitoring to urban planning—simply by providing a clear prompt alongside the custom image.
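Here is a minimal sketch of step 3 using the google-genai Python SDK; the model name and prompt wording are illustrative, not the exact ones used in the experiments below:

```python
from PIL import Image
from google import genai

# Assumes the google-genai SDK and a GEMINI_API_KEY in the environment.
client = genai.Client()

# Critically, the prompt explains what the false colors mean.
prompt = (
    "This is a false-color satellite image. The red channel is "
    "short-wave infrared (SWIR), the green channel is near-infrared "
    "(NIR), and the blue channel is visible red light. Bright green "
    "areas therefore indicate healthy vegetation. Classify the "
    "dominant land cover in this scene."
)

response = client.models.generate_content(
    model="gemini-2.5-flash",  # illustrative model choice
    contents=[Image.open("false_color.png"), prompt],
)
print(response.text)
```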

The era of AI-powered environmental monitoring, precision agriculture, and disaster response is here, and with Gemini, the tools are directly in your hands. So grab some public satellite data from sources like NASA's Earthdata, Copernicus Open Access Hub, or Google Earth Engine, and start teaching your app to see the world in a whole new light.


Some examples: Gemini 2.5 in action

Gemini 2.5 for land cover classification

The universal capabilities of the Gemini 2.5 model allow for easy querying for remote sensing applications out of the box. For example, Gemini 2.5 is quite successful at understanding these inputs and responding appropriately with 'Permanent crop', 'River', and 'Industrial', respectively.
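The class names above match the ten EuroSAT land cover classes, so a zero-shot classification prompt might look like the following sketch (the exact prompt used in these experiments is not shown here):

```python
# Hypothetical zero-shot prompt; the class names are the ten EuroSAT
# land cover classes, which match the labels quoted above.
classes = [
    "AnnualCrop", "Forest", "HerbaceousVegetation", "Highway",
    "Industrial", "Pasture", "PermanentCrop", "Residential",
    "River", "SeaLake",
]
prompt = (
    "Classify this satellite image into exactly one of the following "
    "land cover classes: " + ", ".join(classes) + ". "
    "Answer with the class name only."
)
```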

Gemini 2.5 with multi-spectral inputs

In some challenging scenarios, though, the model might not have enough information in the RGB image alone. When we add multi-spectral inputs and more detailed prompts, entirely zero-shot, we see improved performance.

For example, consider an image of a 'River' that the model initially misclassified as a 'Forest'.

After introducing multi-spectral inputs and a detailed prompt, as shown here <TODO: Add the arxiv>, the model correctly recognizes it as a 'River', and the reasoning trace shows that the model has used the multi-spectral inputs, particularly the NDWI, to infer that this is water.
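For reference, the NDWI (Normalized Difference Water Index) mentioned in the reasoning trace is typically computed from the green and NIR bands; a minimal sketch:

```python
import numpy as np

def ndwi(green: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Normalized Difference Water Index (McFeeters).

    Water reflects green light and strongly absorbs NIR,
    so NDWI > 0 typically indicates open water.
    """
    green = green.astype(np.float64)
    nir = nir.astype(np.float64)
    return (green - nir) / (green + nir + 1e-10)
```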

In another example, an image of a 'Forest' is initially classified as a 'SeaLake', with the model reasoning about the blue/green areas.

When we include the multi-spectral inputs, the model easily classifies the scene as a 'Forest', and the reasoning trace shows that it leverages the additional inputs quite significantly.

As these examples show, the additional inputs are important for making better decisions. Furthermore, since the model itself does not need to be changed, we can add other types of inputs, as shown below.


Gemini 2.5 with Multi-Spectral and Digital Elevation Map inputs

Can we add more types of inputs? Here we show that Gemini's powerful context understanding can be leveraged to interpret even more input sensors for remote sensing. Apart from multi-spectral imagery, readily available from Sentinel-2, we include digital elevation information from the Copernicus Digital Elevation Model, which is derived from synthetic aperture radar (SAR) interferometry data collected by the TanDEM-X satellite mission. For example, on the Forest Types (ForTy) dataset <TODO Verify Arxiv/paper link>, we see that both types of sensors, i.e. multi-spectral and digital elevation, are helpful for improving performance.
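The post does not specify how the elevation data is encoded for the model; one simple approach, sketched below under that assumption, is to rescale the DEM to an 8-bit grayscale image and describe the encoding in the prompt:

```python
import numpy as np
from PIL import Image

def dem_to_image(elevation_m: np.ndarray) -> Image.Image:
    """Rescale an elevation grid (in meters) to an 8-bit grayscale
    image that can be passed to Gemini alongside the other inputs."""
    lo, hi = float(elevation_m.min()), float(elevation_m.max())
    scaled = (elevation_m - lo) / (hi - lo + 1e-10)
    return Image.fromarray((scaled * 255).astype(np.uint8), mode="L")

# The prompt should then explain the encoding, e.g.: "The third image
# is a digital elevation map; darker pixels are lower terrain."
```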