Imagen 3 is now available in the Gemini API

February 6, 2025

Ivan Solovyev Product Manager

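Developers can now access Imagen 3, Google's state-of-the-art image generation model, through the Gemini API. The model is initially available to paid users, with a rollout to the free tier coming soon.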

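Imagen 3 excels at producing visually appealing, artifact-free images in a wide range of styles, from hyperrealistic imagery to impressionistic landscapes, and from abstract compositions to anime characters. Improved prompt following makes it easier to turn ideas into high-quality images. Overall, Imagen 3 achieves state-of-the-art performance across a range of benchmarks. Imagen 3 through the Gemini API is priced at $0.03 per image, and you can control the aspect ratio, the number of options to generate, and more.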
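Because pricing is per image, a rough budget is a simple product of prompts and images per prompt. The sketch below is only an illustration: the helper name `estimate_cost` is hypothetical, and it assumes the $0.03 figure quoted above with no discounts, taxes, or free-tier allowances.

```python
from decimal import Decimal

# Per-image price quoted in this post; check current pricing before budgeting.
PRICE_PER_IMAGE_USD = Decimal("0.03")


def estimate_cost(num_prompts: int, images_per_prompt: int = 1) -> Decimal:
    """Return the estimated cost in USD for a batch of generation requests."""
    if num_prompts < 0:
        raise ValueError("num_prompts must not be negative")
    if images_per_prompt < 1:
        raise ValueError("images_per_prompt must be at least 1")
    return PRICE_PER_IMAGE_USD * num_prompts * images_per_prompt


# 250 prompts with 4 images each is 1,000 images.
print(estimate_cost(250, 4))  # 30.00
```

Using `Decimal` rather than `float` keeps the cents exact when the totals are summed or compared.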

To help combat misinformation and misattribution, all images generated by Imagen 3 include a non-visible digital SynthID watermark that identifies them as AI-generated.

See Imagen 3 in action

The gallery below shows Imagen 3 generating images across a variety of styles.

Imagen 3 generated image of a group of people looking happy, natural light, 8k

Prompt: A group of people looking happy, natural light, 8k

Imagen 3 generated Hyperrealistic portrait of a person dressed in 1920s flapper fashion, vintage style, black and white photograph, elegant pose, 8k

Prompt: Hyperrealistic portrait of a person dressed in 1920s flapper fashion, vintage style, black and white photograph, elegant pose, 8k

Imagen 3 generated image of a close-up of a vintage watch with realistic and detailed mechanism

Prompt: Imagine a close-up of a vintage watch. Generate a realistic image depicting the watch's detailed mechanism

Imagen 3 generated image of an impressionistic landscape painting of a sunset over a field of sunflowers, vibrant colors, thick brushstrokes, inspired by Monet

Prompt: An impressionistic landscape painting of a sunset over a field of sunflowers, vibrant colors, thick brushstrokes, inspired by Monet

Imagen 3 generated image of A surreal dreamscape featuring a giant tortoise with a lush forest growing on its back, floating through a starry sky, glowing mushrooms, bioluminescent plants, ethereal atmosphere

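Prompt: A surreal dreamscape featuring a giant tortoise with a lush forest growing on its back, floating through a starry sky, glowing mushrooms, bioluminescent plants, ethereal atmosphere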

Imagen 3 generated lifestyle image of freshly roasted coffee beans spilling out of a burlap sack onto a rustic wooden table next to a cup of coffee with 'Awaken Your Senses' written on the cup in cursive

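Prompt: Freshly roasted coffee beans spilling out of a burlap sack onto a rustic wooden table, next to a steaming cup of coffee with 'Awaken Your Senses' written on the cup in cursive, warm and inviting atmosphere, morning light, product photography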

Imagen 3 generated hyperrealistic portrait of a woman with piercing blue eyes, laughing, freckles, dramatic lighting, detailed skin texture, 8k

Prompt: Hyperrealistic portrait of a woman with piercing blue eyes, laughing, freckles, dramatic lighting, detailed skin texture, 8k

Imagen 3 generated panoramic view of a majestic mountain range at dawn

Prompt: A panoramic view of a majestic mountain range at dawn.

Imagen 3 generated scene from a game where the player needs to find a specific object by looking into drawers in a messy desk

Prompt: Show a scene from a game in which the player must search the drawers of a messy desk to find a specific object.

Imagen 3 generated painted cityscape in the style of Van Gogh

Prompt: A cityscape painted in the style of Van Gogh, with swirling brushstrokes and vibrant colors.

Get started with Imagen 3 in the Gemini API

This Python snippet shows how to generate an image with Imagen 3 through the Gemini API.

from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

# Replace the placeholder string with your own Gemini API key.
client = genai.Client(api_key='GEMINI_API_KEY')

# Request a single image from the Imagen 3 model.
response = client.models.generate_images(
    model='imagen-3.0-generate-002',
    prompt='a portrait of a sheepadoodle wearing cape',
    config=types.GenerateImagesConfig(
        number_of_images=1,
    )
)

# Decode each returned image from its raw bytes and open it in the default viewer.
for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()


Generated image

Imagen 3 generated portrait of a sheepadoodle wearing a cape
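The post mentions control over aspect ratio and the number of generated options. The following is a minimal sketch of how those two settings might be passed and the results saved to disk. The helper names (`build_image_settings`, `generate_and_save`), the output directory, and the `GEMINI_API_KEY` environment variable are hypothetical, and the allowed aspect ratios and the limit of four images per request are assumptions that should be checked against the current Gemini API documentation.

```python
import os
from io import BytesIO
from pathlib import Path

# Assumed limits; verify them against the current Imagen documentation.
ALLOWED_ASPECT_RATIOS = {"1:1", "3:4", "4:3", "9:16", "16:9"}
MAX_IMAGES_PER_REQUEST = 4


def build_image_settings(aspect_ratio: str = "1:1", number_of_images: int = 1) -> dict:
    """Validate the two settings locally and return them as keyword arguments."""
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if not 1 <= number_of_images <= MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"number_of_images must be between 1 and {MAX_IMAGES_PER_REQUEST}"
        )
    return {"aspect_ratio": aspect_ratio, "number_of_images": number_of_images}


def generate_and_save(prompt: str, out_dir: str = "imagen_out", **settings) -> list:
    """Call Imagen 3 and write each returned image to out_dir as a PNG file."""
    # Imported lazily so the validation helper works without the SDK installed.
    from google import genai
    from google.genai import types
    from PIL import Image

    # Assumes the API key is stored in an environment variable of this name.
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_images(
        model="imagen-3.0-generate-002",
        prompt=prompt,
        config=types.GenerateImagesConfig(**build_image_settings(**settings)),
    )

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, generated_image in enumerate(response.generated_images):
        path = target / f"image_{index}.png"
        # Re-encode through Pillow so the file on disk is a PNG.
        Image.open(BytesIO(generated_image.image.image_bytes)).save(path)
        paths.append(path)
    return paths


# Example call (requires the SDK, Pillow, and a valid API key):
# generate_and_save(
#     "a panoramic view of a majestic mountain range at dawn",
#     aspect_ratio="16:9",
#     number_of_images=2,
# )
```

Validating the settings before the request is a design choice for this sketch: a typo in the aspect ratio fails locally instead of costing a round trip to the API.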

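You can explore more prompting techniques and image styles in the Gemini API developer documentation. For more details on ratings, methodology, and performance improvements, see Appendix D of our updated technical report.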

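We're excited to take this first step in bringing our generative media models to the Gemini API. We plan to release more models in the near future so that developers can combine generative media with language models.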