Imagen 3 arrives in the Gemini API

Developers can now access Imagen 3, Google’s state-of-the-art image generation model, through the Gemini API. The model is initially available to paid users, with a rollout to the free tier coming soon.

Imagen 3 excels at producing visually appealing, artifact-free images in a wide variety of styles, from hyperrealistic images and impressionistic landscapes to abstract compositions and anime characters. Improved prompt following makes it easy to convert great ideas into high-quality images. Overall, Imagen 3 achieves state-of-the-art performance on a variety of benchmarks. It does so while being priced at $0.03 per image on the Gemini API, with control over aspect ratios, the number of options to generate, and more.
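
Because pricing is per image, the cost of a batch is easy to estimate up front. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, not part of the Gemini API, and it assumes the $0.03 per-image price quoted above still applies.

```python
from decimal import Decimal

# Per-image price quoted in this announcement (assumption: check current pricing).
PRICE_PER_IMAGE_USD = Decimal("0.03")


def estimate_cost_usd(num_requests: int, images_per_request: int = 1) -> Decimal:
    """Return the estimated cost in USD for a batch of generation requests."""
    if num_requests < 0 or images_per_request < 1:
        raise ValueError("num_requests must be >= 0 and images_per_request >= 1")
    # Decimal avoids the rounding drift that binary floats introduce with currency.
    return PRICE_PER_IMAGE_USD * num_requests * images_per_request


# Example: 250 requests that each ask for 4 images, i.e. 1000 images in total.
print(estimate_cost_usd(250, images_per_request=4))
```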

To help combat misinformation and misattribution, all images generated by Imagen 3 include a non-visible digital SynthID watermark, identifying them as AI-generated.


See Imagen 3 in Action

The gallery below highlights Imagen 3’s capabilities across a range of styles.

Get Started with Imagen 3 in the Gemini API

This Python code snippet demonstrates how to generate an image with Imagen 3 using the Gemini API.

from io import BytesIO

from google import genai
from google.genai import types
from PIL import Image

# Replace the placeholder with your own Gemini API key.
client = genai.Client(api_key='GEMINI_API_KEY')

# Request a single image from the Imagen 3 model.
response = client.models.generate_images(
    model='imagen-3.0-generate-002',
    prompt='a portrait of a sheepadoodle wearing a cape',
    config=types.GenerateImagesConfig(
        number_of_images=1,
    ),
)

# Decode each returned image from its raw bytes and open it in the default viewer.
for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()
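
In a script or server environment you may prefer to write the images to disk rather than open a viewer. The following is a minimal sketch under stated assumptions: `save_images` is a hypothetical helper (not part of the SDK), and the `.png` default extension is an assumption about the returned image format that you should adjust if your responses use a different one.

```python
from pathlib import Path


def save_images(image_blobs: list[bytes], out_dir: str,
                stem: str = "imagen", suffix: str = ".png") -> list[Path]:
    """Write each blob to out_dir as <stem>_<index><suffix> and return the paths."""
    target = Path(out_dir)
    # Create the output directory (and any parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, blob in enumerate(image_blobs):
        path = target / f"{stem}_{index}{suffix}"
        path.write_bytes(blob)  # raw image bytes are written unchanged
        paths.append(path)
    return paths
```

With the response from the snippet above, this could be called as `save_images([g.image.image_bytes for g in response.generated_images], "output")`.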

You can explore more prompting advice and image styles in the Gemini API developer docs, with further details available on scores, methodology, and performance improvement in Appendix D of our updated technical report.

We are excited to take this first step in expanding the availability of our generative media models in the Gemini API, and we plan to make more models available in the near future so that developers can bridge generative media and language models.
