openclaw/docs/tools/image-generation.md

4.5 KiB
Raw Blame History

summary read_when title
Generate and edit images using configured providers (OpenAI, Google Gemini, fal, MiniMax)
Generating images via the agent
Configuring image generation providers and models
Understanding the image_generate tool parameters
Image Generation

Image Generation

The image_generate tool lets the agent create and edit images using your configured providers. Generated images are delivered automatically as media attachments in the agent's reply.

The tool only appears when at least one image generation provider is available. If you don't see `image_generate` in your agent's tools, configure `agents.defaults.imageGenerationModel` or set up a provider API key.

Quick start

  1. Set an API key for at least one provider (for example OPENAI_API_KEY or GEMINI_API_KEY).
  2. Optionally set your preferred model:
{
  agents: {
    defaults: {
      imageGenerationModel: "openai/gpt-image-1",
    },
  },
}
  1. Ask the agent: "Generate an image of a friendly lobster mascot."

The agent calls image_generate automatically. No tool allow-listing needed — it's enabled by default when a provider is available.

Supported providers

Provider Default model Edit support API key
OpenAI gpt-image-1 No OPENAI_API_KEY
Google gemini-3.1-flash-image-preview Yes GEMINI_API_KEY or GOOGLE_API_KEY
fal fal-ai/flux/dev Yes FAL_KEY
MiniMax image-01 Yes (subject reference) MINIMAX_API_KEY

Use action: "list" to inspect available providers and models at runtime:

/tool image_generate action=list

Tool parameters

Parameter Type Description
prompt string Image generation prompt (required for action: "generate")
action string "generate" (default) or "list" to inspect providers
model string Provider/model override, e.g. openai/gpt-image-1
image string Single reference image path or URL for edit mode
images string[] Multiple reference images for edit mode (up to 5)
size string Size hint: 1024x1024, 1536x1024, 1024x1536, 1024x1792, 1792x1024
aspectRatio string Aspect ratio: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
resolution string Resolution hint: 1K, 2K, or 4K
count number Number of images to generate (14)
filename string Output filename hint

Not all providers support all parameters. The tool passes what each provider supports and ignores the rest.

Configuration

Model selection

{
  agents: {
    defaults: {
      // String form: primary model only
      imageGenerationModel: "google/gemini-3-pro-image-preview",

      // Object form: primary + ordered fallbacks
      imageGenerationModel: {
        primary: "openai/gpt-image-1",
        fallbacks: ["google/gemini-3.1-flash-image-preview", "fal/fal-ai/flux/dev"],
      },
    },
  },
}

Provider selection order

When generating an image, OpenClaw tries providers in this order:

  1. model parameter from the tool call (if the agent specifies one)
  2. imageGenerationModel.primary from config
  3. imageGenerationModel.fallbacks in order
  4. Auto-detection — queries all registered providers for defaults, preferring: configured primary provider, then OpenAI, then Google, then others

If a provider fails (auth error, rate limit, etc.), the next candidate is tried automatically. If all fail, the error includes details from each attempt.

Image editing

Google, fal, and MiniMax support editing reference images. Pass a reference image path or URL:

"Generate a watercolor version of this photo" + image: "/path/to/photo.jpg"

Google supports up to 5 reference images via the images parameter. fal and MiniMax support 1.

Provider capabilities

Capability OpenAI Google fal MiniMax
Generate Yes (up to 4) Yes (up to 4) Yes (up to 4) Yes (up to 9)
Edit/reference No Yes (up to 5 images) Yes (1 image) Yes (1 image, subject ref)
Size control Yes Yes Yes No
Aspect ratio No Yes Yes (generate only) Yes
Resolution (1K/2K/4K) No Yes Yes No