docs(providers): add generation setup pages

This commit is contained in:
Peter Steinberger 2026-04-05 23:20:52 +01:00
parent 1a3eb38aaf
commit f30c087fdf
No known key found for this signature in database
11 changed files with 406 additions and 16 deletions

docs/providers/alibaba.md Normal file
View File

@ -0,0 +1,73 @@
---
title: "Alibaba Model Studio"
summary: "Alibaba Model Studio Wan video generation in OpenClaw"
read_when:
- You want to use Alibaba Wan video generation in OpenClaw
- You need Model Studio or DashScope API key setup for video generation
---
# Alibaba Model Studio
OpenClaw ships a bundled `alibaba` video-generation provider for Wan models on
Alibaba Model Studio / DashScope.
- Provider: `alibaba`
- Preferred auth: `MODELSTUDIO_API_KEY`
- Also accepted: `DASHSCOPE_API_KEY`, `QWEN_API_KEY`
- API: DashScope / Model Studio async video generation
## Quick start
1. Set an API key:
```bash
openclaw onboard --auth-choice qwen-standard-api-key
```
2. Set a default video model:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "alibaba/wan2.6-t2v",
      },
    },
  },
}
```
## Built-in Wan models
The bundled `alibaba` provider currently registers:
- `alibaba/wan2.6-t2v`
- `alibaba/wan2.6-i2v`
- `alibaba/wan2.6-r2v`
- `alibaba/wan2.6-r2v-flash`
- `alibaba/wan2.7-r2v`
## Current limits
- Up to **1** output video per request
- Up to **1** input image
- Up to **4** input videos
- Up to **10 seconds** duration
- Supports `size`, `aspectRatio`, `resolution`, `audio`, and `watermark`
- Reference image/video mode currently requires **remote http(s) URLs**
## Relationship to Qwen
The bundled `qwen` provider also uses Alibaba-hosted DashScope endpoints for
Wan video generation. Use:
- `qwen/...` when you want the canonical Qwen provider surface
- `alibaba/...` when you want the direct vendor-owned Wan video surface
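Since both prefixes resolve to the same hosted Wan models, they can also be chained for failover via the object form of `videoGenerationModel`. A sketch, assuming the `fallbacks` list documented for image generation behaves the same way for video defaults:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        // Vendor-owned surface first, canonical Qwen ref as backup.
        primary: "alibaba/wan2.6-t2v",
        fallbacks: ["qwen/wan2.6-t2v"],
      },
    },
  },
}
```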
## Related
- [Video Generation](/tools/video-generation)
- [Qwen](/providers/qwen)
- [Qwen / Model Studio](/providers/qwen_modelstudio)
- [Configuration Reference](/gateway/configuration-reference#agent-defaults)

docs/providers/fal.md Normal file
View File

@ -0,0 +1,90 @@
---
title: "fal"
summary: "fal image and video generation setup in OpenClaw"
read_when:
- You want to use fal image generation in OpenClaw
- You need the FAL_KEY auth flow
- You want fal defaults for image_generate or video_generate
---
# fal
OpenClaw ships a bundled `fal` provider for hosted image and video generation.
- Provider: `fal`
- Auth: `FAL_KEY`
- API: fal model endpoints
## Quick start
1. Set the API key:
```bash
openclaw onboard --auth-choice fal-api-key
```
2. Set a default image model:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "fal/fal-ai/flux/dev",
      },
    },
  },
}
```
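As on the other provider pages, the key must also be visible to the Gateway process when it runs as a daemon (launchd/systemd). A sketch of a `~/.openclaw/.env` entry, with a placeholder value rather than a real key:
```bash
# ~/.openclaw/.env
FAL_KEY=your-fal-key-here
```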
## Image generation
The bundled `fal` image-generation provider defaults to
`fal/fal-ai/flux/dev`.
- Generate: up to 4 images per request
- Edit mode: enabled, 1 reference image
- Supports `size`, `aspectRatio`, and `resolution`
- Current edit caveat: the fal image edit endpoint does **not** support
`aspectRatio` overrides
To use fal as the default image provider:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "fal/fal-ai/flux/dev",
      },
    },
  },
}
```
## Video generation
The bundled `fal` video-generation provider defaults to
`fal/fal-ai/minimax/video-01-live`.
- Modes: text-to-video and single-image reference flows
To use fal as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "fal/fal-ai/minimax/video-01-live",
      },
    },
  },
}
```
## Related
- [Image Generation](/tools/image-generation)
- [Video Generation](/tools/video-generation)
- [Configuration Reference](/gateway/configuration-reference#agent-defaults)

View File

@ -100,6 +100,50 @@ The bundled `google` image-generation provider defaults to
Image generation, media understanding, and Gemini Grounding all stay on the
`google` provider id.
To use Google as the default image provider:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "google/gemini-3.1-flash-image-preview",
      },
    },
  },
}
```
See [Image Generation](/tools/image-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Video generation
The bundled `google` plugin also registers video generation through the shared
`video_generate` tool.
- Default video model: `google/veo-3.1-fast-generate-preview`
- Modes: text-to-video, image-to-video, and single-video reference flows
- Supports `aspectRatio`, `resolution`, and `audio`
- Current duration clamp: **4 to 8 seconds**
To use Google as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "google/veo-3.1-fast-generate-preview",
      },
    },
  },
}
```
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
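The object form also accepts ordered fallbacks, so Veo can be tried first with another bundled video provider as backup. A sketch; the fallback model here is illustrative, as documented on its own provider page:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "google/veo-3.1-fast-generate-preview",
        fallbacks: ["openai/sora-2"],
      },
    },
  },
}
```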
## Environment note
If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY`

View File

@ -63,6 +63,35 @@ The built-in bundled MiniMax text catalog itself stays text-only metadata until
that explicit provider config exists. Image understanding is exposed separately
through the plugin-owned `MiniMax-VL-01` media provider.
See [Image Generation](/tools/image-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Video generation
The bundled `minimax` plugin also registers video generation through the shared
`video_generate` tool.
- Default video model: `minimax/MiniMax-Hailuo-2.3`
- Modes: text-to-video and single-image reference flows
- Supports `aspectRatio` and `resolution`
To use MiniMax as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "minimax/MiniMax-Hailuo-2.3",
      },
    },
  },
}
```
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Image understanding
The MiniMax plugin registers image understanding separately from the text

View File

@ -108,6 +108,63 @@ OpenClaw does **not** expose `openai/gpt-5.3-codex-spark` on the direct OpenAI
API path. `pi-ai` still ships a built-in row for that model, but live OpenAI API
requests currently reject it. Spark is treated as Codex-only in OpenClaw.
## Image generation
The bundled `openai` plugin also registers image generation through the shared
`image_generate` tool.
- Default image model: `openai/gpt-image-1`
- Generate: up to 4 images per request
- Edit mode: enabled, up to 5 reference images
- Supports `size`
- Current OpenAI-specific caveat: OpenClaw does not forward `aspectRatio` or
`resolution` overrides to the OpenAI Images API today
To use OpenAI as the default image provider:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "openai/gpt-image-1",
      },
    },
  },
}
```
See [Image Generation](/tools/image-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Video generation
The bundled `openai` plugin also registers video generation through the shared
`video_generate` tool.
- Default video model: `openai/sora-2`
- Modes: text-to-video, image-to-video, and single-video reference/edit flows
- Current limits: 1 image or 1 video reference input
- Current OpenAI-specific caveat: OpenClaw does not forward `aspectRatio` or
`resolution` overrides to the native OpenAI video API today
To use OpenAI as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "openai/sora-2",
      },
    },
  },
}
```
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
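Both keys live under `agents.defaults`, so the image and video defaults above can be combined in a single config block:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "openai/gpt-image-1",
      },
      videoGenerationModel: {
        primary: "openai/sora-2",
      },
    },
  },
}
```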
## Option B: OpenAI Code (Codex) subscription
**Best for:** using ChatGPT/Codex subscription access instead of an API key.

View File

@ -1,10 +1,9 @@
---
summary: "Use Qwen Cloud via OpenClaw's bundled qwen provider"
read_when:
- You want to use Qwen with OpenClaw
- You previously used Qwen OAuth
title: "Qwen"
---
# Qwen
@ -127,5 +126,8 @@ Current bundled Qwen video-generation limits:
file paths are rejected up front because the DashScope video endpoint does not
accept uploaded local buffers for those references.
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
See [Qwen / Model Studio](/providers/qwen_modelstudio) for endpoint-level detail
and compatibility notes.

View File

@ -1,11 +1,10 @@
---
title: "Qwen / Model Studio"
summary: "Endpoint detail for the bundled qwen provider and its legacy modelstudio compatibility surface"
read_when:
- You want endpoint-level detail for Qwen Cloud / Alibaba DashScope
- You need the env var compatibility story for the qwen provider
- You want to use the Standard (pay-as-you-go) or Coding Plan endpoint
---
# Qwen / Model Studio (Alibaba Cloud)
@ -135,3 +134,34 @@ endpoint/key pair.
If the Gateway runs as a daemon (launchd/systemd), make sure
`QWEN_API_KEY` is available to that process (for example, in
`~/.openclaw/.env` or via `env.shellEnv`).
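For example, a minimal `~/.openclaw/.env` entry, with a placeholder value rather than a real key:
```bash
# ~/.openclaw/.env
QWEN_API_KEY=your-dashscope-key-here
```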
## Wan video generation
The Standard DashScope surface also backs the bundled Wan video-generation
providers.
You can address the same Wan family through either prefix:
- canonical Qwen refs:
- `qwen/wan2.6-t2v`
- `qwen/wan2.6-i2v`
- `qwen/wan2.6-r2v`
- `qwen/wan2.6-r2v-flash`
- `qwen/wan2.7-r2v`
- direct Alibaba refs:
- `alibaba/wan2.6-t2v`
- `alibaba/wan2.6-i2v`
- `alibaba/wan2.6-r2v`
- `alibaba/wan2.6-r2v-flash`
- `alibaba/wan2.7-r2v`
All Wan reference modes currently require **remote http(s) URLs** for image or
video references. Local file paths are rejected before upload because the
DashScope video endpoint does not accept local-buffer reference assets for
those modes.
## Related
- [Qwen](/providers/qwen)
- [Alibaba Model Studio](/providers/alibaba)
- [Video Generation](/tools/video-generation)

View File

@ -68,3 +68,29 @@ OpenClaw currently ships this bundled Together catalog:
| `together/moonshotai/Kimi-K2-Instruct-0905` | Kimi K2-Instruct 0905 | text | 262,144 | Secondary Kimi text model |
The onboarding preset sets `together/moonshotai/Kimi-K2.5` as the default model.
## Video generation
The bundled `together` plugin also registers video generation through the
shared `video_generate` tool.
- Default video model: `together/Wan-AI/Wan2.2-T2V-A14B`
- Modes: text-to-video and single-image reference flows
- Supports `aspectRatio` and `resolution`
To use Together as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "together/Wan-AI/Wan2.2-T2V-A14B",
      },
    },
  },
}
```
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.

View File

@ -75,6 +75,34 @@ The bundled `grok` web-search provider uses `XAI_API_KEY` too:
openclaw config set tools.web.search.provider grok
```
## Video generation
The bundled `xai` plugin also registers video generation through the shared
`video_generate` tool.
- Default video model: `xai/grok-imagine-video`
- Modes: text-to-video, image-to-video, and remote video edit/extend flows
- Supports `aspectRatio` and `resolution`
- Current limit: local video buffers are not accepted; use remote `http(s)`
URLs for video-reference/edit inputs
To use xAI as the default video provider:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "xai/grok-imagine-video",
      },
    },
  },
}
```
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Known limits
- Auth is API-key only today. There is no xAI OAuth/device-code flow in OpenClaw yet.

View File

@ -24,7 +24,9 @@ The tool only appears when at least one image generation provider is available.
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "openai/gpt-image-1",
      },
    },
  },
}
@ -74,10 +76,6 @@ Not all providers support all parameters. The tool passes what each provider sup
{
  agents: {
    defaults: {
      // Object form: primary + ordered fallbacks
      imageGenerationModel: {
        primary: "openai/gpt-image-1",
        fallbacks: ["google/gemini-3.1-flash-image-preview", "fal/fal-ai/flux/dev"],
@ -135,5 +133,9 @@ MiniMax image generation is available through both bundled MiniMax auth paths:
## Related
- [Tools Overview](/tools) — all available agent tools
- [fal](/providers/fal) — fal image and video provider setup
- [Google (Gemini)](/providers/google) — Gemini image provider setup
- [MiniMax](/providers/minimax) — MiniMax image provider setup
- [OpenAI](/providers/openai) — OpenAI Images provider setup
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `imageGenerationModel` config
- [Models](/concepts/models) — model configuration and failover

View File

@ -24,7 +24,9 @@ The tool only appears when at least one video-generation provider is available.
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "qwen/wan2.6-t2v",
      },
    },
  },
}
@ -121,6 +123,13 @@ The bundled Qwen provider supports text-to-video plus image/video reference mode
## Related
- [Tools Overview](/tools) — all available agent tools
- [Alibaba Model Studio](/providers/alibaba) — direct Wan provider setup
- [Google (Gemini)](/providers/google) — Veo provider setup
- [MiniMax](/providers/minimax) — Hailuo provider setup
- [OpenAI](/providers/openai) — Sora provider setup
- [Qwen](/providers/qwen) — Qwen-specific setup and limits
- [Qwen / Model Studio](/providers/qwen_modelstudio) — endpoint-level DashScope detail
- [Together AI](/providers/together) — Together Wan provider setup
- [xAI](/providers/xai) — Grok video provider setup
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `videoGenerationModel` config
- [Models](/concepts/models) — model configuration and failover