docs: document music generation async flow

This commit is contained in:
Peter Steinberger 2026-04-06 01:46:25 +01:00
parent 3027f0dde5
commit f6dbcf4cda
No known key found for this signature in database
13 changed files with 253 additions and 46 deletions

View File

@ -30,6 +30,7 @@ Related:
falls back to `agents.defaults.imageModel`, then the resolved session/default
model.
- `agents.defaults.imageGenerationModel` is used by the shared image-generation capability. If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order. If you set a specific provider/model, also configure that provider's auth/API key.
- `agents.defaults.musicGenerationModel` is used by the shared music-generation capability. If omitted, `music_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered music-generation providers in provider-id order. If you set a specific provider/model, also configure that provider's auth/API key.
- `agents.defaults.videoGenerationModel` is used by the shared video-generation capability. If omitted, `video_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered video-generation providers in provider-id order. If you set a specific provider/model, also configure that provider's auth/API key.
- Per-agent defaults can override `agents.defaults.model` via `agents.list[].model` plus bindings (see [/concepts/multi-agent](/concepts/multi-agent)).
@ -253,5 +254,6 @@ This applies whenever OpenClaw regenerates `models.json`, including command-driv
- [Model Providers](/concepts/model-providers) — provider routing and auth
- [Model Failover](/concepts/model-failover) — fallback chains
- [Image Generation](/tools/image-generation) — image model configuration
- [Music Generation](/tools/music-generation) — music model configuration
- [Video Generation](/tools/video-generation) — video model configuration
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — model config keys

View File

@ -1026,6 +1026,11 @@ Time format in system prompt. Default: `auto` (OS preference).
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-1` for OpenAI Images.
- If you select a provider/model directly, configure the matching provider auth/API key too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` for `openai/*`, `FAL_KEY` for `fal/*`).
- If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order.
- `musicGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared music-generation capability and the built-in `music_generate` tool.
- Typical values: `google/lyria-3-clip-preview`, `google/lyria-3-pro-preview`, or `minimax/music-2.5+`.
- If omitted, `music_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered music-generation providers in provider-id order.
- If you select a provider/model directly, configure the matching provider auth/API key too.
- `videoGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared video-generation capability and the built-in `video_generate` tool.
- Typical values: `qwen/wan2.6-t2v`, `qwen/wan2.6-i2v`, `qwen/wan2.6-r2v`, `qwen/wan2.6-r2v-flash`, or `qwen/wan2.7-r2v`.

View File

@ -35,6 +35,7 @@ native OpenClaw plugin registers against one or more capability types:
| Realtime voice | `api.registerRealtimeVoiceProvider(...)` | `openai` |
| Media understanding | `api.registerMediaUnderstandingProvider(...)` | `openai`, `google` |
| Image generation | `api.registerImageGenerationProvider(...)` | `openai`, `google`, `fal`, `minimax` |
| Music generation | `api.registerMusicGenerationProvider(...)` | `google`, `minimax` |
| Video generation | `api.registerVideoGenerationProvider(...)` | `qwen` |
| Web fetch | `api.registerWebFetchProvider(...)` | `firecrawl` |
| Web search | `api.registerWebSearchProvider(...)` | `google` |

View File

@ -157,6 +157,7 @@ A single plugin can register any number of capabilities via the `api` object:
| Realtime voice | `api.registerRealtimeVoiceProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Media understanding | `api.registerMediaUnderstandingProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Image generation | `api.registerImageGenerationProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Music generation | `api.registerMusicGenerationProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Video generation | `api.registerVideoGenerationProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Web fetch | `api.registerWebFetchProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
| Web search | `api.registerWebSearchProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |

View File

@ -128,26 +128,26 @@ Those belong in your plugin code and `package.json`.
## Top-level field reference
| Field | Required | Type | What it means |
| ----------------------------------- | -------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `id` | Yes | `string` | Canonical plugin id. This is the id used in `plugins.entries.<id>`. |
| `configSchema` | Yes | `object` | Inline JSON Schema for this plugin's config. |
| `enabledByDefault` | No | `true` | Marks a bundled plugin as enabled by default. Omit it, or set any non-`true` value, to leave the plugin disabled by default. |
| `legacyPluginIds` | No | `string[]` | Legacy ids that normalize to this canonical plugin id. |
| `autoEnableWhenConfiguredProviders` | No | `string[]` | Provider ids that should auto-enable this plugin when auth, config, or model refs mention them. |
| `kind` | No | `"memory"` \| `"context-engine"` | Declares an exclusive plugin kind used by `plugins.slots.*`. |
| `channels` | No | `string[]` | Channel ids owned by this plugin. Used for discovery and config validation. |
| `providers` | No | `string[]` | Provider ids owned by this plugin. |
| `modelSupport` | No | `object` | Manifest-owned shorthand model-family metadata used to auto-load the plugin before runtime. |
| `providerAuthEnvVars` | No | `Record<string, string[]>` | Cheap provider-auth env metadata that OpenClaw can inspect without loading plugin code. |
| `providerAuthChoices` | No | `object[]` | Cheap auth-choice metadata for onboarding pickers, preferred-provider resolution, and simple CLI flag wiring. |
| `contracts` | No | `object` | Static bundled capability snapshot for speech, realtime transcription, realtime voice, media-understanding, image-generation, video-generation, web-fetch, web search, and tool ownership. |
| `channelConfigs` | No | `Record<string, object>` | Manifest-owned channel config metadata merged into discovery and validation surfaces before runtime loads. |
| `skills` | No | `string[]` | Skill directories to load, relative to the plugin root. |
| `name` | No | `string` | Human-readable plugin name. |
| `description` | No | `string` | Short summary shown in plugin surfaces. |
| `version` | No | `string` | Informational plugin version. |
| `uiHints` | No | `Record<string, object>` | UI labels, placeholders, and sensitivity hints for config fields. |
| Field | Required | Type | What it means |
| ----------------------------------- | -------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `id` | Yes | `string` | Canonical plugin id. This is the id used in `plugins.entries.<id>`. |
| `configSchema` | Yes | `object` | Inline JSON Schema for this plugin's config. |
| `enabledByDefault` | No | `true` | Marks a bundled plugin as enabled by default. Omit it, or set any non-`true` value, to leave the plugin disabled by default. |
| `legacyPluginIds` | No | `string[]` | Legacy ids that normalize to this canonical plugin id. |
| `autoEnableWhenConfiguredProviders` | No | `string[]` | Provider ids that should auto-enable this plugin when auth, config, or model refs mention them. |
| `kind` | No | `"memory"` \| `"context-engine"` | Declares an exclusive plugin kind used by `plugins.slots.*`. |
| `channels` | No | `string[]` | Channel ids owned by this plugin. Used for discovery and config validation. |
| `providers` | No | `string[]` | Provider ids owned by this plugin. |
| `modelSupport` | No | `object` | Manifest-owned shorthand model-family metadata used to auto-load the plugin before runtime. |
| `providerAuthEnvVars` | No | `Record<string, string[]>` | Cheap provider-auth env metadata that OpenClaw can inspect without loading plugin code. |
| `providerAuthChoices` | No | `object[]` | Cheap auth-choice metadata for onboarding pickers, preferred-provider resolution, and simple CLI flag wiring. |
| `contracts` | No | `object` | Static bundled capability snapshot for speech, realtime transcription, realtime voice, media-understanding, image-generation, music-generation, video-generation, web-fetch, web search, and tool ownership. |
| `channelConfigs` | No | `Record<string, object>` | Manifest-owned channel config metadata merged into discovery and validation surfaces before runtime loads. |
| `skills` | No | `string[]` | Skill directories to load, relative to the plugin root. |
| `name` | No | `string` | Human-readable plugin name. |
| `description` | No | `string` | Short summary shown in plugin surfaces. |
| `version` | No | `string` | Informational plugin version. |
| `uiHints` | No | `Record<string, object>` | UI labels, placeholders, and sensitivity hints for config fields. |
## providerAuthChoices reference

View File

@ -263,6 +263,8 @@ Current bundled provider examples:
| `plugin-sdk/realtime-transcription` | Realtime transcription helpers | Provider types and registry helpers |
| `plugin-sdk/realtime-voice` | Realtime voice helpers | Provider types and registry helpers |
| `plugin-sdk/image-generation-core` | Shared image-generation core | Image-generation types, failover, auth, and registry helpers |
| `plugin-sdk/music-generation` | Music-generation helpers | Music-generation provider/request/result types |
| `plugin-sdk/music-generation-core` | Shared music-generation core | Music-generation types, failover helpers, provider lookup, and model-ref parsing |
| `plugin-sdk/video-generation` | Video-generation helpers | Video-generation provider/request/result types |
| `plugin-sdk/video-generation-core` | Shared video-generation core | Video-generation types, failover helpers, provider lookup, and model-ref parsing |
| `plugin-sdk/interactive-runtime` | Interactive reply helpers | Interactive reply payload normalization/reduction |

View File

@ -232,6 +232,8 @@ explicitly promotes one as public.
| `plugin-sdk/realtime-voice` | Realtime voice provider types and registry helpers |
| `plugin-sdk/image-generation` | Image generation provider types |
| `plugin-sdk/image-generation-core` | Shared image-generation types, failover, auth, and registry helpers |
| `plugin-sdk/music-generation` | Music generation provider/request/result types |
| `plugin-sdk/music-generation-core` | Shared music-generation types, failover helpers, provider lookup, and model-ref parsing |
| `plugin-sdk/video-generation` | Video generation provider/request/result types |
| `plugin-sdk/video-generation-core` | Shared video-generation types, failover helpers, provider lookup, and model-ref parsing |
| `plugin-sdk/webhook-targets` | Webhook target registry and route-install helpers |
@ -288,6 +290,7 @@ methods:
| `api.registerRealtimeVoiceProvider(...)` | Duplex realtime voice sessions |
| `api.registerMediaUnderstandingProvider(...)` | Image/audio/video analysis |
| `api.registerImageGenerationProvider(...)` | Image generation |
| `api.registerMusicGenerationProvider(...)` | Music generation |
| `api.registerVideoGenerationProvider(...)` | Video generation |
| `api.registerWebFetchProvider(...)` | Web fetch / scrape provider |
| `api.registerWebSearchProvider(...)` | Web search |

View File

@ -51,6 +51,7 @@ openclaw onboard --non-interactive \
| ---------------------- | ----------------- |
| Chat completions | Yes |
| Image generation | Yes |
| Music generation | Yes |
| Image understanding | Yes |
| Audio transcription | Yes |
| Video understanding | Yes |
@ -144,6 +145,35 @@ To use Google as the default video provider:
See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Music generation
The bundled `google` plugin also registers music generation through the shared
`music_generate` tool.
- Default music model: `google/lyria-3-clip-preview`
- Also supports `google/lyria-3-pro-preview`
- Prompt controls: `lyrics` and `instrumental`
- Output format: `mp3` by default, plus `wav` on `google/lyria-3-pro-preview`
- Reference inputs: up to 10 images
- Session-backed runs detach through the shared task/status flow, including `action: "status"`
To use Google as the default music provider:
```json5
{
agents: {
defaults: {
musicGenerationModel: {
primary: "google/lyria-3-clip-preview",
},
},
},
}
```
See [Music Generation](/tools/music-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Environment note
If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY`

View File

@ -72,7 +72,7 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/Mattermost (plugi
- [Additional bundled variants](/providers/models#additional-bundled-provider-variants) - Anthropic Vertex, Copilot Proxy, and Gemini CLI OAuth
- [Image Generation](/tools/image-generation) - Shared `image_generate` tool, provider selection, and failover
- [Music Generation](/tools/music-generation) - Plugin-provided `music_generate` tool surfaces
- [Music Generation](/tools/music-generation) - Shared `music_generate` tool, provider selection, and failover
- [Video Generation](/tools/video-generation) - Shared `video_generate` tool, provider selection, and failover
## Transcription providers

View File

@ -14,6 +14,7 @@ MiniMax also provides:
- bundled speech synthesis via T2A v2
- bundled image understanding via `MiniMax-VL-01`
- bundled music generation via `music-2.5+`
- bundled `web_search` through the MiniMax Coding Plan search API
Provider split:
@ -66,6 +67,34 @@ through the plugin-owned `MiniMax-VL-01` media provider.
See [Image Generation](/tools/image-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Music generation
The bundled `minimax` plugin also registers music generation through the shared
`music_generate` tool.
- Default music model: `minimax/music-2.5+`
- Also supports `minimax/music-2.5` and `minimax/music-2.0`
- Prompt controls: `lyrics`, `instrumental`, `durationSeconds`
- Output format: `mp3`
- Session-backed runs detach through the shared task/status flow, including `action: "status"`
To use MiniMax as the default music provider:
```json5
{
agents: {
defaults: {
musicGenerationModel: {
primary: "minimax/music-2.5+",
},
},
},
}
```
See [Music Generation](/tools/music-generation) for the shared tool
parameters, provider selection, and failover behavior.
## Video generation
The bundled `minimax` plugin also registers video generation through the shared

View File

@ -66,6 +66,7 @@ These tools ship with OpenClaw and are available without installing any plugins:
| `nodes` | Discover and target paired devices | |
| `cron` / `gateway` | Manage scheduled jobs; inspect, patch, restart, or update the gateway | |
| `image` / `image_generate` | Analyze or generate images | [Image Generation](/tools/image-generation) |
| `music_generate` | Generate music tracks | [Music Generation](/tools/music-generation) |
| `video_generate` | Generate videos | [Video Generation](/tools/video-generation) |
| `tts` | One-shot text-to-speech conversion | [TTS](/tools/tts) |
| `sessions_*` / `subagents` / `agents_list` | Session management, status, and sub-agent orchestration | [Sub-agents](/tools/subagents) |
@ -73,6 +74,8 @@ These tools ship with OpenClaw and are available without installing any plugins:
For image work, use `image` for analysis and `image_generate` for generation or editing. If you target `openai/*`, `google/*`, `fal/*`, or another non-default image provider, configure that provider's auth/API key first.
For music work, use `music_generate`. If you target `google/*`, `minimax/*`, or another non-default music provider, configure that provider's auth/API key first.
For video work, use `video_generate`. If you target `qwen/*` or another non-default video provider, configure that provider's auth/API key first.
For workflow-driven audio generation, use `music_generate` when a plugin such as
@ -128,12 +131,12 @@ config. Deny always wins over allow.
`tools.profile` sets a base allowlist before `allow`/`deny` is applied.
Per-agent override: `agents.list[].tools.profile`.
| Profile | What it includes |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `full` | No restriction (same as unset) |
| `coding` | `group:fs`, `group:runtime`, `group:web`, `group:sessions`, `group:memory`, `cron`, `image`, `image_generate`, `video_generate` |
| `messaging` | `group:messaging`, `sessions_list`, `sessions_history`, `sessions_send`, `session_status` |
| `minimal` | `session_status` only |
| Profile | What it includes |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| `full` | No restriction (same as unset) |
| `coding` | `group:fs`, `group:runtime`, `group:web`, `group:sessions`, `group:memory`, `cron`, `image`, `image_generate`, `music_generate`, `video_generate` |
| `messaging` | `group:messaging`, `sessions_list`, `sessions_history`, `sessions_send`, `session_status` |
| `minimal` | `session_status` only |
### Tool groups
@ -151,7 +154,7 @@ Use `group:*` shorthands in allow/deny lists:
| `group:messaging` | message |
| `group:nodes` | nodes |
| `group:agents` | agents_list |
| `group:media` | image, image_generate, video_generate, tts |
| `group:media` | image, image_generate, music_generate, video_generate, tts |
| `group:openclaw` | All built-in OpenClaw tools (excludes plugin tools) |
`sessions_history` returns a bounded, safety-filtered recall view. It strips

View File

@ -1,23 +1,68 @@
---
summary: "Generate music or audio with plugin-provided tools such as ComfyUI workflows"
summary: "Generate music with shared providers or plugin-provided workflows"
read_when:
- Generating music or audio via the agent
- Configuring plugin-provided music generation tools
- Configuring music generation providers and models
- Understanding the music_generate tool parameters
title: "Music Generation"
---
# Music Generation
The `music_generate` tool lets the agent create audio files when a plugin
registers music generation support.
The `music_generate` tool lets the agent create music or audio through either:
The bundled `comfy` plugin currently provides `music_generate` using a
workflow-configured ComfyUI graph.
- the shared music-generation capability with configured providers such as Google and MiniMax
- plugin-provided tool surfaces such as a workflow-configured ComfyUI graph
For shared provider-backed agent sessions, OpenClaw starts music generation as a
background task, tracks it in the task ledger, then wakes the agent again when
the track is ready so the agent can post the finished audio back into the
original channel.
<Note>
The built-in shared tool only appears when at least one music-generation provider is available. If you don't see `music_generate` in your agent's tools, configure `agents.defaults.musicGenerationModel` or set up a provider API key.
</Note>
<Note>
Plugin-provided `music_generate` implementations can expose different parameters or runtime behavior. The async task/status flow below applies to the built-in shared provider-backed path.
</Note>
## Quick start
1. Configure `models.providers.comfy.music` with a workflow JSON and prompt/output nodes.
### Shared provider-backed generation
1. Set an API key for at least one provider, for example `GEMINI_API_KEY` or
`MINIMAX_API_KEY`.
2. Optionally set your preferred model:
```json5
{
agents: {
defaults: {
musicGenerationModel: {
primary: "google/lyria-3-clip-preview",
},
},
},
}
```
3. Ask the agent: _"Generate an upbeat synthpop track about a night drive
through a neon city."_
The agent calls `music_generate` automatically. No tool allow-listing needed.
For direct synchronous contexts without a session-backed agent run, the built-in
tool still falls back to inline generation and returns the final media path in
the tool result.
### Workflow-driven plugin generation
The bundled `comfy` plugin can also provide `music_generate` using a
workflow-configured ComfyUI graph.
1. Configure `models.providers.comfy.music` with a workflow JSON and
prompt/output nodes.
2. If you use Comfy Cloud, set `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY`.
3. Ask the agent for music or call the tool directly.
@ -27,22 +72,102 @@ Example:
/tool music_generate prompt="Warm ambient synth loop with soft tape texture"
```
## Tool parameters
## Shared bundled provider support
| Parameter | Type | Description |
| ---------- | ------ | --------------------------------------------------- |
| `prompt` | string | Music or audio generation prompt |
| `action` | string | `"generate"` (default) or `"list"` |
| `model` | string | Provider/model override. Currently `comfy/workflow` |
| `filename` | string | Output filename hint for the saved audio file |
| Provider | Default model | Reference inputs | Supported controls | API key |
| -------- | ---------------------- | ---------------- | --------------------------------------------------------- | ---------------------------------- |
| Google | `lyria-3-clip-preview` | Up to 10 images | `lyrics`, `instrumental`, `format` | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
| MiniMax | `music-2.5+` | None | `lyrics`, `instrumental`, `durationSeconds`, `format=mp3` | `MINIMAX_API_KEY` |
## Current provider support
## Plugin-provided support
| Provider | Model | Notes |
| -------- | ---------- | ------------------------------- |
| ComfyUI | `workflow` | Workflow-defined music or audio |
## Live test
Use `action: "list"` to inspect available shared providers and models at
runtime:
```text
/tool music_generate action=list
```
## Built-in tool parameters
| Parameter | Type | Description |
| ----------------- | -------- | ------------------------------------------------------------------------------------------------- |
| `prompt` | string | Music generation prompt (required for `action: "generate"`) |
| `action` | string | `"generate"` (default), `"status"` for the current session task, or `"list"` to inspect providers |
| `model` | string | Provider/model override, e.g. `google/lyria-3-pro-preview` or `comfy/workflow` |
| `lyrics` | string | Optional lyrics when the provider supports explicit lyric input |
| `instrumental` | boolean | Request instrumental-only output when the provider supports it |
| `image` | string | Single reference image path or URL |
| `images` | string[] | Multiple reference images (up to 10) |
| `durationSeconds` | number | Target duration in seconds when the provider supports duration hints |
| `format` | string | Output format hint (`mp3` or `wav`) when the provider supports it |
| `filename` | string | Output filename hint |
Not all providers or plugins support all parameters. The shared built-in tool
validates provider capability limits before it submits the request.
## Async behavior for the shared provider-backed path
- Session-backed agent runs: `music_generate` creates a background task, returns a started/task response immediately, and posts the finished track later in a follow-up agent message.
- Duplicate prevention: while that background task is still `queued` or `running`, later `music_generate` calls in the same session return task status instead of starting another generation.
- Status lookup: use `action: "status"` to inspect the active session-backed music task without starting a new one.
- Task tracking: use `openclaw tasks list` or `openclaw tasks show <taskId>` to inspect queued, running, and terminal status for the generation.
- Completion wake: OpenClaw injects an internal completion event back into the same session so the model can write the user-facing follow-up itself.
- Prompt hint: later user/manual turns in the same session get a small runtime hint when a music task is already in flight so the model does not blindly call `music_generate` again.
- No-session fallback: direct/local contexts without a real agent session still run inline and return the final audio result in the same turn.
## Configuration
### Model selection
```json5
{
agents: {
defaults: {
musicGenerationModel: {
primary: "google/lyria-3-clip-preview",
fallbacks: ["minimax/music-2.5+"],
},
},
},
}
```
### Provider selection order
When generating music, OpenClaw tries providers in this order:
1. `model` parameter from the tool call, if the agent specifies one
2. `musicGenerationModel.primary` from config
3. `musicGenerationModel.fallbacks` in order
4. Auto-detection using auth-backed provider defaults only:
- current default provider first
- remaining registered music-generation providers in provider-id order
If a provider fails, the next candidate is tried automatically. If all fail, the
error includes details from each attempt.
## Provider notes
- Google uses Lyria 3 batch generation. The current bundled flow supports
prompt, optional lyrics text, and optional reference images.
- MiniMax uses the batch `music_generation` endpoint. The current bundled flow
supports prompt, optional lyrics, instrumental mode, duration steering, and
mp3 output.
- ComfyUI support is workflow-driven and depends on the configured graph plus
node mapping for prompt/output fields.
## Live tests
Opt-in live coverage for the shared bundled providers:
```bash
OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/music-generation-providers.live.test.ts
```
Opt-in live coverage for the bundled ComfyUI music path:
@ -50,10 +175,15 @@ Opt-in live coverage for the bundled ComfyUI music path:
OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts
```
The live file also covers comfy image and video workflows when those sections
are configured.
The Comfy live file also covers comfy image and video workflows when those
sections are configured.
## Related
- [Background Tasks](/automation/tasks) - task tracking for detached `music_generate` runs
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) - `musicGenerationModel` config
- [ComfyUI](/providers/comfy)
- [Google (Gemini)](/providers/google)
- [MiniMax](/providers/minimax)
- [Models](/concepts/models) - model configuration and failover
- [Tools Overview](/tools)

View File

@ -319,6 +319,7 @@ Common registration methods:
| `registerRealtimeVoiceProvider` | Duplex realtime voice |
| `registerMediaUnderstandingProvider` | Image/audio analysis |
| `registerImageGenerationProvider` | Image generation |
| `registerMusicGenerationProvider` | Music generation |
| `registerVideoGenerationProvider` | Video generation |
| `registerWebFetchProvider` | Web fetch / scrape provider |
| `registerWebSearchProvider` | Web search |