mirror of https://github.com/openclaw/openclaw.git
docs: expand model fallback guide
This commit is contained in:
parent
5bea93fd63
commit
d01cb5ecc6
|
|
@ -3,6 +3,7 @@ summary: "How OpenClaw rotates auth profiles and falls back across models"
|
|||
read_when:
|
||||
- Diagnosing auth profile rotation, cooldowns, or model fallback behavior
|
||||
- Updating failover rules for auth profiles or models
|
||||
- Understanding how session model overrides interact with fallback retries
|
||||
title: "Model Failover"
|
||||
---
|
||||
|
||||
|
|
@ -15,6 +16,44 @@ OpenClaw handles failures in two stages:
|
|||
|
||||
This doc explains the runtime rules and the data that backs them.
|
||||
|
||||
## Runtime flow
|
||||
|
||||
For a normal text run, OpenClaw evaluates candidates in this order:
|
||||
|
||||
1. The currently selected session model.
|
||||
2. Configured `agents.defaults.model.fallbacks` in order.
|
||||
3. The configured primary model at the end when the run started from an override.
|
||||
|
||||
Inside each candidate, OpenClaw tries auth-profile failover before advancing to
|
||||
the next model candidate.
|
||||
|
||||
High-level sequence:
|
||||
|
||||
1. Resolve the active session model and auth-profile preference.
|
||||
2. Build the model candidate chain.
|
||||
3. Try the current provider with auth-profile rotation/cooldown rules.
|
||||
4. If that provider is exhausted with a failover-worthy error, move to the next
|
||||
model candidate.
|
||||
5. Persist the selected fallback override before the retry starts so other
|
||||
session readers see the same provider/model the runner is about to use.
|
||||
6. If the fallback candidate fails, roll back only the fallback-owned session
|
||||
override fields when they still match that failed candidate.
|
||||
7. If every candidate fails, throw a `FallbackSummaryError` with per-attempt
|
||||
detail and the soonest cooldown expiry when one is known.
|
||||
|
||||
This is intentionally narrower than "save and restore the whole session". The
|
||||
reply runner only persists the model-selection fields it owns for fallback:
|
||||
|
||||
- `providerOverride`
|
||||
- `modelOverride`
|
||||
- `authProfileOverride`
|
||||
- `authProfileOverrideSource`
|
||||
- `authProfileOverrideCompactionCount`
|
||||
|
||||
That prevents a failed fallback retry from overwriting newer unrelated session
|
||||
mutations such as manual `/model` changes or session rotation updates that
|
||||
happened while the attempt was running.
|
||||
|
||||
## Auth storage (keys + OAuth)
|
||||
|
||||
OpenClaw uses **auth profiles** for both API keys and OAuth tokens.
|
||||
|
|
@ -148,6 +187,102 @@ with `auth.cooldowns.overloadedProfileRotations`,
|
|||
When a run starts with a model override (hooks or CLI), fallbacks still end at
|
||||
`agents.defaults.model.primary` after trying any configured fallbacks.
|
||||
|
||||
### Candidate chain rules
|
||||
|
||||
OpenClaw builds the candidate list from the currently requested `provider/model`
|
||||
plus configured fallbacks.
|
||||
|
||||
Rules:
|
||||
|
||||
- The requested model is always first.
|
||||
- Explicit configured fallbacks are deduplicated but not filtered by the model
|
||||
allowlist. They are treated as explicit operator intent.
|
||||
- If the current run is already on a configured fallback in the same provider
|
||||
family, OpenClaw keeps using the full configured chain.
|
||||
- If the current run is on a different provider than config and that current
|
||||
model is not already part of the configured fallback chain, OpenClaw does not
|
||||
append unrelated configured fallbacks from another provider.
|
||||
- When the run started from an override, the configured primary is appended at
|
||||
the end so the chain can settle back onto the normal default once earlier
|
||||
candidates are exhausted.
|
||||
|
||||
### Which errors advance fallback
|
||||
|
||||
Model fallback continues on:
|
||||
|
||||
- auth failures
|
||||
- rate limits and cooldown exhaustion
|
||||
- overloaded/provider-busy errors
|
||||
- timeout-shaped failover errors
|
||||
- billing disables
|
||||
- `LiveSessionModelSwitchError`, which is normalized into a failover path so a
|
||||
stale persisted model does not create an outer retry loop
|
||||
- other unrecognized errors when there are still remaining candidates
|
||||
|
||||
Model fallback does not continue on:
|
||||
|
||||
- explicit aborts that are not timeout/failover-shaped
|
||||
- context overflow errors that should stay inside compaction/retry logic
|
||||
- a final unknown error when there are no candidates left
|
||||
|
||||
### Cooldown skip vs probe behavior
|
||||
|
||||
When every auth profile for a provider is already in cooldown, OpenClaw does
|
||||
not automatically skip that provider forever. It makes a per-candidate decision:
|
||||
|
||||
- Persistent auth failures skip the whole provider immediately.
|
||||
- Billing disables usually skip, but the primary candidate can still be probed
|
||||
on a throttle so recovery is possible without restarting.
|
||||
- The primary candidate may be probed near cooldown expiry, with a per-provider
|
||||
throttle.
|
||||
- Same-provider fallback siblings can be attempted despite cooldown when the
|
||||
failure looks transient (`rate_limit`, `overloaded`, or unknown).
|
||||
- Transient cooldown probes are limited to one per provider per fallback run so
|
||||
a single provider does not stall cross-provider fallback.
|
||||
|
||||
## Session overrides and live model switching
|
||||
|
||||
Session model changes are shared state. The active runner, `/model` command,
|
||||
compaction/session updates, and live-session reconciliation all read or write
|
||||
parts of the same session entry.
|
||||
|
||||
That means fallback retries have to coordinate with live model switching:
|
||||
|
||||
- Before a fallback retry starts, the reply runner persists the selected
|
||||
fallback override fields to the session entry.
|
||||
- Live-session reconciliation prefers persisted session overrides over stale
|
||||
runtime model fields.
|
||||
- If the fallback attempt fails, the runner rolls back only the override fields
|
||||
it wrote, and only if they still match that failed candidate.
|
||||
|
||||
This prevents the classic race:
|
||||
|
||||
1. Primary fails.
|
||||
2. Fallback candidate is chosen in memory.
|
||||
3. Session store still says the old primary.
|
||||
4. Live-session reconciliation reads the stale session state.
|
||||
5. The retry gets snapped back to the old model before the fallback attempt
|
||||
starts.
|
||||
|
||||
The persisted fallback override closes that window, and the narrow rollback
|
||||
keeps newer manual or runtime session changes intact.
|
||||
|
||||
## Observability and failure summaries
|
||||
|
||||
`runWithModelFallback(...)` records per-attempt details that feed logs and
|
||||
user-facing cooldown messaging:
|
||||
|
||||
- provider/model attempted
|
||||
- reason (`rate_limit`, `overloaded`, `billing`, `auth`, `model_not_found`, and
|
||||
similar failover reasons)
|
||||
- optional status/code
|
||||
- human-readable error summary
|
||||
|
||||
When every candidate fails, OpenClaw throws `FallbackSummaryError`. The outer
|
||||
reply runner can use that to build a more specific message such as "all models
|
||||
are temporarily rate-limited" and include the soonest cooldown expiry when one
|
||||
is known.
|
||||
|
||||
## Related config
|
||||
|
||||
See [Gateway configuration](/gateway/configuration) for:
|
||||
|
|
|
|||
|
|
@ -16,6 +16,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
|
|||
- Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
|
||||
- If you set `agents.defaults.models`, it becomes the allowlist.
|
||||
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
|
||||
- Fallback runtime rules, cooldown probes, and session-override persistence are
|
||||
documented in [/concepts/model-failover](/concepts/model-failover).
|
||||
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
|
||||
OpenClaw merges that output into `models.providers` before writing
|
||||
`models.json`.
|
||||
|
|
|
|||
Loading…
Reference in New Issue