docs: refresh failover and compaction refs

Peter Steinberger 2026-04-04 14:44:42 +01:00
parent bbb73d3171
commit 73584b1d33
5 changed files with 27 additions and 3 deletions

@@ -25,7 +25,9 @@ model sees on the next turn.
Auto-compaction is on by default. It runs when the session nears the context
limit, or when the model returns a context-overflow error (in which case
OpenClaw compacts and retries).
OpenClaw compacts and retries). Typical overflow signatures include
`request_too_large`, `context length exceeded`, `input exceeds the maximum
number of tokens`, and `input is too long for the model`.
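As a minimal sketch of how such signatures could be recognized, the helper below does a case-insensitive substring match against the four strings listed above. The names `OVERFLOW_SIGNATURES` and `isContextOverflow` are hypothetical and chosen for illustration; OpenClaw's actual matcher may be structured differently and may cover more variants.

```typescript
// Hypothetical helper, not OpenClaw's real implementation.
// The list mirrors the overflow signatures named in the paragraph above.
const OVERFLOW_SIGNATURES: readonly string[] = [
  "request_too_large",
  "context length exceeded",
  "input exceeds the maximum number of tokens",
  "input is too long for the model",
];

// Returns true when the provider error message contains any known signature.
// Matching is case-insensitive so "The input is too long..." also matches.
function isContextOverflow(message: string): boolean {
  const normalized = message.toLowerCase();
  return OVERFLOW_SIGNATURES.some((signature) => normalized.includes(signature));
}
```

A caller would typically check `isContextOverflow(error.message)` before deciding whether to compact and retry.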
<Info>
Before compacting, OpenClaw automatically reminds the agent to save important

@@ -120,6 +120,10 @@ If you have both an OAuth profile and an API key profile for the same provider,
When a profile fails due to auth/rate-limit errors (or a timeout that looks
like rate limiting), OpenClaw marks it in cooldown and moves to the next profile.
That rate-limit bucket is broader than plain `429`: it also includes provider
messages such as `Too many concurrent requests`, `ThrottlingException`,
`throttled`, `resource exhausted`, and periodic usage-window limits such as
`weekly/monthly limit reached`.
Format/invalid-request errors (for example Cloud Code Assist tool call ID
validation failures) are treated as failover-worthy and use the same cooldowns.
OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`,
@@ -223,6 +227,8 @@ Model fallback does not continue on:
- explicit aborts that are not timeout/failover-shaped
- context overflow errors that should stay inside compaction/retry logic
(for example `request_too_large`, `INVALID_ARGUMENT: input exceeds the maximum
number of tokens`, or `The input is too long for the model`)
- a final unknown error when there are no candidates left
### Cooldown skip vs probe behavior

@@ -154,7 +154,10 @@ surface.
- `<PROVIDER>_API_KEY_*` (numbered list, e.g. `<PROVIDER>_API_KEY_1`)
- For Google providers, `GOOGLE_API_KEY` is also included as fallback.
- Key selection order preserves priority and deduplicates values.
- Requests are retried with the next key only on rate-limit responses (for example `429`, `rate_limit`, `quota`, `resource exhausted`).
- Requests are retried with the next key only on rate-limit responses (for
example `429`, `rate_limit`, `quota`, `resource exhausted`, `Too many
concurrent requests`, `ThrottlingException`, or periodic usage-limit
messages).
- Non-rate-limit failures fail immediately; no key rotation is attempted.
- When all candidate keys fail, the final error is returned from the last attempt.
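The rotation rules in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not OpenClaw's code: `callWithKeyRotation`, `looksRateLimited`, and `KEY_ROTATION_HINTS` are hypothetical names, and the substring check is a deliberately crude stand-in for whatever classification the real implementation uses.

```typescript
// Hypothetical sketch of key rotation; all names here are illustrative.
type KeyedCall<T> = (apiKey: string) => Promise<T>;

// Lowercased hints taken from the rate-limit examples in the list above.
// Substring matching is crude (e.g. "429" could appear in unrelated text).
const KEY_ROTATION_HINTS: readonly string[] = [
  "429",
  "rate_limit",
  "quota",
  "resource exhausted",
  "too many concurrent requests",
  "throttlingexception",
];

function looksRateLimited(err: unknown): boolean {
  const text = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return KEY_ROTATION_HINTS.some((hint) => text.includes(hint));
}

async function callWithKeyRotation<T>(keys: string[], call: KeyedCall<T>): Promise<T> {
  // A Set keeps first-seen order, so priority is preserved while deduplicating.
  const candidates = Array.from(new Set(keys));
  if (candidates.length === 0) {
    throw new Error("no API keys configured");
  }
  let lastError: unknown;
  for (const key of candidates) {
    try {
      return await call(key);
    } catch (err) {
      lastError = err;
      // Non-rate-limit failures fail immediately; no rotation is attempted.
      if (!looksRateLimited(err)) {
        throw err;
      }
    }
  }
  // Every candidate was rate-limited: surface the error from the last attempt.
  throw lastError;
}
```

With keys `["k1", "k1", "k2"]`, the sketch tries `k1` once, and only moves on to `k2` if the `k1` attempt failed with a rate-limit-shaped error.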

@@ -2362,6 +2362,16 @@ for usage/billing and raise limits as needed.
Cooldowns apply to failing profiles (exponential backoff), so OpenClaw can keep responding even when a provider is rate-limited or temporarily failing.
The rate-limit bucket includes more than plain `429` responses. OpenClaw
also treats messages like `Too many concurrent requests`,
`ThrottlingException`, `resource exhausted`, and periodic usage-window
limits (`weekly/monthly limit reached`) as failover-worthy rate limits.
Context-overflow errors are different: signatures such as
`request_too_large`, `input exceeds the maximum number of tokens`, or
`input is too long for the model` stay on the compaction/retry path instead
of advancing model fallback.
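To make the distinction concrete, here is a small sketch of how an error might be routed to one path or the other. It is an assumption-laden illustration: `routeFailure`, `FailureRoute`, and both hint lists are hypothetical, and the `weekly/monthly limit reached` shorthand is expanded into two separate strings purely for the example.

```typescript
// Hypothetical routing sketch; not OpenClaw's actual classifier.
type FailureRoute = "compact-and-retry" | "cooldown-and-failover" | "unclassified";

// Context-overflow signatures named above (lowercased).
const OVERFLOW_HINTS: readonly string[] = [
  "request_too_large",
  "input exceeds the maximum number of tokens",
  "input is too long for the model",
];

// Rate-limit bucket messages named above (lowercased). The two
// "... limit reached" entries are an assumed expansion of the shorthand.
const THROTTLE_HINTS: readonly string[] = [
  "too many concurrent requests",
  "throttlingexception",
  "resource exhausted",
  "weekly limit reached",
  "monthly limit reached",
];

function routeFailure(message: string, httpStatus?: number): FailureRoute {
  const text = message.toLowerCase();
  // Overflow is checked first so it never advances model fallback.
  if (OVERFLOW_HINTS.some((hint) => text.includes(hint))) {
    return "compact-and-retry";
  }
  if (httpStatus === 429 || THROTTLE_HINTS.some((hint) => text.includes(hint))) {
    return "cooldown-and-failover";
  }
  return "unclassified";
}
```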
</Accordion>
<Accordion title='What does "No credentials found for profile anthropic:default" mean?'>

@@ -216,7 +216,10 @@ Compaction is **persistent** (unlike session pruning). See [/concepts/session-pr
In the embedded Pi agent, auto-compaction triggers in two cases:
1. **Overflow recovery**: the model returns a context overflow error → compact → retry.
1. **Overflow recovery**: the model returns a context overflow error
(`request_too_large`, `context length exceeded`, `input exceeds the maximum
number of tokens`, `input is too long for the model`, and similar
provider-shaped variants) → compact → retry.
2. **Threshold maintenance**: after a successful turn, when:
`contextTokens > contextWindow - reserveTokens`
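
The threshold condition can be expressed directly as a predicate. This is only a sketch: the field names follow the formula above, while the `ContextBudget` interface, the function name, and the numbers in the usage note are illustrative assumptions rather than OpenClaw defaults.

```typescript
// Hypothetical shape; field names follow the formula in the text.
interface ContextBudget {
  contextTokens: number; // tokens currently in the session context
  contextWindow: number; // model's maximum context size
  reserveTokens: number; // headroom kept free for the next turn
}

// True when the session has eaten into the reserved headroom.
// The comparison is strict, matching the `>` in the formula.
function needsThresholdCompaction(budget: ContextBudget): boolean {
  const { contextTokens, contextWindow, reserveTokens } = budget;
  return contextTokens > contextWindow - reserveTokens;
}
```

For example, with an assumed 200,000-token window and 16,000 reserved tokens, the threshold is 184,000: a session at 190,000 tokens would compact, while one at exactly 184,000 would not.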