mirror of https://github.com/openclaw/openclaw.git

docs: refresh failover and compaction refs

parent bbb73d3171
commit 73584b1d33
@@ -25,7 +25,9 @@ model sees on the next turn.

 Auto-compaction is on by default. It runs when the session nears the context
 limit, or when the model returns a context-overflow error (in which case
-OpenClaw compacts and retries).
+OpenClaw compacts and retries). Typical overflow signatures include
+`request_too_large`, `context length exceeded`, `input exceeds the maximum
+number of tokens`, and `input is too long for the model`.

 <Info>
 Before compacting, OpenClaw automatically reminds the agent to save important
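The overflow-signature matching described in this hunk can be sketched as a simple substring check. This is an illustrative sketch only, not OpenClaw's actual implementation; the function name `isOverflowError` is invented, and the signature list is taken directly from the doc text above.

```typescript
// Overflow signatures listed in the docs; real provider messages vary,
// so matching is case-insensitive substring containment (assumption).
const OVERFLOW_SIGNATURES = [
  "request_too_large",
  "context length exceeded",
  "input exceeds the maximum number of tokens",
  "input is too long for the model",
];

/** Returns true when an error message looks like a context overflow. */
function isOverflowError(message: string): boolean {
  const lower = message.toLowerCase();
  return OVERFLOW_SIGNATURES.some((sig) => lower.includes(sig));
}
```

A matcher like this is what lets the agent decide to compact and retry rather than surface the error.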
@@ -120,6 +120,10 @@ If you have both an OAuth profile and an API key profile for the same provider,

 When a profile fails due to auth/rate‑limit errors (or a timeout that looks
 like rate limiting), OpenClaw marks it in cooldown and moves to the next profile.
+That rate-limit bucket is broader than plain `429`: it also includes provider
+messages such as `Too many concurrent requests`, `ThrottlingException`,
+`throttled`, `resource exhausted`, and periodic usage-window limits such as
+`weekly/monthly limit reached`.
 Format/invalid‑request errors (for example Cloud Code Assist tool call ID
 validation failures) are treated as failover‑worthy and use the same cooldowns.
 OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`,
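The broader-than-`429` rate-limit bucket this hunk describes could be classified like so. A minimal sketch under the assumption that classification is substring-based; `isRateLimitError` and the exact signature strings are illustrative, drawn from the examples in the doc text.

```typescript
// Rate-limit signatures from the docs: plain 429 plus provider-specific
// throttling messages and periodic usage-window limits.
const RATE_LIMIT_SIGNATURES = [
  "429",
  "rate limit",
  "rate_limit",
  "too many concurrent requests",
  "throttlingexception",
  "throttled",
  "resource exhausted",
  "weekly limit reached",
  "monthly limit reached",
];

/** Returns true when an error message should count as a rate limit. */
function isRateLimitError(message: string): boolean {
  const lower = message.toLowerCase();
  return RATE_LIMIT_SIGNATURES.some((sig) => lower.includes(sig));
}
```

Anything matched here would put the profile in cooldown and advance failover to the next profile.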
@@ -223,6 +227,8 @@ Model fallback does not continue on:

 - explicit aborts that are not timeout/failover-shaped
 - context overflow errors that should stay inside compaction/retry logic
+  (for example `request_too_large`, `INVALID_ARGUMENT: input exceeds the maximum
+  number of tokens`, or `The input is too long for the model`)
 - a final unknown error when there are no candidates left

 ### Cooldown skip vs probe behavior
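The stop conditions above amount to a routing rule: overflow stays on the compaction/retry path, rate limits advance failover, and everything else surfaces. A self-contained sketch, with invented names (`routeError`, `ErrorRoute`) and signatures copied from the doc text:

```typescript
type ErrorRoute = "compact-and-retry" | "failover" | "surface";

// Illustrative routing rule, not OpenClaw's real dispatcher: context
// overflow is handled by compaction, rate limits by profile failover,
// and anything else is surfaced to the user.
function routeError(message: string): ErrorRoute {
  const m = message.toLowerCase();
  const overflow = [
    "request_too_large",
    "input exceeds the maximum number of tokens",
    "input is too long for the model",
  ];
  const rateLimit = [
    "429",
    "too many concurrent requests",
    "throttlingexception",
    "resource exhausted",
  ];
  if (overflow.some((s) => m.includes(s))) return "compact-and-retry";
  if (rateLimit.some((s) => m.includes(s))) return "failover";
  return "surface";
}
```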
@@ -154,7 +154,10 @@ surface.

 - `<PROVIDER>_API_KEY_*` (numbered list, e.g. `<PROVIDER>_API_KEY_1`)
 - For Google providers, `GOOGLE_API_KEY` is also included as fallback.
 - Key selection order preserves priority and deduplicates values.
-- Requests are retried with the next key only on rate-limit responses (for example `429`, `rate_limit`, `quota`, `resource exhausted`).
+- Requests are retried with the next key only on rate-limit responses (for
+  example `429`, `rate_limit`, `quota`, `resource exhausted`, `Too many
+  concurrent requests`, `ThrottlingException`, or periodic usage-limit
+  messages).
 - Non-rate-limit failures fail immediately; no key rotation is attempted.
 - When all candidate keys fail, the final error is returned from the last attempt.
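The rotate-only-on-rate-limit behavior in this hunk can be sketched as a loop over candidate keys. This is a simplified synchronous sketch (real requests would be async); `requestWithKeys` and its parameters are hypothetical names, not OpenClaw's API.

```typescript
// Sketch: try keys in priority order; rotate only on rate-limit errors,
// fail immediately on anything else, and surface the last error when
// every candidate key has been exhausted.
function requestWithKeys<T>(
  keys: string[],                          // already prioritized + deduplicated
  send: (key: string) => T,                // performs one request attempt
  isRateLimit: (err: unknown) => boolean,  // classifier from the docs' bucket
): T {
  let lastError: unknown = new Error("no API keys available");
  for (const key of keys) {
    try {
      return send(key);
    } catch (err) {
      lastError = err;
      if (!isRateLimit(err)) throw err;    // non-rate-limit: no rotation
      // rate-limited: fall through and try the next key
    }
  }
  throw lastError;                          // all keys failed: last attempt's error
}
```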
@@ -2362,6 +2362,16 @@ for usage/billing and raise limits as needed.

 Cooldowns apply to failing profiles (exponential backoff), so OpenClaw can keep responding even when a provider is rate-limited or temporarily failing.

+The rate-limit bucket includes more than plain `429` responses. OpenClaw
+also treats messages like `Too many concurrent requests`,
+`ThrottlingException`, `resource exhausted`, and periodic usage-window
+limits (`weekly/monthly limit reached`) as failover-worthy rate limits.
+
+Context-overflow errors are different: signatures such as
+`request_too_large`, `input exceeds the maximum number of tokens`, or
+`input is too long for the model` stay on the compaction/retry path instead
+of advancing model fallback.
+
 </Accordion>

 <Accordion title='What does "No credentials found for profile anthropic:default" mean?'>
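The exponential-backoff cooldown mentioned above might look like the following. The doc only says "exponential backoff", so the base delay, cap, and function name here are invented purely for illustration.

```typescript
// Hypothetical cooldown schedule: doubles per consecutive failure,
// capped so a flaky profile is re-probed within a bounded window.
// baseMs and capMs are assumptions, not OpenClaw's real values.
function cooldownMs(
  consecutiveFailures: number,
  baseMs = 1_000,
  capMs = 300_000,
): number {
  const backoff = baseMs * 2 ** Math.max(0, consecutiveFailures - 1);
  return Math.min(backoff, capMs);
}
```

The cap matters: without it, a profile hit by a long outage could back off so far it never gets probed again.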
@@ -216,7 +216,10 @@ Compaction is **persistent** (unlike session pruning). See [/concepts/session-pr

 In the embedded Pi agent, auto-compaction triggers in two cases:

-1. **Overflow recovery**: the model returns a context overflow error → compact → retry.
+1. **Overflow recovery**: the model returns a context overflow error
+   (`request_too_large`, `context length exceeded`, `input exceeds the maximum
+   number of tokens`, `input is too long for the model`, and similar
+   provider-shaped variants) → compact → retry.
 2. **Threshold maintenance**: after a successful turn, when:

    `contextTokens > contextWindow - reserveTokens`
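The threshold-maintenance check in case 2 is a direct inequality and can be transcribed as code. Only the function name `needsCompaction` is invented; the condition is exactly the formula from the doc.

```typescript
// Threshold maintenance: compact when the live context would eat into
// the reserved headroom (contextTokens > contextWindow - reserveTokens).
function needsCompaction(
  contextTokens: number,
  contextWindow: number,
  reserveTokens: number,
): boolean {
  return contextTokens > contextWindow - reserveTokens;
}
```

For example, with a 200k-token window and 10k reserved, compaction triggers once the context exceeds 190k tokens.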