diff --git a/docs/concepts/model-failover.md b/docs/concepts/model-failover.md
index a743190d848..18cbcb5120b 100644
--- a/docs/concepts/model-failover.md
+++ b/docs/concepts/model-failover.md
@@ -155,6 +155,13 @@ State is stored in `auth-profiles.json` under `usageStats`:
 
 Billing/credit failures (for example “insufficient credits” / “credit balance too low”) are treated as failover‑worthy, but they’re usually not transient. Instead of a short cooldown, OpenClaw marks the profile as **disabled** (with a longer backoff) and rotates to the next profile/provider.
 
+Not every HTTP `402` lands here. OpenClaw classifies temporary `402` usage-window
+and organization/workspace spend-limit errors as `rate_limit` when the message
+looks retryable (for example `weekly usage limit exhausted`, `daily limit
+reached, resets tomorrow`, or `organization spending limit exceeded`). Those
+stay on the short cooldown/failover path instead of the long billing-disable
+path.
+
 State is stored in `auth-profiles.json`:
 
 ```json
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index c3339a380a6..f6c625b3a57 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -3129,7 +3129,10 @@ Notes:
 }
 ```
 
-- `billingBackoffHours`: base backoff in hours when a profile fails due to billing/insufficient credits (default: `5`).
+- `billingBackoffHours`: base backoff in hours when a profile fails due to true
+  billing/insufficient-credit errors (default: `5`). Retryable HTTP `402`
+  usage-window or organization/workspace spend-limit messages stay in the
+  `rate_limit` path instead.
 - `billingBackoffHoursByProvider`: optional per-provider overrides for billing backoff hours.
 - `billingMaxHours`: cap in hours for billing backoff exponential growth (default: `24`).
 - `authPermanentBackoffMinutes`: base backoff in minutes for high-confidence `auth_permanent` failures (default: `10`).
diff --git a/docs/help/faq.md b/docs/help/faq.md
index 53d6eb986b6..32808c93881 100644
--- a/docs/help/faq.md
+++ b/docs/help/faq.md
@@ -2367,6 +2367,11 @@ for usage/billing and raise limits as needed.
     `ThrottlingException`, `resource exhausted`, and periodic usage-window
     limits (`weekly/monthly limit reached`) as failover-worthy rate limits.
 
+    Some HTTP `402` responses also stay in that transient bucket. If the
+    message looks like a retryable usage-window or organization/workspace spend
+    limit (`daily limit reached, resets tomorrow`, `organization spending limit
+    exceeded`), OpenClaw treats it as `rate_limit`, not a long billing disable.
+
     Context-overflow errors are different: signatures such as
     `request_too_large`, `input exceeds the maximum number of tokens`, or
     `input is too long for the model` stay on the compaction/retry path instead
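
Review note: the `402` split documented in this diff can be illustrated with a small classifier. This is only a sketch under stated assumptions, not OpenClaw's actual implementation. The names `classify402`, `FailoverReason`, and `RETRYABLE_402_PATTERNS` are hypothetical, and the regular expressions are derived solely from the example messages quoted in the docs changes (`weekly usage limit exhausted`, `daily limit reached, resets tomorrow`, `organization spending limit exceeded`). The real matching rules may be broader or stricter.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
type FailoverReason = "rate_limit" | "billing";

// Assumed patterns, built only from the example messages in the docs diff.
const RETRYABLE_402_PATTERNS: RegExp[] = [
  // Periodic usage windows, e.g. "weekly usage limit exhausted".
  /\b(daily|weekly|monthly)\s+(usage\s+)?limit\s+(reached|exhausted)\b/i,
  // Organization/workspace spend caps, e.g. "organization spending limit exceeded".
  /\b(organization|workspace)\s+spend(ing)?\s+limit\s+exceeded\b/i,
];

// Classify the message of an HTTP 402 response.
// Retryable-looking messages map to "rate_limit" (short cooldown path);
// everything else maps to "billing" (long billing-disable path).
function classify402(message: string): FailoverReason {
  const retryable = RETRYABLE_402_PATTERNS.some((p) => p.test(message));
  return retryable ? "rate_limit" : "billing";
}

console.log(classify402("daily limit reached, resets tomorrow")); // rate_limit
console.log(classify402("credit balance too low"));               // billing
```

Defaulting unmatched `402` messages to `billing` mirrors the docs' description of the long disable path as the baseline, with only recognizably retryable messages promoted to `rate_limit`.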