From 42778ccd46bf727c07a236d9b36bacdb6ce54b74 Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Sat, 4 Apr 2026 12:21:30 +0100
Subject: [PATCH] docs: refresh provider stream family refs

---
 docs/plugins/architecture.md         | 20 +++++++++++++-------
 docs/plugins/sdk-overview.md         |  2 +-
 docs/plugins/sdk-provider-plugins.md | 21 ++++++++++++++++++---
 3 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/docs/plugins/architecture.md b/docs/plugins/architecture.md
index e26ff569e28..ceea1b6ae98 100644
--- a/docs/plugins/architecture.md
+++ b/docs/plugins/architecture.md
@@ -741,12 +741,14 @@ api.registerProvider({
   synthetic Codex catalog rows, and ChatGPT usage endpoint integration.
 - Google AI Studio and Gemini CLI OAuth use `resolveDynamicModel`,
   `buildReplayPolicy`, `sanitizeReplayHistory`,
-  `resolveReasoningOutputMode`, and `isModernModelRef` because the
+  `resolveReasoningOutputMode`, `wrapStreamFn`, and `isModernModelRef` because the
   `google-gemini` replay family owns Gemini 3.1 forward-compat fallback,
   native Gemini replay validation, bootstrap replay sanitation, tagged
-  reasoning-output mode, and modern-model matching; Gemini CLI OAuth also uses
-  `formatApiKey`, `resolveUsageAuth`, and `fetchUsageSnapshot` for token
-  formatting, token parsing, and quota endpoint wiring.
+  reasoning-output mode, and modern-model matching, while the
+  `google-thinking` stream family owns Gemini thinking payload normalization;
+  Gemini CLI OAuth also uses `formatApiKey`, `resolveUsageAuth`, and
+  `fetchUsageSnapshot` for token formatting, token parsing, and quota endpoint
+  wiring.
 - Anthropic Vertex uses `buildReplayPolicy` through the `anthropic-by-model`
   replay family so Claude-specific replay cleanup stays scoped to Claude ids
   instead of every `anthropic-messages` transport.
@@ -764,9 +766,12 @@ api.registerProvider({
   `hybrid-anthropic-openai` replay family because one provider owns both
   Anthropic-message and OpenAI-compatible semantics; it keeps Claude-only
   thinking-block dropping on the Anthropic side while overriding reasoning
-  output mode back to native.
+  output mode back to native, and the `minimax-fast-mode` stream family owns
+  fast-mode model rewrites on the shared stream path.
 - Moonshot uses `catalog` plus `wrapStreamFn` because it still uses the shared
-  OpenAI transport but needs provider-owned thinking payload normalization.
+  OpenAI transport but needs provider-owned thinking payload normalization; the
+  `moonshot-thinking` stream family maps config plus `/think` state onto its
+  native binary thinking payload.
 - Kilocode uses `catalog`, `capabilities`, `wrapStreamFn`, and
   `isCacheTtlEligible` because it needs provider-owned request headers,
   reasoning payload normalization, Gemini transcript hints, and Anthropic
@@ -775,7 +780,8 @@
   `isCacheTtlEligible`, `isBinaryThinking`, `isModernModelRef`,
   `resolveUsageAuth`, and `fetchUsageSnapshot` because it owns GLM-5 fallback,
   `tool_stream` defaults, binary thinking UX, modern-model matching, and both
-  usage auth + quota fetching.
+  usage auth + quota fetching; the `tool-stream-default-on` stream family keeps
+  the default-on `tool_stream` wrapper out of per-provider handwritten glue.
 - Mistral, OpenCode Zen, and OpenCode Go use `capabilities` only to keep
   transcript/tooling quirks out of core.
 - Catalog-only bundled providers such as `byteplus`, `cloudflare-ai-gateway`,
diff --git a/docs/plugins/sdk-overview.md b/docs/plugins/sdk-overview.md
index a9ced1155a5..11972b47820 100644
--- a/docs/plugins/sdk-overview.md
+++ b/docs/plugins/sdk-overview.md
@@ -83,7 +83,7 @@ subpaths is in `scripts/lib/plugin-sdk-entrypoints.json`.
 | `plugin-sdk/provider-catalog-shared` | `findCatalogTemplate`, `buildSingleProviderApiKeyCatalog`, `supportsNativeStreamingUsageCompat`, `applyProviderNativeStreamingUsageCompat` |
 | `plugin-sdk/provider-tools` | `buildProviderToolCompatFamilyHooks`, Gemini schema helpers |
 | `plugin-sdk/provider-usage` | `fetchClaudeUsage` and similar |
-| `plugin-sdk/provider-stream` | Stream wrapper types + provider stream wrappers |
+| `plugin-sdk/provider-stream` | `buildProviderStreamFamilyHooks`, stream wrapper types, provider stream wrappers |
 | `plugin-sdk/provider-onboard` | Onboarding config patch helpers |
 | `plugin-sdk/global-singleton` | Process-local singleton/map/cache helpers |
 
diff --git a/docs/plugins/sdk-provider-plugins.md b/docs/plugins/sdk-provider-plugins.md
index e7fa9188262..ec5c47fbb29 100644
--- a/docs/plugins/sdk-provider-plugins.md
+++ b/docs/plugins/sdk-provider-plugins.md
@@ -255,13 +255,12 @@ API key auth, and dynamic model resolution.
 
 ```typescript
 import { buildProviderReplayFamilyHooks } from "openclaw/plugin-sdk/provider-model-shared";
+import { buildProviderStreamFamilyHooks } from "openclaw/plugin-sdk/provider-stream";
 import { buildProviderToolCompatFamilyHooks } from "openclaw/plugin-sdk/provider-tools";
-import { createGoogleThinkingPayloadWrapper } from "openclaw/plugin-sdk/provider-stream";
 
 const GOOGLE_FAMILY_HOOKS = {
   ...buildProviderReplayFamilyHooks({ family: "google-gemini" }),
-  wrapStreamFn: (ctx) =>
-    createGoogleThinkingPayloadWrapper(ctx.streamFn, ctx.thinkingLevel),
+  ...buildProviderStreamFamilyHooks("google-thinking"),
   ...buildProviderToolCompatFamilyHooks("gemini"),
 };
 
@@ -290,6 +289,22 @@
 - `minimax`: `hybrid-anthropic-openai`
 - `moonshot`, `ollama`, `xai`, and `zai`: `openai-compatible`
 
+Available stream families today:
+
+| Family | What it wires in |
+| --- | --- |
+| `google-thinking` | Gemini thinking payload normalization on the shared stream path |
+| `moonshot-thinking` | Moonshot binary native-thinking payload mapping from config + `/think` level |
+| `minimax-fast-mode` | MiniMax fast-mode model rewrite on the shared stream path |
+| `tool-stream-default-on` | Default-on `tool_stream` wrapper for providers like Z.AI that want tool streaming unless explicitly disabled |
+
+Real bundled examples:
+
+- `google` and `google-gemini-cli`: `google-thinking`
+- `moonshot`: `moonshot-thinking`
+- `minimax` and `minimax-portal`: `minimax-fast-mode`
+- `zai`: `tool-stream-default-on`
+
 For providers that need a token exchange before each inference call: