docs: refresh provider stream family refs

Peter Steinberger 2026-04-04 12:21:30 +01:00
parent 9615488855
commit 42778ccd46
3 changed files with 32 additions and 11 deletions

View File

@@ -741,12 +741,14 @@ api.registerProvider({
 synthetic Codex catalog rows, and ChatGPT usage endpoint integration.
 - Google AI Studio and Gemini CLI OAuth use `resolveDynamicModel`,
 `buildReplayPolicy`, `sanitizeReplayHistory`,
-`resolveReasoningOutputMode`, and `isModernModelRef` because the
+`resolveReasoningOutputMode`, `wrapStreamFn`, and `isModernModelRef` because the
 `google-gemini` replay family owns Gemini 3.1 forward-compat fallback,
 native Gemini replay validation, bootstrap replay sanitation, tagged
-reasoning-output mode, and modern-model matching; Gemini CLI OAuth also uses
-`formatApiKey`, `resolveUsageAuth`, and `fetchUsageSnapshot` for token
-formatting, token parsing, and quota endpoint wiring.
+reasoning-output mode, and modern-model matching, while the
+`google-thinking` stream family owns Gemini thinking payload normalization;
+Gemini CLI OAuth also uses `formatApiKey`, `resolveUsageAuth`, and
+`fetchUsageSnapshot` for token formatting, token parsing, and quota endpoint
+wiring.
 - Anthropic Vertex uses `buildReplayPolicy` through the
 `anthropic-by-model` replay family so Claude-specific replay cleanup stays
 scoped to Claude ids instead of every `anthropic-messages` transport.
@@ -764,9 +766,12 @@ api.registerProvider({
 `hybrid-anthropic-openai` replay family because one provider owns both
 Anthropic-message and OpenAI-compatible semantics; it keeps Claude-only
 thinking-block dropping on the Anthropic side while overriding reasoning
-output mode back to native.
+output mode back to native, and the `minimax-fast-mode` stream family owns
+fast-mode model rewrites on the shared stream path.
 - Moonshot uses `catalog` plus `wrapStreamFn` because it still uses the shared
-OpenAI transport but needs provider-owned thinking payload normalization.
+OpenAI transport but needs provider-owned thinking payload normalization; the
+`moonshot-thinking` stream family maps config plus `/think` state onto its
+native binary thinking payload.
 - Kilocode uses `catalog`, `capabilities`, `wrapStreamFn`, and
 `isCacheTtlEligible` because it needs provider-owned request headers,
 reasoning payload normalization, Gemini transcript hints, and Anthropic
@@ -775,7 +780,8 @@ api.registerProvider({
 `isCacheTtlEligible`, `isBinaryThinking`, `isModernModelRef`,
 `resolveUsageAuth`, and `fetchUsageSnapshot` because it owns GLM-5 fallback,
 `tool_stream` defaults, binary thinking UX, modern-model matching, and both
-usage auth + quota fetching.
+usage auth + quota fetching; the `tool-stream-default-on` stream family keeps
+the default-on `tool_stream` wrapper out of per-provider handwritten glue.
 - Mistral, OpenCode Zen, and OpenCode Go use `capabilities` only to keep
 transcript/tooling quirks out of core.
 - Catalog-only bundled providers such as `byteplus`, `cloudflare-ai-gateway`,
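The default-on behaviour described for Z.AI's `tool_stream` can be pictured as a tiny policy function: streaming stays on unless the provider config explicitly opts out. The following is a hypothetical self-contained sketch of that idea; the type and function names are illustrative, not openclaw's actual API.

```typescript
// Hypothetical sketch of a default-on policy like the one the
// `tool-stream-default-on` stream family encapsulates.
interface ProviderStreamConfig {
  // undefined means "no explicit choice": the family default applies.
  toolStream?: boolean;
}

// Default-on: tool streaming stays enabled unless the config
// explicitly sets `toolStream: false`.
function resolveToolStreamDefaultOn(config: ProviderStreamConfig): boolean {
  return config.toolStream !== false;
}
```

Centralizing this in a stream family means each provider picks the policy by id instead of re-implementing the `!== false` check in handwritten glue.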

View File

@@ -83,7 +83,7 @@ subpaths is in `scripts/lib/plugin-sdk-entrypoints.json`.
 | `plugin-sdk/provider-catalog-shared` | `findCatalogTemplate`, `buildSingleProviderApiKeyCatalog`, `supportsNativeStreamingUsageCompat`, `applyProviderNativeStreamingUsageCompat` |
 | `plugin-sdk/provider-tools` | `buildProviderToolCompatFamilyHooks`, Gemini schema helpers |
 | `plugin-sdk/provider-usage` | `fetchClaudeUsage` and similar |
-| `plugin-sdk/provider-stream` | Stream wrapper types + provider stream wrappers |
+| `plugin-sdk/provider-stream` | `buildProviderStreamFamilyHooks`, stream wrapper types, provider stream wrappers |
 | `plugin-sdk/provider-onboard` | Onboarding config patch helpers |
 | `plugin-sdk/global-singleton` | Process-local singleton/map/cache helpers |
 </Accordion>
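Under the assumption that `buildProviderStreamFamilyHooks` resolves a family id to a prebuilt stream wrapper, a minimal self-contained sketch of the shape might look like the following. The registry contents and wrapper logic are stand-ins; only the builder name and the `wrapStreamFn`/`ctx.streamFn` hook shape come from the docs.

```typescript
// Hypothetical sketch, not openclaw's real internals: a stream-family
// registry mapping a family id to a stream wrapper, so a provider opts in
// with one spread instead of handwritten glue.
type StreamFn = (prompt: string) => string;
type StreamWrapper = (inner: StreamFn) => StreamFn;

const STREAM_FAMILIES: Record<string, StreamWrapper> = {
  // Illustrative stand-in for Gemini thinking payload normalization.
  "google-thinking": (inner) => (prompt) =>
    inner(prompt).replace("<raw-thinking>", "<normalized-thinking>"),
  // Illustrative stand-in for the MiniMax fast-mode model rewrite.
  "minimax-fast-mode": (inner) => (prompt) => inner(prompt),
};

function buildProviderStreamFamilyHooks(family: string) {
  const wrapper = STREAM_FAMILIES[family];
  if (!wrapper) throw new Error(`unknown stream family: ${family}`);
  // Mirrors the hook shape in the docs: wrapStreamFn over ctx.streamFn.
  return {
    wrapStreamFn: (ctx: { streamFn: StreamFn }) => wrapper(ctx.streamFn),
  };
}
```

Keeping the wrappers behind ids is what lets the docs list families (`google-thinking`, `moonshot-thinking`, and so on) instead of per-provider wrapper code.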

View File

@@ -255,13 +255,12 @@ API key auth, and dynamic model resolution.
 ```typescript
 import { buildProviderReplayFamilyHooks } from "openclaw/plugin-sdk/provider-model-shared";
+import { buildProviderStreamFamilyHooks } from "openclaw/plugin-sdk/provider-stream";
 import { buildProviderToolCompatFamilyHooks } from "openclaw/plugin-sdk/provider-tools";
-import { createGoogleThinkingPayloadWrapper } from "openclaw/plugin-sdk/provider-stream";
 const GOOGLE_FAMILY_HOOKS = {
 ...buildProviderReplayFamilyHooks({ family: "google-gemini" }),
-wrapStreamFn: (ctx) =>
-createGoogleThinkingPayloadWrapper(ctx.streamFn, ctx.thinkingLevel),
+...buildProviderStreamFamilyHooks("google-thinking"),
 ...buildProviderToolCompatFamilyHooks("gemini"),
 };
@@ -290,6 +289,22 @@ API key auth, and dynamic model resolution.
 - `minimax`: `hybrid-anthropic-openai`
 - `moonshot`, `ollama`, `xai`, and `zai`: `openai-compatible`
+Available stream families today:
+| Family | What it wires in |
+| --- | --- |
+| `google-thinking` | Gemini thinking payload normalization on the shared stream path |
+| `moonshot-thinking` | Moonshot binary native-thinking payload mapping from config + `/think` level |
+| `minimax-fast-mode` | MiniMax fast-mode model rewrite on the shared stream path |
+| `tool-stream-default-on` | Default-on `tool_stream` wrapper for providers like Z.AI that want tool streaming unless explicitly disabled |
+Real bundled examples:
+- `google` and `google-gemini-cli`: `google-thinking`
+- `moonshot`: `moonshot-thinking`
+- `minimax` and `minimax-portal`: `minimax-fast-mode`
+- `zai`: `tool-stream-default-on`
 <Tabs>
 <Tab title="Token exchange">
 For providers that need a token exchange before each inference call: