Commit Graph

7 Commits

ShengtongZhu e55c4c4044 fix(guardian): resolve well-known provider baseUrl from pi-ai model database
When a provider (e.g. anthropic, openai) is not explicitly configured in
openclaw.json, fall back to pi-ai's built-in model database to resolve
baseUrl and api type. This avoids requiring users to manually configure
well-known providers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 19:33:14 +08:00
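A minimal sketch of the fallback described in this commit (and spelled out as the 3-layer chain in ba28dbc016): explicit `openclaw.json` config wins, then `models.json`, then pi-ai's built-in model database. All names here (`ProviderInfo`, `BUILTIN_MODEL_DB`, `resolveProvider`) are illustrative, not the plugin's real API.

```typescript
interface ProviderInfo {
  baseUrl: string;
  api: "openai" | "anthropic" | "google";
}

// Layer 3: stand-in for pi-ai's built-in model database of well-known providers.
const BUILTIN_MODEL_DB: Record<string, ProviderInfo> = {
  anthropic: { baseUrl: "https://api.anthropic.com", api: "anthropic" },
  openai: { baseUrl: "https://api.openai.com/v1", api: "openai" },
};

function resolveProvider(
  provider: string,
  explicitConfig: Record<string, ProviderInfo>, // layer 1: openclaw.json
  modelsJson: Record<string, ProviderInfo>,     // layer 2: models.json
): ProviderInfo | undefined {
  // First match wins; unconfigured well-known providers fall through
  // to the built-in database, so users need not configure them by hand.
  return explicitConfig[provider] ?? modelsJson[provider] ?? BUILTIN_MODEL_DB[provider];
}
```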
ShengtongZhu 474a41a3ee fix(guardian): use openclaw/plugin-sdk/core instead of monolithic import
Bundled plugins must use scoped plugin-sdk imports (e.g. /core, /compat)
instead of the monolithic openclaw/plugin-sdk entry point.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 14:11:46 +08:00
ShengtongZhu 9fbbc97e9a fix(guardian): use runtime.modelAuth instead of runtime.models
Align with main's PluginRuntime interface: use `modelAuth` (not `models`)
for API key resolution. Remove dependency on `resolveProviderInfo` (not
available on main) — provider info is now resolved from config at
registration time via `resolveModelFromConfig`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 14:06:59 +08:00
ShengtongZhu 2e2eed339a refactor(guardian): replace async instruction extraction with full system prompt caching
Remove the LLM-based standingInstructions and availableSkills extraction
pipeline. Instead, cache the main agent's full system prompt on the first
llm_input and pass it as-is to the guardian as "Agent context".

This eliminates two async LLM calls per session, simplifies the codebase
(~340 lines removed), and gives the guardian MORE context (the complete
system prompt including tool definitions, memory, and skills) rather than
a lossy LLM-extracted summary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 12:33:28 +08:00
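The caching this commit introduces can be sketched as follows, assuming a hook-style `llm_input` event carrying a messages array; the event shape and function names are assumptions, not the plugin's real interface.

```typescript
interface LlmInputEvent {
  messages: { role: string; content: string }[];
}

let cachedSystemPrompt: string | undefined;

function onLlmInput(event: LlmInputEvent): void {
  // Capture once: the first system prompt observed becomes the guardian's
  // "Agent context" for the rest of the session. No LLM extraction pass.
  if (cachedSystemPrompt === undefined) {
    const sys = event.messages.find((m) => m.role === "system");
    if (sys) cachedSystemPrompt = sys.content;
  }
}

function guardianAgentContext(): string {
  return cachedSystemPrompt ?? "(system prompt not yet observed)";
}
```

Passing the prompt verbatim trades token count for fidelity: the guardian sees tool definitions, memory, and skills exactly as the main agent does, instead of a lossy summary.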
ShengtongZhu 6a3220b0c6 feat(guardian): enhance context awareness and add conversation summarization
- Add rolling conversation summary generation to provide long-term context without token waste
- Extract standing instructions and available skills from system prompt for better decision context
- Support thinking block extraction for reasoning model responses (e.g. kimi-coding)
- Add config options for context tools, recent turns, and tool result length
- Implement lazy context extraction with live message array reference
- Skip guardian review for system triggers (heartbeat, cron)
- Improve error handling for abort race conditions and timeout scenarios
- Normalize headers in model-auth to handle secret inputs consistently
- Update documentation with comprehensive usage guide and security model
2026-03-15 12:32:47 +08:00
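The "skip guardian review for system triggers" item above amounts to a guard like the following sketch; machine-initiated triggers carry no user intent to align with, so review is pointless. The trigger names are taken from the commit message, the rest is illustrative.

```typescript
// Triggers that originate from the system rather than the user.
const SYSTEM_TRIGGERS = new Set(["heartbeat", "cron"]);

function shouldReview(trigger: string): boolean {
  // Only user-initiated activity gets intent-alignment review.
  return !SYSTEM_TRIGGERS.has(trigger);
}
```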
Albert 1c6b5d7b72 refactor(guardian): use pi-ai completeSimple, improve prompt and logging
- Replace 3 raw fetch() API call functions (OpenAI, Anthropic, Google)
  with a single pi-ai completeSimple() call, ensuring consistent HTTP
  behavior (User-Agent, auth, retry) with the main model
- Remove authMode field — pi-ai auto-detects OAuth from API key prefix
- Rewrite system prompt for strict single-line output format, add
  "Do NOT change your mind" and "Do NOT output reasoning" constraints
- Move decision guidelines to system prompt, add multi-step workflow
  awareness (intermediate read steps should be ALLOWed)
- Simplify user prompt — remove inline examples and criteria
- Use forward scanning in parseGuardianResponse for security (model's
  verdict appears first, attacker-injected text appears after)
- Add prominent BLOCK logging via logger.error with full conversation
  context dump (████ banner, all turns, tool arguments)
- Remove 800-char assistant message truncation limit
- Increase default max_user_messages from 3 to 10

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 12:32:34 +08:00
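The forward-scanning change can be sketched like this: take the FIRST verdict token in the reply. Since the system prompt forces the model to emit its verdict first, any attacker-injected "ALLOW" appended later in the text cannot override an earlier BLOCK. The exact output format and token names are assumptions.

```typescript
type Verdict = "ALLOW" | "BLOCK";

function parseGuardianResponse(text: string): Verdict | undefined {
  // Scan forward and stop at the first verdict token; trailing
  // (potentially attacker-controlled) text is ignored.
  const match = text.match(/\b(ALLOW|BLOCK)\b/);
  return match ? (match[1] as Verdict) : undefined;
}
```

A backward scan would have the opposite property: injected text at the end of the response would win, which is exactly the failure mode this change closes.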
Albert ba28dbc016 feat(guardian): add LLM-based intent-alignment guardian plugin
Guardian intercepts tool calls via before_tool_call hook and sends them
to a separate LLM for review — blocks actions the user never requested,
defending against prompt injection attacks.

Key design decisions:
- Conversation turns (user + assistant pairs) give guardian context to
  understand confirmations like "yes" / "go ahead"
- Assistant replies are explicitly marked as untrusted in the prompt to
  prevent poisoning attacks from propagating
- Provider resolution uses SDK (not hardcoded list) with 3-layer
  fallback: explicit config → models.json → pi-ai built-in database
- Lazy resolution pattern for async provider/auth lookup in sync register()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 12:32:34 +08:00
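The lazy-resolution pattern named in the last design decision can be sketched as a memoized async thunk: `register()` stays synchronous and only constructs the thunk, while the async provider/auth lookup runs (once) when first awaited. The helper and lookup body below are illustrative, not the plugin's real code.

```typescript
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  // First call starts the async work; later calls reuse the same promise,
  // so the lookup runs at most once even under concurrent callers.
  return () => (cached ??= load());
}

// Sync register() would only create the thunk, never await it.
const getProviderInfo = lazy(async () => {
  // …the real async lookup (explicit config → models.json → pi-ai
  // built-in database) would go here; this return value is a stand-in.
  return { baseUrl: "https://api.example.com", apiKey: "sk-test" };
});
```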