Commit Graph

2720 Commits

Author SHA1 Message Date
Peter Steinberger 43f318cd9a fix(agents): reduce billing false positives on long text (#25680)
Land PR #25680 from @lairtonlelis.
Retain explicit status/code/http 402 detection for oversized structured payloads.

Co-authored-by: Ailton <lairton@telnyx.com>
2026-02-25 01:22:17 +00:00
Peter Steinberger bd213cf2ad fix(agents): normalize SiliconFlow Pro thinking=off payload (#25435)
Land PR #25435 from @Zjianru.
Changelog: add 2026.2.24 fix entry with contributor credit.

Co-authored-by: codez <codezhujr@gmail.com>
2026-02-25 01:11:34 +00:00
Peter Steinberger 5c6b2cbc8e refactor: extract iMessage echo cache and unify suppression guards 2026-02-25 00:53:39 +00:00
Peter Steinberger 2a11c09a8d fix: harden iMessage echo dedupe and reasoning suppression (#25897) 2026-02-25 00:46:56 +00:00
Vincent Koc f34325ec01 Tests: cover allowlist refs missing from catalog 2026-02-24 19:16:02 -05:00
Vincent Koc e9068e2571 Agents: trust explicit allowlist refs beyond catalog 2026-02-24 19:16:02 -05:00
Vincent Koc aee38c42d3 Tests: preserve OpenRouter explicit auth order under cooldown fields 2026-02-24 19:12:08 -05:00
Vincent Koc 06f0b4a193 Tests: keep OpenRouter runnable with legacy cooldown markers 2026-02-24 19:12:08 -05:00
Vincent Koc ebc8c4b609 Tests: skip OpenRouter failure cooldown persistence 2026-02-24 19:12:08 -05:00
Vincent Koc 5de04960a0 Tests: cover OpenRouter cooldown display bypass 2026-02-24 19:12:08 -05:00
Vincent Koc f1d5c1a31f Auth: use cooldown helper in explicit profile order 2026-02-24 19:12:08 -05:00
Vincent Koc daa4f34ce8 Auth: bypass cooldown tracking for OpenRouter 2026-02-24 19:12:08 -05:00
Fred White b7deb062ea fix: normalize "bedrock" provider ID to "amazon-bedrock"
Add "bedrock" and "aws-bedrock" as aliases for the canonical
"amazon-bedrock" provider ID in normalizeProviderId().

Without this mapping, configuring a model as "bedrock/..." causes
the auth resolution fallback to miss the Bedrock-specific AWS SDK
path, since the fallback check requires normalized === "amazon-bedrock".
This primarily affects the main agent when the explicit auth override
is not preserved through config merging.

Fixes #15716
2026-02-24 23:57:11 +00:00
Peter Steinberger 53f9b7d4e7 fix(automation): harden announce delivery + cron coding profile (#25813 #25821 #25822)
Co-authored-by: Shawn <shenghuikevin@shenghuideMac-mini.local>
Co-authored-by: 不做了睡大觉 <user@example.com>
Co-authored-by: Marcus Widing <widing.marcus@gmail.com>
2026-02-24 23:49:34 +00:00
Brian Mendonca 48b052322b Security: sanitize inherited host exec env 2026-02-24 23:46:39 +00:00
Peter Steinberger 58309fd8d9 refactor(matrix,tests): extract helpers and inject send-queue timing 2026-02-24 23:37:50 +00:00
Peter Steinberger a2529c25ff test(matrix,discord,sandbox): expand breakage regression coverage 2026-02-24 23:37:50 +00:00
Peter Steinberger 13a1c46396 fix(web-search): reduce provider auto-detect log noise 2026-02-24 23:32:29 +00:00
Peter Steinberger 4355e08262 refactor: harden safe-bin trusted dir diagnostics 2026-02-24 23:29:44 +00:00
Peter Steinberger e7a5f9f4d8 fix(channels,sandbox): land hard breakage cluster from reviewed PR bases
Lands reviewed fixes based on #25839 (@pewallin), #25841 (@joshjhall), and #25737/@25713 (@DennisGoldfinger/@peteragility), with additional hardening + regression tests for queue cleanup and shell script safety.

Fixes #25836
Fixes #25840
Fixes #25824
Fixes #25868

Co-authored-by: Peter Wallin <pwallin@gmail.com>
Co-authored-by: Joshua Hall <josh@yaplabs.com>
Co-authored-by: Dennis Goldfinger <dennisgoldfinger@gmail.com>
Co-authored-by: peteragility <peteragility@users.noreply.github.com>
2026-02-24 23:27:56 +00:00
Peter Steinberger 5552f9073f refactor(sandbox): centralize network mode policy helpers 2026-02-24 23:26:46 +00:00
Peter Steinberger 14b6eea6e3 feat(sandbox): block container namespace joins by default 2026-02-24 23:20:34 +00:00
Peter Steinberger d3da67c7a9 fix(security): lock sandbox tmp media paths to openclaw roots 2026-02-24 23:10:19 +00:00
Peter Steinberger 9ef0fc2ff8 fix(sandbox): block @-prefixed workspace path bypass 2026-02-24 17:23:14 +00:00
Mariano Belinky 4ec0af00fe Agents: fix embedded auth-profile failure helper typing 2026-02-24 15:16:11 +00:00
Peter Steinberger 878b4e0ed7 refactor: unify tools.fs workspaceOnly resolution 2026-02-24 15:14:05 +00:00
Peter Steinberger 13bfe7faa6 refactor(sandbox): share bind parsing and host-path policy checks 2026-02-24 15:04:47 +00:00
Peter Steinberger 370d115549 fix: enforce workspaceOnly for native prompt image autoload 2026-02-24 14:47:59 +00:00
Peter Steinberger 6da03eabe2 fix: add changelog and clean regression comment for tool-result guard (#25429) (thanks @mikaeldiakhate-cell) 2026-02-24 14:42:09 +00:00
Leakim 8db7ca8c02 fix: prevent synthetic toolResult for aborted/errored assistant messages
When an assistant message with toolCalls has stopReason 'aborted' or 'error',
the guard should not add those tool call IDs to the pending map. Creating
synthetic tool results for incomplete/aborted tool calls causes API 400 errors:
'unexpected tool_use_id found in tool_result blocks'

This aligns the WRITE path (session-tool-result-guard.ts) with the READ path
(session-transcript-repair.ts) which already skips aborted messages.

Fixes: orphaned tool_result causing session corruption

Tests added:
- does NOT create synthetic toolResult for aborted assistant messages
- does NOT create synthetic toolResult for errored assistant messages
2026-02-24 14:42:09 +00:00
Elarwei aa2826b5b1 fix(usage): parse Kimi K2 cached_tokens from prompt_tokens_details
Kimi K2 models use automatic prefix caching and return cache stats in
a nested field: usage.prompt_tokens_details.cached_tokens

This fixes issue #7073 where cacheRead was showing 0 for K2.5 users.

Also adds cached_tokens (top-level) for moonshot-v1 explicit caching API.

Closes #7073
2026-02-24 14:40:52 +00:00
Peter Steinberger b5787e4abb fix(sandbox): harden bind validation for symlink missing-leaf paths 2026-02-24 14:37:35 +00:00
Peter Steinberger 39631639b7 fix: add changelog + typed omission test note (#25314) (thanks @lbo728) 2026-02-24 14:22:02 +00:00
lbo728 b863316e7b fix(models): preserve user reasoning override when merging with built-in catalog
When a built-in provider model has reasoning:true (e.g. MiniMax-M2.5) and
the user explicitly sets reasoning:false in their config, mergeProviderModels
unconditionally overwrote the user's value with the built-in catalog value.

The merge code refreshes capability metadata (input, contextWindow, maxTokens,
reasoning) from the implicit catalog. This is correct for fields like
contextWindow and maxTokens — the catalog has authoritative values that
shouldn't be stale. But reasoning is a user preference, not just a
capability descriptor: users may need to disable it to avoid 'Message
ordering conflict' errors with certain models or backends.

Fix: check whether 'reasoning' is present in the explicit (user-supplied)
model entry. If the user has set it (even to false), honour that value.
If the user hasn't set it, fall back to the built-in catalog default.

This allows users to configure tools.models.providers.minimax.models with
reasoning:false for MiniMax-M2.5 without being silently overridden.

Fixes #25244
2026-02-24 14:22:02 +00:00
SidQin-cyber 99d854db82 fix(agents): await block-reply flush before tool execution starts
handleToolExecutionStart() flushed pending block replies and then called
onBlockReplyFlush() as fire-and-forget (`void`). This created a race where
fast tool results (especially media on Telegram) could be delivered before
the text block that preceded the tool call.

Await onBlockReplyFlush() so the block pipeline finishes before tool
execution continues, preserving delivery order.

Fixes #25267

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 14:11:40 +00:00
Peter Steinberger d3ecc234da test: align flaky CI expectations after main changes (#24991) (thanks @stakeswky) 2026-02-24 04:34:49 +00:00
Peter Steinberger d427d09b5e fix: align reasoning payload typing for #24991 (thanks @stakeswky) 2026-02-24 04:34:49 +00:00
User 7d76c241f8 fix: suppress reasoning payloads from generic channel dispatch path
When reasoningLevel is 'on', reasoning content was being sent as a
visible message to WhatsApp and other non-Telegram channels via two
paths:
1. Block reply: emitted via onBlockReply in handleMessageEnd
2. Final payloads: added to replyItems in buildEmbeddedRunPayloads

Telegram has its own dispatch path (bot-message-dispatch.ts) that
splits reasoning into a dedicated lane and handles suppression.
The generic dispatch-from-config.ts path used by WhatsApp, web, etc.
had no such filtering.

Fix:
- Add isReasoning?: boolean flag to ReplyPayload
- Tag reasoning payloads at both emission points
- Filter isReasoning payloads in dispatch-from-config.ts for both
  block reply and final reply paths

Telegram is unaffected: it uses its own deliver callback that detects
reasoning via the 'Reasoning:\n' prefix and routes to a separate lane.

Fixes #24954
2026-02-24 04:34:49 +00:00
zerone0x bc52d4a459 fix(openrouter): skip reasoning effort injection for 'auto' routing model
The 'auto' model on OpenRouter dynamically routes to any underlying model
OpenRouter selects, including reasoning-required endpoints. Previously,
OpenClaw would unconditionally inject `reasoning.effort: "none"` into
every request when the thinking level was "off", which causes a 400 error
on models where reasoning is mandatory and cannot be disabled.

Root cause:
- openrouter/auto has reasoning: false in the built-in catalog
- With thinking level "off", createOpenRouterWrapper injects
  `reasoning: { effort: "none" }` via mapThinkingLevelToOpenRouterReasoningEffort
- For any OpenRouter-routed model that requires reasoning this results in:
  "400 Reasoning is mandatory for this endpoint and cannot be disabled"
- The reasoning: false is then persisted back to models.json on every
  ensureOpenClawModelsJson call, so manually removing it has no lasting effect

Fix:
- In applyExtraParamsToAgent, when provider is "openrouter" and the model
  id is "auto", pass undefined as thinkingLevel to createOpenRouterWrapper
  so no reasoning.effort is injected at all, letting OpenRouter's upstream
  model handle it natively
- Add an explanatory comment in buildOpenrouterProvider clarifying that the
  reasoning: false catalog value does NOT cause effort injection for "auto"

Users who need explicit reasoning control should target a specific model
id (e.g. openrouter/deepseek/deepseek-r1) rather than the auto router.

Fixes #24851

(cherry picked from commit aa55439798)
2026-02-24 04:33:51 +00:00
Ben Marvell eae13d9367 test(agents): update test to match universal tool-result repair for OpenAI
The previous test asserted that OpenAI-responses sessions would NOT get
synthetic tool results for orphaned tool calls. With repairToolUseResultPairing
now running universally, the correct behavior is that orphaned tool calls
get a synthetic tool_result — matching what OpenAI actually requires.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 2edb0ffe0b)
2026-02-24 04:33:51 +00:00
Ben Marvell 252079f001 fix(agents): repair orphaned tool results for OpenAI after history truncation
repairToolUseResultPairing was gated behind !isOpenAi, skipping orphaned
tool_result cleanup for OpenAI providers. When limitHistoryTurns truncated
conversation history, tool_result messages whose matching tool_call was
before the truncation point survived and were sent as function_call_output
items with stale call_id references. OpenAI rejects these with:
"No tool call found for function call output with call_id ..."

Enable the repair universally — all providers need it after truncation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 97b065aa6e)
2026-02-24 04:33:50 +00:00
Marc Gratch 75969ed5c4 fix(plugins): pass session context to before_compaction hook in subscribe handler
The handleAutoCompactionStart handler was calling runBeforeCompaction with
only messageCount and an empty hook context. Plugins receiving this hook
could not identify the session or snapshot the transcript during
auto-compaction.

The other call site in compact.ts already passes the full payload
(messages, sessionFile, sessionKey). This aligns the subscribe handler
to do the same using ctx.params.session and ctx.params.sessionKey.

(cherry picked from commit 318a19d1a1)
2026-02-24 04:33:50 +00:00
JackyWay 792bd6195c fix: recognize Bedrock as Anthropic-compatible in transcript policy
(cherry picked from commit 3b5154081c)
2026-02-24 04:33:50 +00:00
github-actions[bot] 3823587ada fix(agents): allow empty edit replacement text
(cherry picked from commit 3c21fc30d3)
2026-02-24 04:33:50 +00:00
Mitch McAlister 5710d72527 feat(agents): configurable default runTimeoutSeconds for subagent spawns
When sessions_spawn is called without runTimeoutSeconds, subagents
previously defaulted to 0 (no timeout). This adds a config key at
agents.defaults.subagents.runTimeoutSeconds so operators can set a
global default timeout for all subagent runs.

The agent-provided value still takes precedence when explicitly passed.
When neither the agent nor the config specifies a timeout, behavior is
unchanged (0 = no timeout), preserving backwards compatibility.

Updated for the subagent-spawn.ts refactor (logic moved from
sessions-spawn-tool.ts to spawnSubagentDirect).

Closes #19288

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 04:22:43 +00:00
zerone0x ac6cec7677 fix(providers): strip trailing /v1 from Anthropic baseUrl to prevent double-path
The pi-ai Anthropic provider constructs the full API endpoint as
`${baseUrl}/v1/messages`. If a user configures
`models.providers.anthropic.baseUrl` with a trailing `/v1`
(e.g. "https://api.anthropic.com/v1"), the resolved URL becomes
"https://api.anthropic.com/v1/v1/messages" which the Anthropic API
rejects with a 404 / connection failure.

This regression appeared in v2026.2.22 when @mariozechner/pi-ai bumped
from 0.54.0 to 0.54.1, which started appending the /v1 segment where
the previous version did not.

Fix: in normalizeModelCompat(), detect anthropic-messages models and
strip a single trailing /v1 (with optional trailing slash) from the
configured baseUrl before it is handed to pi-ai. Models with baseUrls
that do not end in /v1 are unaffected. Non-anthropic-messages models
are not touched.

Adds 6 unit tests covering the normalisation scenarios.

Fixes #24709

(cherry picked from commit 4c4857fdcb)
2026-02-24 04:20:30 +00:00
User c7bf0dacb8 chore: remove unused isMinimal param from buildSkillsSection
Address review feedback: isMinimal is no longer referenced after the
early-return guard was removed in the parent commit.

(cherry picked from commit 2efe04d301)
2026-02-24 04:20:30 +00:00
User 2398b51378 fix: include available_skills in isolated cron agentTurn sessions (closes #24888)
buildSkillsSection() had an early-return guard on isMinimal that silently
dropped the entire <available_skills> block for any session using
promptMode="minimal" — which includes all isolated cron agentTurn sessions
(isCronSessionKey → promptMode="minimal" in attempt.ts:497-500).

Fix: remove the isMinimal guard from buildSkillsSection so that skills are
emitted whenever a non-empty skillsPrompt is provided, regardless of mode.
Memory, docs, reply-tags, and other verbose sections remain gated on isMinimal.

Tests added:
- "includes skills in minimal prompt mode when skillsPrompt is provided (cron regression)"
- "omits skills in minimal prompt mode when skillsPrompt is absent"
- Updated existing minimal-mode test expectation to match corrected behaviour.

(cherry picked from commit 66af86e7ee)
2026-02-24 04:20:30 +00:00
SidQin-cyber f3459d71e8 fix(exec): treat shell exit codes 126/127 as failures instead of completed
When a command exits with code 127 (command not found) or 126 (not
executable), the exec tool previously returned status "completed" with
the error buried in the output text. This caused cron jobs to report
status "ok" and never increment consecutiveErrors, silently swallowing
failures like `python: command not found` across multiple daily cycles.

Now these shell-reserved exit codes are classified as "failed", which
propagates through the cron pipeline to properly increment
consecutiveErrors and surface the issue for operator attention.

Fixes #24587

Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit 2b1d1985ef)
2026-02-24 04:20:30 +00:00
HeMuling 3c13f4c2b4 test(subagents): mock sessions store in steer-restart coverage 2026-02-24 04:17:56 +00:00