Commit Graph

2647 Commits

Author SHA1 Message Date
Peter Steinberger e578521ef4 fix(security): harden session export image data-url handling 2026-02-24 02:53:39 +00:00
Gustavo Madeira Santana 4663d68384 Tests: make model-catalog fixtures type-valid 2026-02-23 21:36:34 -05:00
Peter Steinberger ce02ad9643 refactor(agents): centralize sandbox media and fs policy helpers 2026-02-24 02:32:01 +00:00
Peter Steinberger 6634030be3 fix: enforce apply_patch workspaceOnly in sandbox mounts 2026-02-24 02:23:56 +00:00
Peter Steinberger c070be1bc4 fix(sandbox): harden fs bridge path checks and bind mount policy 2026-02-24 02:21:43 +00:00
Peter Steinberger dd9d9c1c60 fix(security): enforce workspaceOnly for sandbox image tool 2026-02-24 02:17:55 +00:00
Gustavo Madeira Santana 5239b55c0a
Config: expand Kilo catalog and persist selected Kilo models (#24921)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: f5a7e1a385
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-23 21:17:37 -05:00
Peter Steinberger a1c4bf07c6 fix(security): harden exec wrapper allowlist execution parity 2026-02-24 01:52:17 +00:00
Peter Steinberger 5eb72ab769 fix(security): harden browser SSRF defaults and migrate legacy key 2026-02-24 01:52:01 +00:00
Peter Steinberger 8779b523dc test(sandbox): speed up agent-config coverage with pure resolvers 2026-02-24 01:46:12 +00:00
Peter Steinberger 467666adc7 test(sandbox): use focused modules in lightweight suites 2026-02-24 01:46:12 +00:00
Peter Steinberger e5931554bf test: tighten slow test timeouts and cleanup 2026-02-24 01:16:53 +00:00
Peter Steinberger 6c43d0a08e test(gateway): move sessions_send error paths to unit tests 2026-02-24 01:16:53 +00:00
Peter Steinberger 12cc754332 fix(acp): harden permission auto-approval policy 2026-02-24 01:03:30 +00:00
Vincent Koc 30c622554f
Providers: disable developer role for DashScope-compatible endpoints (#24675)
* Agents: disable developer role for DashScope-compatible endpoints

* Agents: test DashScope developer-role compatibility

* Gateway: test allowlisted sessions.patch model selection

* Changelog: add DashScope role-compat fix note
2026-02-23 19:51:16 -05:00
Peter Steinberger 8dfa33d373 test(sandbox): add root bind mount regression 2026-02-24 00:17:21 +00:00
Peter Steinberger e6484cb65f refactor: harden kilocode auth ordering and dedupe provider wiring 2026-02-23 23:37:13 +00:00
John Fawcett 13f32e2f7d
feat: Add Kilo Gateway provider (#20212)
* feat: Add Kilo Gateway provider

Add support for Kilo Gateway as a model provider, similar to OpenRouter.
Kilo Gateway provides a unified API that routes requests to many models
behind a single endpoint and API key.

Changes:
- Add kilocode provider option to auth-choice and onboarding flows
- Add KILOCODE_API_KEY environment variable support
- Add kilocode/ model prefix handling in model-auth and extra-params
- Add provider documentation in docs/providers/kilocode.md
- Update model-providers.md with Kilo Gateway section
- Add design doc for the integration

* kilocode: add provider tests and normalize onboard auth-choice registration

* kilocode: register in resolveImplicitProviders so models appear in provider filter

* kilocode: update base URL from /api/openrouter/ to /api/gateway/

* docs: fix formatting in kilocode docs

* fix: address PR review — remove kilocode from cacheRetention, fix stale model refs and CLI name in docs, fix TS2742

* docs: fix stale refs in design doc — Moltbot to OpenClaw, MoltbotConfig to OpenClawConfig, remove extra-params section, fix doc path

* fix: use resolveAgentModelPrimaryValue for AgentModelConfig union type

---------

Co-authored-by: Mark IJbema <mark@kilocode.ai>
2026-02-23 23:29:27 +00:00
Peter Steinberger c248c515a3 test: collapse sandbox agent config duplicate cases 2026-02-23 22:01:32 +00:00
Peter Steinberger 287586206c test: consolidate sandbox docker merge scenarios 2026-02-23 22:01:22 +00:00
Peter Steinberger 0ae7f470a2 test: normalize skill prompt path assertions on windows 2026-02-23 21:17:29 +00:00
Peter Steinberger 426f803b8a test: speed up sessions_spawn tool harness 2026-02-23 21:13:05 +00:00
Peter Steinberger 7e5f771d27 test: speed up skills test suites 2026-02-23 21:02:13 +00:00
Peter Steinberger 75423a00d6 refactor: deduplicate shared helpers and test setup 2026-02-23 20:40:44 +00:00
Peter Steinberger 1f5e6444ee test: remove redundant pi embedded runner cases 2026-02-23 20:15:56 +00:00
Peter Steinberger 63e4dfaa9c test: consolidate pi-tools gating assertions 2026-02-23 20:00:11 +00:00
Peter Steinberger 87603b5c45 fix: sync built-in channel enablement across config paths 2026-02-23 19:40:42 +00:00
Peter Steinberger fe62711342 test(gate): stabilize env- and timing-sensitive process/web-search checks 2026-02-23 19:19:58 +00:00
Peter Steinberger cf38339f25 fix(tools): improve session_status cache-aware usage reporting
Co-authored-by: Lucian Feraru <1ucian@users.noreply.github.com>
2026-02-23 19:19:45 +00:00
Peter Steinberger 40db3fef49 fix(agents): cache bootstrap snapshots per session key
Co-authored-by: Isis Anisoptera <github@lotuswind.net>
2026-02-23 19:19:45 +00:00
Peter Steinberger 47723b646d refactor(test): de-duplicate msteams and bash test helpers 2026-02-23 19:12:27 +00:00
Peter Steinberger 160bd61fff feat(agents): add per-agent stream params overrides for cache tuning (#17470) (thanks @rrenamed) 2026-02-23 18:46:40 +00:00
Peter Steinberger be6f0b8c84 fix(providers): support Bedrock Anthropic cacheRetention defaults/pass-through (#22303) (thanks @snese) 2026-02-23 18:46:40 +00:00
Peter Steinberger ca5c0bc02b fix(providers): disable Bedrock prompt caching for non-Anthropic models (#20866) (thanks @pierreeurope) 2026-02-23 18:46:40 +00:00
Peter Steinberger ff0c40d367 test(tools): fix kimi web_search mock typing 2026-02-23 18:27:37 +00:00
Peter Steinberger e02c470d5e feat(tools): add kimi web_search provider
Co-authored-by: adshine <adshine@users.noreply.github.com>
2026-02-23 18:27:37 +00:00
Peter Steinberger f93ca93498 fix(agents): extend cache-ttl eligibility for moonshot and zai
Co-authored-by: lailoo <lailoo@users.noreply.github.com>
2026-02-23 18:27:36 +00:00
Peter Steinberger 2fa6aa6ea6 test(agents): add comprehensive kimi regressions 2026-02-23 18:27:36 +00:00
Peter Steinberger cc7a498ace refactor(tests): deduplicate repeated fixtures in msteams and bash tests 2026-02-23 17:59:56 +00:00
Vincent Koc f03ff39754
Providers: skip context1m beta for Anthropic OAuth tokens (#24620)
* Providers: skip context1m beta for Anthropic OAuth tokens

* Tests: cover OAuth context1m beta skip behavior

* Docs: note context1m OAuth incompatibility

* Agents: add context1m-aware context token resolver

* Agents: cover context1m context-token resolver

* Commands: apply context1m-aware context tokens in session store

* Commands: apply context1m-aware context tokens in status summary

* Status: resolve context tokens with context1m model params

* Status: test context1m status context display
2026-02-23 12:29:09 -05:00
Peter Steinberger a8a4fa5b88 test: de-duplicate attachment and bash tool tests 2026-02-23 17:19:34 +00:00
Sally O'Malley eb4ff6df81
Allow Claude model requests to route through Google Vertex AI (#23985)
* feat: add anthropic-vertex provider for Claude via GCP Vertex AI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* docs: add anthropic-vertex provider guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* Agents: validate Anthropic Vertex project env

* Changelog: format update for Vertex entry

* Providers: rename Anthropic Vertex to Google Vertex Claude

* Providers: remove Vertex Claude provider path

* Models: normalize Vercel Claude shorthand refs

* Onboarding: default Vercel model to Claude shorthand

* Changelog: add @vincentkoc credit for #23985

* Onboarding: keep canonical Vercel default model ref

* Tests: expand Vercel model normalization coverage

---------

Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 11:04:31 -05:00
Clawborn 544809b6f6
Add Chinese context overflow patterns to isContextOverflowError (#22855)
Proxy providers returning Chinese error messages (e.g. Chinese LLM
gateways) use patterns like '上下文过长' or '上下文超出' that are not
matched by the existing English-only patterns in isContextOverflowError.
This prevents auto-compaction from triggering, leaving the session stuck.

Add the most common Chinese proxy patterns:
- 上下文过长 (context too long)
- 上下文超出 (context exceeded)
- 上下文长度超 (context length exceeds)
- 超出最大上下文 (exceeds maximum context)
- 请压缩上下文 (please compress context)

Chinese characters are unaffected by toLowerCase() so check the
original message directly.

Closes #22849
2026-02-23 10:54:24 -05:00
Vincent Koc 4f340b8812
fix(agents): avoid classifying reasoning-required errors as context overflow (#24593)
* Agents: exclude reasoning-required errors from overflow detection

* Tests: cover reasoning-required overflow classification guard

* Tests: format reasoning-required endpoint errors
2026-02-23 10:38:49 -05:00
Alice Losasso 652099cd5c
fix: correctly identify Groq TPM limits as rate limits instead of context overflow (#16176)
Co-authored-by: Howard <dddabtc@users.noreply.github.com>
2026-02-23 10:32:53 -05:00
LI SHANXIN c1b75ab8e2
fix(telegram): make reaction handling soft-fail and message-id resilient (#20236)
* Telegram: soft-fail reactions and fallback to inbound message id

* Telegram: soft-fail missing reaction message id

* Update CHANGELOG.md

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:25:14 -05:00
DukeDeSouth ea47ab29bd
fix: cancel compaction instead of truncating history when summarization fails (#10711)
* fix: cancel compaction instead of truncating history when summarization fails

When the compaction safeguard cannot generate a summary (no model, no API
key, or LLM error), it previously returned a "Summary unavailable" fallback
string and still truncated history. This caused irreversible data loss -
older messages were discarded even though no meaningful summary was produced.

Now returns `{ cancel: true }` in all three failure paths so the framework
aborts compaction entirely and preserves the full conversation history.

Fixes #10332

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: use deterministic timestamps in compaction safeguard tests

Replace Date.now() with fixed timestamp (0) in test data to prevent
nondeterministic behavior in snapshot-based or order-dependent tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Changelog: note compaction cancellation safeguard fix

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:23:13 -05:00
Owen 01380f49f5
fix(compaction): pass model through runtime for safeguard summaries (#17864)
* fix(compaction): pass model through runtime to fix ctx.model undefined

Fixes #3479

Root cause: extensionRunner.initialize() is never called in compact.ts workflow,
leaving ctx.model undefined. Compaction safeguard checks ctx.model and returns
fallback summary immediately without attempting LLM summarization.

Changes:
1. Pass model through compaction safeguard runtime registry (same pattern as maxHistoryShare)
2. Fall back to runtime.model when ctx.model is undefined
3. Add once-per-session warning when both models are missing (prevents log spam)
4. Add regression test for runtime.model fallback

This follows the established runtime registry pattern rather than attempting to call
extensionRunner.initialize() (which is SDK-internal and not meant for direct access).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: add comprehensive tests for compaction-safeguard model fallback

Add integration tests to verify the model fallback behavior:
- Test runtime.model fallback when ctx.model is undefined (compact.ts workflow)
- Test fallback summary when both ctx.model and runtime.model are undefined
- Test contextWindowTokens runtime storage/retrieval
- Test combined runtime values (maxHistoryShare + contextWindowTokens + model)

These tests verify the fix for issue #3479 where compaction fails due to
ctx.model being undefined in the compact.ts workflow. The runtime registry
pattern allows model to be passed when extensionRunner.initialize() is not
called, ensuring summarization works in all code paths.

Related: PR #17864

* fix(test): adapt compaction-safeguard tests to upstream type changes

- Add baseUrl to Model mock objects (now required by Model<Api>)
- Add explicit Model<Api> annotation to prevent provider string widening
- Cast modelRegistry mock through unknown (ModelRegistry expanded)
- Use non-null assertion for compactionHandler (TypeScript strict)
- Type compaction result explicitly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Compaction: add changelog credit for model fallback fix

* Update CHANGELOG.md

* Update CHANGELOG.md

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:14:21 -05:00
青雲 69692d0d3a
fix: detect additional context overflow error patterns to prevent leak to user (#20539)
* fix: detect additional context overflow error patterns to prevent leak to user

Fixes #9951

The error 'input length and max_tokens exceed context limit: 170636 +
34048 > 200000' was not caught by isContextOverflowError() and leaked
to users via formatAssistantErrorText()'s invalidRequest fallback.

Add three new patterns to isContextOverflowError():
- 'exceed context limit' (direct match)
- 'exceeds the model\'s maximum context'
- max_tokens/input length + exceed + context (compound match)

These are now rewritten to the friendly context overflow message.

* Overflow: add regression tests and changelog credits

* Update CHANGELOG.md

* Update pi-embedded-helpers.isbillingerrormessage.test.ts

---------

Co-authored-by: echoVic <AkiraVic@outlook.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:03:56 -05:00
AkosCz 3a3c2da916
[Feature]: Add Gemini (Google Search grounding) as web_search provider (#13075)
* feat: add Gemini (Google Search grounding) as web_search provider

Add Gemini as a fourth web search provider alongside Brave, Perplexity,
and Grok. Uses Gemini's built-in Google Search grounding tool to return
search results with citations.

- Add runGeminiSearch() with Google Search grounding via tools API
- Resolve Gemini's grounding redirect URLs to direct URLs via parallel
  HEAD requests (5s timeout, graceful fallback)
- Add Gemini config block (apiKey, model) with env var fallback
- Default model: gemini-2.5-flash (fast, cheap, grounding-capable)
- Strip API key from error messages for security
- Add config validation tests for Gemini provider
- Update docs/tools/web.md with Gemini provider documentation

Closes #13074

* feat: auto-detect search provider from available API keys

When no explicit provider is configured, resolveSearchProvider now
checks for available API keys in priority order (Brave → Gemini →
Perplexity → Grok) and selects the first provider with a valid key.

- Add auto-detection logic using existing resolve*ApiKey functions
- Export resolveSearchProvider via __testing_provider for tests
- Add 8 tests covering auto-detection, priority order, and explicit override
- Update docs/tools/web.md with auto-detection documentation

* fix: merge __testing exports, downgrade auto-detect log to debug

* fix: use defaultRuntime.log instead of .debug (not in RuntimeEnv type)

* fix: mark gemini apiKey as sensitive in zod schema

* fix: address Greptile review — add externalContent to Gemini payload, add Gemini/Grok entries to schema labels/help, remove dead schema-fields.ts

* fix(web-search): add JSON parse guard for Gemini API responses

Addresses Greptile review comment: add try/catch to handle non-JSON
responses from Gemini API gracefully, preventing runtime errors on
malformed responses.

Note: FIELD_HELP entries for gemini.apiKey and gemini.model were
already present in schema.help.ts, and gemini.apiKey was already
marked as sensitive in zod-schema.agent-runtime.ts (both fixed in
earlier commits).

* fix: use structured readResponseText result in Gemini error path

readResponseText returns { text, truncated, bytesRead }, not a string.
The Gemini error handler was using the result object directly, which
would always be truthy and never fall through to res.statusText.
Align with Perplexity/xAI/Brave error patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix import order and formatting after rebase onto main

* Web search: send Gemini API key via header

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 09:30:51 -05:00