
26 Commits

Author SHA1 Message Date
Peter Steinberger a48a3dbdda refactor(tests): dedupe tool, projector, and delivery fixtures 2026-03-03 01:06:00 +00:00
Peter Steinberger a6a742f3d0 fix(auto-reply): land #31080 from @scoootscooob
Co-authored-by: scoootscooob <zhentongfan@gmail.com>
2026-03-02 01:17:42 +00:00
Peter Steinberger 37a138c554 fix: harden typing lifecycle and cross-channel suppression 2026-02-26 17:01:09 +01:00
Peter Steinberger 8bf1c9a23a fix(typing): stop keepalive restarts after run completion (land #27413, thanks @widingmarcus-cyber)
Co-authored-by: Marcus Widing <widing.marcus@gmail.com>
2026-02-26 11:42:38 +00:00
Peter Steinberger 45b5c35b21 test: fix CI failures in heartbeat and typing tests 2026-02-25 02:28:42 +00:00
Peter Steinberger 52ddb6ae18 test: streamline auto-reply and tts suites 2026-02-21 21:44:01 +00:00
Tyler Yust 087dca8fa9 fix(subagent): harden read-tool overflow guards and sticky reply threading (#19508)
* fix(gateway): avoid premature agent.wait completion on transient errors

* fix(agent): preemptively guard tool results against context overflow

* fix: harden tool-result context guard and add message_id metadata

* fix: use importOriginal in session-key mock to include DEFAULT_ACCOUNT_ID

The run.skill-filter test was mocking ../../routing/session-key.js with only
buildAgentMainSessionKey and normalizeAgentId, but the module also exports
DEFAULT_ACCOUNT_ID which is required transitively by src/web/auth-store.ts.

Switch to importOriginal pattern so all real exports are preserved alongside
the mocked functions.

* pi-runner: guard accumulated tool-result overflow in transformContext

* PI runner: compact overflowing tool-result context

* Subagent: harden tool-result context recovery

* Enhance tool-result context handling by adding support for legacy tool outputs and improving character estimation for message truncation. This includes a new function to create legacy tool results and updates to existing functions to better manage context overflow scenarios.

* Enhance iMessage handling by adding reply tag support in send functions and tests. This includes modifications to prepend or rewrite reply tags based on provided replyToId, ensuring proper message formatting for replies.

* Enhance message delivery across multiple channels by implementing sticky reply context for chunked messages. This includes preserving reply references in Discord, Telegram, and iMessage, ensuring that follow-up messages maintain their intended reply targets. Additionally, improve handling of reply tags in system prompts and tests to support consistent reply behavior.

* Enhance read tool functionality by implementing auto-paging across chunks when no explicit limit is provided, scaling output budget based on model context window. Additionally, add tests for adaptive reading behavior and capped continuation guidance for large outputs. Update related functions to support these features.

* Refine tool-result context management by stripping oversized read-tool details payloads during compaction, ensuring repeated read calls do not bypass context limits. Introduce new utility functions for handling truncation content and enhance character estimation for tool results. Add tests to validate the removal of excessive details in context overflow scenarios.

* Refine message delivery logic in Matrix and Telegram by introducing a flag to track if a text chunk was sent. This ensures that replies are only marked as delivered when a text chunk has been successfully sent, improving the accuracy of reply handling in both channels.

* fix: tighten reply threading coverage and prep fixes (#19508) (thanks @tyler6204)
2026-02-17 15:32:52 -08:00
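
Note on the `importOriginal` item in commit 087dca8fa9 above: the message says the mock listed only two functions, so `DEFAULT_ACCOUNT_ID` went missing for code that imports it transitively. A minimal, framework-free sketch of the partial-mock idea follows. The object `realSessionKey`, its values, and the helper `partialMock` are hypothetical stand-ins for illustration, not code from the repository; in a real suite the test framework's own `importOriginal` hook would supply the original exports.

```typescript
// Hypothetical stand-in for the real module's exports; values are illustrative only.
const realSessionKey = {
  DEFAULT_ACCOUNT_ID: "default",
  buildAgentMainSessionKey: (agentId: string): string => `agent:${agentId}:main`,
  normalizeAgentId: (agentId: string): string => agentId.trim().toLowerCase(),
};

type SessionKeyModule = typeof realSessionKey;

// Partial mock: start from every real export, then override only the named ones.
// Exports that are not overridden (such as DEFAULT_ACCOUNT_ID) stay available.
function partialMock(
  original: SessionKeyModule,
  overrides: Partial<SessionKeyModule>,
): SessionKeyModule {
  return { ...original, ...overrides };
}

// Override one function and leave the rest of the module untouched.
const mocked = partialMock(realSessionKey, {
  buildAgentMainSessionKey: () => "agent:test:main",
});
```

A mock that returned only the overrides would drop `DEFAULT_ACCOUNT_ID`, which is the failure the commit message describes; spreading the original first avoids that.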
cpojer 7d2ef131c1 chore: Fix types in tests 42/N. 2026-02-17 15:50:07 +09:00
Peter Steinberger d841c9b26b test: remove duplicate replyToTag assertion in split-tag case 2026-02-16 10:02:59 +00:00
Peter Steinberger 597f956a4f test: remove duplicate existing-id all-mode planner case 2026-02-16 10:01:58 +00:00
Peter Steinberger f043f2d8c9 test: trim duplicate first-mode hasReplied assertion variant 2026-02-16 10:00:57 +00:00
Peter Steinberger a4e7f256db test: drop redundant off-mode hasReplied assertion 2026-02-16 09:59:59 +00:00
Peter Steinberger 893f56b87d test: remove redundant multi-variable template resolution case 2026-02-16 09:59:09 +00:00
Peter Steinberger 4da68afc73 test: remove duplicate off-mode existing-id planner case 2026-02-16 09:58:05 +00:00
Peter Steinberger 7cfd0aed5f test: remove duplicate non-date negative-case assertion 2026-02-16 09:56:46 +00:00
Peter Steinberger d611db8049 test: remove duplicate provider-prefix assertion variant 2026-02-16 09:55:44 +00:00
Peter Steinberger 3eb9c2105c test: remove duplicate date-suffix assertion variant 2026-02-16 09:54:56 +00:00
Peter Steinberger 9f6462bd56 test: trim duplicate latest-suffix assertion variant 2026-02-16 09:54:05 +00:00
Peter Steinberger 2d03473072 test: trim duplicate provider-prefix assertion in short-model tests 2026-02-16 09:52:16 +00:00
Peter Steinberger dbcdcc5d19 test: remove duplicate positive template-variable detection case 2026-02-16 09:51:09 +00:00
Peter Steinberger c4297a8d60 test: remove redundant no-provider short-model case 2026-02-16 09:49:58 +00:00
Peter Steinberger deef9f91bf test: remove duplicate multi-variable template check case 2026-02-16 09:48:51 +00:00
Peter Steinberger 523193a91f test: remove duplicate static template-variable false case 2026-02-16 09:47:45 +00:00
Peter Steinberger cd04385f9f test: remove redundant provider-plus-date model-name case 2026-02-16 09:46:44 +00:00
Peter Steinberger 82fa526bb0 test: remove duplicate undefined template-variable guard case 2026-02-16 09:45:51 +00:00
Peter Steinberger d75cd40787 perf(test): consolidate reply utility suites 2026-02-15 23:14:42 +00:00