* fix: stabilize telegram draft stream message boundaries
* fix: suppress NO_REPLY lead-fragment leaks
* fix: keep underscore guard for non-NO_REPLY prefixes
* fix: skip assistant-start rotation only after real lane rotation
* fix: preserve finalized state when pre-rotation does not force
* fix: reset finalized preview state on message-start boundary
* fix: document Telegram draft boundary + NO_REPLY reliability updates (#33169) (thanks @obviyus)
* fix(telegram): prevent duplicate messages in DM draft streaming mode
When using sendMessageDraft for DM streaming (streaming: 'partial'),
the draft bubble auto-converts to the final message. The code was
incorrectly falling through to sendPayload() after the draft was
finalized, causing a duplicate message.
This fix checks if we're in draft preview mode with hasStreamedMessage
and skips the sendPayload call, returning "preview-finalized" directly.
Key changes:
- Use hasStreamedMessage flag instead of previewRevision comparison
- Avoids double stopDraftLane calls by returning early
- Prevents duplicate messages when final text equals last streamed text
Root cause: In lane-delivery.ts, the final message handling logic
did not properly handle the DM draft flow where sendMessageDraft
creates a transient bubble that doesn't need a separate final send.
* fix(telegram): harden DM draft finalization path
* fix(telegram): require emitted draft preview for unchanged finals
* fix(telegram): require final draft text emission before finalize
* fix: update changelog for telegram draft finalization (#32118) (thanks @OpenCils)
---------
Co-authored-by: Ayaan Zaidi <zaidi@uplause.io>
fix: improve compaction summary instructions to preserve active work
Expand staged-summary merge instructions to preserve active task status, batch progress, latest user request, and follow-up commitments so compaction handoffs retain in-flight work context.
Co-authored-by: joetomasone <56984887+joetomasone@users.noreply.github.com>
Co-authored-by: Josh Lehman <josh@martian.engineering>
Complete the stop reason propagation chain so ACP clients can
distinguish end_turn from max_tokens:
- server-chat.ts: emitChatFinal accepts optional stopReason param,
includes it in the final payload, reads it from lifecycle event data
- translator.ts: read stopReason from the final payload instead of
hardcoding end_turn
Chain: LLM API → run.ts (meta.stopReason) → agent.ts (lifecycle event)
→ server-chat.ts (final payload) → ACP translator (PromptResponse)
* fix(gateway): flush throttled delta before emitChatFinal
The 150ms throttle in emitChatDelta can suppress the last text chunk
before emitChatFinal fires, causing streaming clients (e.g. ACP) to
receive truncated responses. The final event carries the complete text,
but clients that build responses incrementally from deltas miss the
tail end.
Flush one last unthrottled delta with the complete buffered text
immediately before sending the final event. This ensures all streaming
consumers have the full response without needing to reconcile deltas
against the final payload.
* fix(gateway): avoid duplicate delta flush when buffer unchanged
Track the text length at the time of the last broadcast. The flush in
emitChatFinal now only sends a delta if the buffer has grown since the
last broadcast, preventing duplicate sends when the final delta passed
the 150ms throttle and was already broadcast.
* fix(gateway): honor heartbeat suppression in final delta flush
* test(gateway): add final delta flush and dedupe coverage
* fix(gateway): skip final flush for silent lead fragments
* docs(changelog): note gateway final-delta flush fix credits
---------
Co-authored-by: Jonathan Taylor <visionik@pobox.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>