openclaw/src/telegram
Divanoli Mydeen Pitchai 1055e71c4b
fix(telegram): auto-wrap .md file references in backticks to prevent URL previews (#8649)
* fix(telegram): auto-wrap file references with TLD extensions to prevent URL previews

Telegram's auto-linker aggressively treats filenames like HEARTBEAT.md,
README.md, main.go, script.py as URLs and generates domain registrar previews.

This fix adds comprehensive protection for file extensions that share TLDs:
- High priority: .md, .go, .py, .pl, .ai, .sh
- Medium priority: .io, .tv, .fm, .am, .at, .be, .cc, .co

Implementation:
- Added wrapFileReferencesInHtml() in format.ts
- Runs AFTER markdown→HTML conversion
- Tokenizes HTML to respect tag boundaries
- Skips content inside <code>, <pre>, <a> tags (no nesting issues)
- Applied to all rendering paths: renderTelegramHtmlText, markdownToTelegramHtml,
  markdownToTelegramChunks, and delivery.ts fallback

Addresses review comments:
- P1: Now handles chunked rendering paths correctly
- P2: No longer wraps inside existing code blocks (token-based parsing)
- No lookbehinds used (broad Node compatibility)

Includes comprehensive test suite in format.wrap-md.test.ts

AI-assisted: true

* fix(telegram): prevent URL previews for file refs with TLD extensions

Two layers were causing spurious link previews for file references like
`README.md`, `backup.sh`, `main.go`:

1. **markdown-it linkify** converts `README.md` to
   `<a href="http://README.md">README.md</a>` (.md = Moldova TLD)
2. **Telegram auto-linker** treats remaining bare text as URLs

## Changes

### Primary fix: suppress auto-linkified file refs in buildTelegramLink
- Added `isAutoLinkedFileRef()` helper that detects when linkify auto-
  generated a link from a bare filename (href = "http://" + label)
- Rejects paths with domain-like segments (dots in non-final path parts)
- Modified `buildTelegramLink()` to return null for these, so file refs
  stay as plain text and get wrapped in `<code>` by the wrapper
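
A minimal sketch of the detection rule described above, for illustration only. The signature, the `FILE_EXTENSIONS` list, and the exact comparison are assumptions, not the code in format.ts; the version-like path handling added later (`v1.0/README.md`) is not modeled here.

```typescript
// Hypothetical extension list; the real one lives in format.ts and may differ.
const FILE_EXTENSIONS: readonly string[] = ["md", "go", "py", "pl", "sh"];

// Assumed signature: href and label as linkify would hand them to the link builder.
function isAutoLinkedFileRef(href: string, label: string): boolean {
  // linkify builds href = "http://" + label when it mistakes bare text for a domain.
  if (href !== `http://${label}`) return false;

  const segments = label.split("/");
  const fileName = segments[segments.length - 1] ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot < 1) return false;

  const ext = fileName.slice(dot + 1).toLowerCase();
  if (!FILE_EXTENSIONS.includes(ext)) return false;

  // A dot in any non-final path part looks like a domain (example.com/README.md).
  return segments.slice(0, -1).every((segment) => !segment.includes("."));
}
```

Under these assumptions, `README.md` and `docs/README.md` count as file refs, while `example.com/README.md` stays a link.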

### Safety-net: de-linkify in wrapFileReferencesInHtml
- Added pre-pass that catches auto-linkified anchors in pre-rendered HTML
- Handles edge cases where HTML is passed directly (textMode: "html")
- Reuses `isAutoLinkedFileRef()` logic — no duplication

### Bug fixes discovered during review
- **Fixed `isClosing` bug (line 169)**: the check `match[1] === "/"`
  was wrong: the regex `(<\/?)` captures `<` or `</`, so closing
  tags were never detected. Changed to `match[1] === "</"`. This was
  causing `inCode/inPre/inAnchor` to stay stuck at true after any
  opening tag, breaking file ref wrapping after closing tags.
- **Removed double `wrapFileReferencesInHtml` call**: `renderTelegramHtmlText`
  was calling `markdownToTelegramHtml` (which wraps) then wrapping again.
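
The `isClosing` fix can be reproduced in isolation. This is a sketch: `TAG_START` and both helper names are hypothetical, and only the capture-group behavior is taken from the description above.

```typescript
// Hypothetical tag-start pattern; group 1 is "<" or "</", never "/" alone.
const TAG_START = /^(<\/?)([a-zA-Z][a-zA-Z0-9-]*)/;

// The old comparison: can never be true, so closing tags go undetected.
function isClosingTagBuggy(tag: string): boolean {
  const match = TAG_START.exec(tag);
  return match !== null && match[1] === "/";
}

// The corrected comparison against the full captured prefix.
function isClosingTag(tag: string): boolean {
  const match = TAG_START.exec(tag);
  return match !== null && match[1] === "</";
}
```

With the buggy check, `</code>` is treated like an opening tag, which is why the in-tag state never reset.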

### Test coverage (+12 tests, 26 total)
- `.sh` filenames (original issue #6932 mentioned backup.sh)
- Auto-linkified anchor replacement
- Auto-linkified path anchor replacement
- Explicit link preservation (different label)
- File ref after closing anchor tag (exercises isClosing fix)
- Multiple file types in single message
- Real URL preservation
- Explicit markdown link preservation
- File ref after real URL in same message
- Chunked output file ref wrapping

Closes #6932

* test(telegram): add comprehensive edge case coverage for file ref wrapping

Add 16 edge case tests covering:
- File refs inside bold/italic tags
- Fenced code blocks (no double-wrap)
- Domain-like paths preserved as links (example.com/README.md)
- GitHub URLs with file paths
- wrapFileRefs: false behavior
- All TLD extensions (.ai, .io, .tv, .fm)
- Non-TLD extensions not wrapped (.png, .css, .js)
- File ref position (start, end, multiple in sequence)
- Nested paths without domain segments
- Version-like paths (v1.0/README.md wraps, example.com/v1.0/README.md links)
- Hyphens and underscores in filenames
- Uppercase extensions

* fix(telegram): use regex literal and depth counters for tag tracking

Code review fixes:
1. Replace RegExp constructor with regex literal for autoLinkedAnchor
   - Avoids double-escaping issues with \s
   - Uses backreference \1 to match href=label pattern directly

2. Replace boolean toggles with depth counters for tag nesting
   - codeDepth, preDepth, anchorDepth track nesting levels
   - Correctly handles nested tags like <pre><code>...</code></pre>
   - Prevents wrapping inside any level of protected tags
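
A rough sketch of depth-counter tracking, assuming a simplified tag regex and a caller-supplied `wrapText` callback (both hypothetical; the real tokenizer in format.ts may differ). Text is only passed to the wrapper when all three depths are zero.

```typescript
function wrapOutsideProtectedTags(
  html: string,
  wrapText: (text: string) => string,
): string {
  let codeDepth = 0;
  let preDepth = 0;
  let anchorDepth = 0;
  // Simplified tag matcher; does not handle ">" inside attribute values.
  const tagPattern = /(<\/?)([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>/g;
  const isProtected = () => codeDepth > 0 || preDepth > 0 || anchorDepth > 0;

  let result = "";
  let lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    // Emit the text before this tag, wrapped only when outside protected tags.
    const text = html.slice(lastIndex, match.index);
    result += isProtected() ? text : wrapText(text);

    const delta = match[1] === "</" ? -1 : 1;
    const name = (match[2] ?? "").toLowerCase();
    if (name === "code") codeDepth += delta;
    else if (name === "pre") preDepth += delta;
    else if (name === "a") anchorDepth += delta;

    result += match[0];
    lastIndex = tagPattern.lastIndex;
  }
  const tail = html.slice(lastIndex);
  return result + (isProtected() ? tail : wrapText(tail));
}
```

Counters rather than booleans mean the text after `</code>` in `<pre><code>...</code> here</pre>` is still treated as protected, because `preDepth` is still 1.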

Add 4 tests for edge cases:
- Nested code tags (depth tracking)
- Multiple anchor tags in sequence
- Auto-linked anchor with backreference match
- Anchor with different href/label (no match)

* fix(telegram): add escapeHtml and escapeRegex for defense in depth

Code review fixes:
1. Escape filename with escapeHtml() before inserting into <code> tags
   - Prevents HTML injection if regex ever matches unsafe chars
   - Defense in depth (current regex already limits to safe chars)

2. Escape extensions with escapeRegex() before joining into pattern
   - Prevents regex breakage if extensions contain metacharacters
   - Future-proofs against extensions like 'c++' or 'd.ts'
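
For reference, the two helpers could look roughly like this. The bodies are common idioms, not necessarily what format.ts uses, and the `extensions` list and `extensionPattern` are illustrative only.

```typescript
// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escape regex metacharacters so a string matches literally inside a pattern.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Illustrative list containing metacharacters ("+" and ".").
const extensions = ["md", "c++", "d.ts"];
const extensionPattern = new RegExp(
  `\\.(?:${extensions.map(escapeRegex).join("|")})$`,
  "i",
);
```

Without `escapeRegex`, `c++` would make the `RegExp` constructor throw, and the dot in `d.ts` would match any character.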

Add tests documenting regex safety boundaries:
- Filenames with special chars (&, <, >) don't match
- Only [a-zA-Z0-9_.\-./] chars are captured

* fix(telegram): catch orphaned single-letter TLD patterns

When text like 'R&D.md' doesn't match the main file pattern (because &
breaks the character class), the 'D.md' part can still be auto-linked
by Telegram as a domain (https://d.md/).

Add second pass to catch orphaned TLD patterns like 'D.md', 'R.io', 'X.ai'
that follow non-alphanumeric characters and wrap them in <code> tags.

Pattern: ([^a-zA-Z0-9]|^)([A-Za-z]\.(?:extensions))(?=[^a-zA-Z0-9/]|$)
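
A small sketch of that pattern applied on its own. The extension list and the `wrapOrphanedTlds` name are assumptions for illustration; tag-awareness is deliberately left out.

```typescript
// Hypothetical subset of the extension list.
const orphanExtensions = ["md", "go", "py", "pl", "sh"];

// Same shape as the pattern above: a non-alphanumeric prefix (or start of
// string), one letter, a dot, an extension, then a non-alphanumeric,
// non-slash boundary (or end of string).
const orphanedTld = new RegExp(
  `([^a-zA-Z0-9]|^)([A-Za-z]\\.(?:${orphanExtensions.join("|")}))(?=[^a-zA-Z0-9/]|$)`,
  "g",
);

function wrapOrphanedTlds(text: string): string {
  return text.replace(
    orphanedTld,
    (_match: string, prefix: string, ref: string) => `${prefix}<code>${ref}</code>`,
  );
}
```

Multi-letter names such as `README.md` are not touched by this pass, since the character before `E.md` is alphanumeric; they are left to the main file pattern.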

Tests added:
- 'wraps orphaned TLD pattern after special character' (R&D.md → R&<code>D.md</code>)
- 'wraps orphaned single-letter TLD patterns' (X.ai, R.io)

* refactor(telegram): remove popular domain TLDs from file extension list

Remove .ai, .io, .tv, .fm from FILE_EXTENSIONS_WITH_TLD because:
- These are commonly used as real domains (x.ai, vercel.io, github.io)
- Rarely used as actual file extensions
- Users are more likely referring to websites than files

Keep: md, sh, py, go, pl (common file extensions, rarely intentional domains)
Keep: am, at, be, cc, co (less common as intentional domain references)

Update tests to reflect the change:
- Add test for supported extensions (.am, .at, .be, .cc, .co)
- Add test verifying popular TLDs stay as links

* fix(telegram): prevent orphaned TLD wrapping inside HTML tags

Code review fixes:

1. Orphaned TLD pass now checks if match is inside HTML tag
   - Uses lastIndexOf('<') vs lastIndexOf('>') to detect tag context
   - Skips wrapping when between < and > (inside attributes)
   - Prevents invalid HTML like <a href="...&<code>D.md</code>">

2. textMode: 'html' now trusts caller markup
   - Returns text unchanged instead of wrapping
   - Caller owns HTML structure in this mode
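
The tag-context check from item 1 can be sketched as follows. `isInsideHtmlTag` is a hypothetical name, and the sketch assumes attribute values contain no literal `<` or `>`.

```typescript
// True when the offset falls between a "<" and its matching ">",
// i.e. inside a tag's name or attributes rather than in text content.
function isInsideHtmlTag(html: string, offset: number): boolean {
  const before = html.slice(0, offset);
  return before.lastIndexOf("<") > before.lastIndexOf(">");
}

const sample = '<a href="https://example.com/?q=R&D.md">link</a> R&D.md';
const inAttribute = isInsideHtmlTag(sample, sample.indexOf("D.md"));
const inText = isInsideHtmlTag(sample, sample.lastIndexOf("D.md"));
```

Here `inAttribute` is true (the match sits inside `href`) and `inText` is false, so only the trailing occurrence would be a wrapping candidate.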

Tests added:
- 'does not wrap orphaned TLD inside href attributes'
- 'does not wrap orphaned TLD inside any HTML attribute'
- 'does not wrap in HTML mode (trusts caller markup)'

* refactor(telegram): use snapshot for orphaned TLD offset clarity

Use explicit snapshot variable when checking tag positions in orphaned
TLD pass. While JavaScript's replace() doesn't mutate during iteration,
this makes intent explicit and adds test coverage for multi-TLD HTML.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(telegram): prevent orphaned TLD wrapping inside code/pre tags

- Add depth tracking for code/pre tags in orphaned TLD pass
- Fix test to expect valid HTML output
- 55 tests now covering nested tag scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(telegram): clamp depth counters and add anchor tracking to orphaned pass

- Clamp depth counters at 0 for malformed HTML with stray closing tags
- Add anchor depth tracking to orphaned TLD pass to prevent wrapping
  inside link text (e.g., <a href="...">R&D.md</a>)
- 57 tests covering all edge cases
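
Clamping can be illustrated with a one-line helper (the `nextDepth` name is hypothetical). Without the clamp, a stray `</code>` would push the counter to -1, and the next real `<code>` would only bring it back to 0, leaving its contents unprotected.

```typescript
// Returns the new nesting depth after seeing an opening or closing tag,
// never dropping below zero for unmatched closing tags.
function nextDepth(depth: number, isClosing: boolean): number {
  return isClosing ? Math.max(0, depth - 1) : depth + 1;
}

// Stray closing tag followed by a real opening tag: depth ends at 1, not 0.
const afterStrayClose = nextDepth(0, true);
const afterRealOpen = nextDepth(afterStrayClose, false);
```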

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(telegram): keep .co domains linked and wrap punctuated file refs

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-14 00:51:47 +01:00
bot fix(telegram): auto-wrap .md file references in backticks to prevent URL previews (#8649) 2026-02-14 00:51:47 +01:00
accounts.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
accounts.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
allowed-updates.ts fix: refine telegram reactions (#964) (thanks @bohdanpodvirnyi) 2026-01-15 17:20:17 +00:00
api-logging.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
audit.test.ts chore: migrate to oxlint and oxfmt 2026-01-14 15:02:19 +00:00
audit.ts refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
bot-access.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
bot-handlers.ts perf: speed up telegram media e2e flush timing 2026-02-13 19:52:45 +00:00
bot-message-context.dm-threads.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
bot-message-context.dm-topic-threadid.test.ts test(telegram): add DM topic threadId deliveryContext test for #8891 2026-02-05 15:33:30 +05:30
bot-message-context.sender-prefix.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
bot-message-context.ts fix: fix: transcribe audio before mention check in groups with requireMention (openclaw#9973) thanks @mcinteerj 2026-02-12 09:58:01 -06:00
bot-message-dispatch.test.ts feat: per-channel responsePrefix override (#9001) 2026-02-04 16:16:34 -05:00
bot-message-dispatch.ts Telegram: remove @ts-nocheck from bot.ts, fix duplicate error handler, harden sticker caching (#9077) 2026-02-04 22:35:51 +00:00
bot-message.test.ts fix: emit diagnostics across channels 2026-01-21 00:30:34 +00:00
bot-message.ts Telegram: remove @ts-nocheck from bot-message.ts (#9180) 2026-02-05 00:20:44 +00:00
bot-native-commands.plugin-auth.test.ts refactor: use shared pairing store for telegram 2026-02-01 15:22:37 +05:30
bot-native-commands.test.ts fix: add telegram command-cap regression test (#12356) (thanks @arosstale) 2026-02-09 22:27:03 +05:30
bot-native-commands.ts fix(auto-reply): prevent sender spoofing in group prompts 2026-02-10 00:44:38 -06:00
bot-updates.ts Telegram: use Grammy types directly, add typed Probe/Audit to plugin interface (#8403) 2026-02-04 10:09:28 +00:00
bot.create-telegram-bot.accepts-group-messages-mentionpatterns-match-without-botusername.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.create-telegram-bot.applies-topic-skill-filters-system-prompts.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.create-telegram-bot.blocks-all-group-messages-grouppolicy-is.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.create-telegram-bot.dedupes-duplicate-callback-query-updates-by-update.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.create-telegram-bot.installs-grammy-throttler.test.ts fix(pairing): use actual code in pairing approval text 2026-02-10 19:48:02 -05:00
bot.create-telegram-bot.matches-tg-prefixed-allowfrom-entries-case-insensitively.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.create-telegram-bot.matches-usernames-case-insensitively-grouppolicy-is.test.ts fix(telegram): add DM allowFrom regression tests 2026-02-09 22:59:47 +05:30
bot.create-telegram-bot.routes-dms-by-telegram-accountid-binding.test.ts fix(telegram): pass parentPeer for forum topic binding inheritance (#9789) 2026-02-05 18:25:03 +00:00
bot.create-telegram-bot.sends-replies-without-native-reply-threading.test.ts test: speed up telegram suites 2026-02-01 22:23:16 +00:00
bot.media.downloads-media-file-path-no-file-download.e2e.test.ts perf: speed up telegram media e2e flush timing 2026-02-13 19:52:45 +00:00
bot.media.includes-location-text-ctx-fields-pins.e2e.test.ts test: migrate suites to e2e coverage layout 2026-02-13 14:28:22 +00:00
bot.test.ts perf(test): trim fixture and import overhead in hot suites 2026-02-13 23:16:41 +00:00
bot.ts perf: speed up telegram media e2e flush timing 2026-02-13 19:52:45 +00:00
caption.ts refactor: share telegram caption splitting 2026-01-17 03:50:09 +00:00
download.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
download.ts fix(telegram): add timeout to file download to prevent DoS (CWE-400) 2026-02-02 13:39:39 +05:30
draft-chunking.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
draft-chunking.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
draft-stream.test.ts chore: Enable `typescript/no-explicit-any` rule. 2026-02-02 16:18:09 +09:00
draft-stream.ts fix: require thread specs for telegram sends 2026-02-02 09:26:59 +05:30
fetch.test.ts perf(test): reduce module reload overhead in key suites 2026-02-13 15:45:19 +00:00
fetch.ts perf(test): reduce module reload overhead in key suites 2026-02-13 15:45:19 +00:00
format.test.ts feat(telegram): render blockquotes as native <blockquote> tags (#14608) (#14626) 2026-02-12 08:11:57 -05:00
format.ts fix(telegram): auto-wrap .md file references in backticks to prevent URL previews (#8649) 2026-02-14 00:51:47 +01:00
format.wrap-md.test.ts fix(telegram): auto-wrap .md file references in backticks to prevent URL previews (#8649) 2026-02-14 00:51:47 +01:00
group-migration.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
group-migration.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
index.ts feat: unify provider reaction tools 2026-01-07 04:16:39 +01:00
inline-buttons.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
inline-buttons.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
model-buttons.test.ts Telegram: fix model button review issues 2026-02-04 09:23:17 +05:30
model-buttons.ts Telegram: fix model button review issues 2026-02-04 09:23:17 +05:30
monitor.test.ts fix(security): default standalone servers to loopback bind (#13184) 2026-02-13 16:39:56 +01:00
monitor.ts fix(security): default standalone servers to loopback bind (#13184) 2026-02-13 16:39:56 +01:00
network-config.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
network-config.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
network-errors.test.ts fix(telegram): recover from grammY "timed out" long-poll errors (#7239) 2026-02-02 22:37:22 +00:00
network-errors.ts fix(telegram): recover from grammY "timed out" long-poll errors (#7239) 2026-02-02 22:37:22 +00:00
probe.test.ts fix(telegram): add retry logic to health probe (openclaw#7405) thanks @mcinteerj 2026-02-12 09:11:35 -06:00
probe.ts perf: honor low timeout budgets in health telegram probes 2026-02-13 19:22:25 +00:00
proxy.test.ts fix: honor Telegram proxy dispatcher (#4456) (thanks @spiceoogway) 2026-01-30 14:38:39 +05:30
proxy.ts fix: align proxy fetch typing 2026-02-04 04:09:53 -08:00
reaction-level.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
reaction-level.ts refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
send.caption-split.test.ts Telegram: harden network retries and config 2026-01-26 19:36:43 -05:00
send.edit-message.test.ts feat(telegram): add edit message action (#2394) (thanks @marcelomar21) 2026-01-26 15:34:47 -08:00
send.preserves-thread-params-plain-text-fallback.test.ts Telegram: harden network retries and config 2026-01-26 19:36:43 -05:00
send.proxy.test.ts Telegram: harden network retries and config 2026-01-26 19:36:43 -05:00
send.returns-undefined-empty-input.test.ts fix: tighten thread-clear and telegram retry guards 2026-02-09 08:59:21 +05:30
send.ts fix(telegram): surface REACTION_INVALID as non-fatal warning (#14340) 2026-02-12 00:28:47 -06:00
send.video-note.test.ts feat: Implement Telegram video note support with tests and docs (#12408) 2026-02-09 07:00:57 +00:00
sent-message-cache.test.ts fix: lint errors 2026-01-15 17:07:38 +00:00
sent-message-cache.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
sticker-cache.test.ts refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
sticker-cache.ts feat: add Claude Opus 4.6 to built-in model catalog (#9853) 2026-02-05 12:09:23 -08:00
targets.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
targets.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
token.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
token.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
update-offset-store.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
update-offset-store.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
voice.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
voice.ts chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
webhook-set.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
webhook.test.ts chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
webhook.ts fix(security): enforce bounded webhook body handling 2026-02-13 19:14:54 +01:00