mirror of https://github.com/openclaw/openclaw.git
When re-splitting CJK-heavy segments at the chunking.tokens limit, check whether the slice boundary falls on a high surrogate (0xD800–0xDBFF) and, if so, extend the slice by one code unit to keep the pair intact. This prevents broken surrogate halves for CJK Extension B+ characters (U+20000 and above). Add a test verifying that no lone surrogates appear when splitting lines of surrogate-pair characters with an odd token budget. Addresses the third-round Codex P2 review comment.
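
The boundary check described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from the repository: the names `isHighSurrogate`, `splitBySurrogateSafeBudget`, and `hasLoneSurrogate` are hypothetical, and the budget is assumed to be counted in UTF-16 code units, which may differ from how chunking.tokens is actually measured.

```typescript
// Hypothetical helper: true if the UTF-16 code unit is a high (leading) surrogate.
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

// Hypothetical splitter: cuts `text` into slices of at most `budget` code units,
// extending a slice by one unit when it would otherwise end on a high surrogate.
function splitBySurrogateSafeBudget(text: string, budget: number): string[] {
  if (!Number.isInteger(budget) || budget < 1) {
    throw new RangeError("budget must be a positive integer");
  }
  const pieces: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + budget, text.length);
    // If the last unit inside the slice is a high surrogate and more text
    // follows, pull the next unit in so the pair stays together.
    if (end < text.length && isHighSurrogate(text.charCodeAt(end - 1))) {
      end += 1;
    }
    pieces.push(text.slice(start, end));
    start = end;
  }
  return pieces;
}

// Hypothetical check of the kind a test could use: true if `text` contains
// a surrogate code unit that is not part of a well-formed pair.
function hasLoneSurrogate(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (isHighSurrogate(unit)) {
      const next = text.charCodeAt(i + 1); // NaN past the end of the string
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}
```

With an odd budget such as 3 and a line made only of U+20000 characters (two code units each), every slice would end mid-pair without the check; with it, each slice grows to 4 code units and holds two whole characters.
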

| Name |
|---|
| host |
| engine-embeddings.ts |
| engine-foundation.ts |
| engine-qmd.ts |
| engine-storage.ts |
| engine.ts |
| multimodal.ts |
| query.ts |
| runtime-cli.ts |
| runtime-core.ts |
| runtime-files.ts |
| runtime.ts |
| secret.ts |
| status.ts |