openclaw/packages/memory-host-sdk/src
AaronLuo00 f8547fcae4 fix: guard fine-split against breaking UTF-16 surrogate pairs
When re-splitting CJK-heavy segments at chunking.tokens, check whether the
slice boundary falls on a high surrogate (0xD800–0xDBFF) and, if so, extend
the slice by one code unit to keep the pair intact. This prevents emitting
lone surrogate halves for CJK Extension B+ characters (U+20000+).
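
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: `splitSurrogateSafe`, `isHighSurrogate`, and `hasLoneSurrogate` are hypothetical names, and the budget is assumed to be counted in UTF-16 code units, which may differ from how chunking.tokens is measured in the real implementation.

```typescript
// Hypothetical helper (not from the SDK): true for a UTF-16 high surrogate.
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

// Hypothetical sketch: split `text` into slices of at most `maxUnits` UTF-16
// code units, extending a slice by one unit when it would otherwise end on a
// high surrogate. Assumes `text` is well-formed UTF-16 to begin with.
function splitSurrogateSafe(text: string, maxUnits: number): string[] {
  if (!(maxUnits >= 1) || Math.floor(maxUnits) !== maxUnits) {
    throw new RangeError("maxUnits must be a positive integer");
  }
  const pieces: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxUnits, text.length);
    // The last code unit inside the slice is a high surrogate and its low
    // half lies just past the boundary: take one more unit to keep the pair.
    if (end < text.length && isHighSurrogate(text.charCodeAt(end - 1))) {
      end += 1;
    }
    pieces.push(text.slice(start, end));
    start = end;
  }
  return pieces;
}

// Hypothetical check, similar in spirit to the test the commit mentions:
// reports whether a string contains an unpaired surrogate.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of a valid pair
        continue;
      }
      return true; // high surrogate without a following low surrogate
    }
    if (c >= 0xdc00 && c <= 0xdfff) {
      return true; // low surrogate without a preceding high surrogate
    }
  }
  return false;
}
```

For example, splitting four U+20000 characters (eight code units) with an odd budget of 3 yields two slices of four code units each, since every boundary lands on a high surrogate and is pushed forward by one. The trade-off in this sketch is that a slice may exceed the budget by one code unit.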

Add test verifying no lone surrogates appear when splitting lines of
surrogate-pair characters with an odd token budget.

Addresses third-round Codex P2 review comment.
2026-03-29 10:22:43 +09:00
host fix: guard fine-split against breaking UTF-16 surrogate pairs 2026-03-29 10:22:43 +09:00
engine-embeddings.ts refactor: finish moving provider runtime into extensions 2026-03-27 05:38:58 +00:00
engine-foundation.ts
engine-qmd.ts
engine-storage.ts
engine.ts
multimodal.ts
query.ts
runtime-cli.ts
runtime-core.ts
runtime-files.ts
runtime.ts
secret.ts
status.ts