openclaw/extensions/memory-core/src/memory
buyitsydney 4b69c6d3f1 fix(memory): add CJK/Kana/Hangul support to MMR tokenize() for diversity detection
The tokenize() function matched only [a-z0-9_]+ patterns, so it returned an
empty set for CJK-only text. Jaccard similarity between such texts was then
always 0 (or always 1 when two empty sets were compared), which effectively
disabled MMR diversity detection for CJK content.
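
The failure mode can be reproduced with a small sketch. This is not the code from mmr.ts: tokenizeAsciiOnly and jaccard are hypothetical names, and treating two empty sets as identical (similarity 1) is an assumed convention.

```typescript
// Hypothetical reconstruction of an ASCII-only tokenizer, based solely on
// the [a-z0-9_]+ pattern named in the commit message.
function tokenizeAsciiOnly(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9_]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
// Assumption: two empty sets count as identical (similarity 1).
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const token of a) {
    if (b.has(token)) intersection++;
  }
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Two unrelated CJK-only strings both tokenize to the empty set, so they
// look like perfect duplicates under the assumed convention.
const weather = tokenizeAsciiOnly("今日は晴れです");
const station = tokenizeAsciiOnly("東京駅はどこですか");
console.log(weather.size, station.size, jaccard(weather, station)); // 0 0 1
```

Under the other convention (empty sets give 0), every pair of CJK texts looks completely different instead. Either way the score carries no information, so MMR cannot penalize redundant results.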

Add support for:
- CJK Unified Ideographs (U+4E00–U+9FFF) and Extension A (U+3400–U+4DBF)
- Hiragana (U+3040–U+309F) and Katakana (U+30A0–U+30FF)
- Hangul Syllables (U+AC00–U+D7AF) and Jamo (U+1100–U+11FF)

Characters are extracted as unigrams, and bigrams are generated only
from characters that are adjacent in the original text (no spurious
bigrams across ASCII boundaries).

Fixes #28000
2026-03-28 09:19:52 +05:30
test-helpers
embedding-manager.test-harness.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
embedding.test-mocks.ts
embeddings.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
hybrid.test.ts
hybrid.ts
index.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
index.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager-embedding-ops.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager-runtime.ts
manager-search.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager-sync-ops.ts refactor: move memory host into sdk package 2026-03-27 04:12:04 +00:00
manager.async-search.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.atomic-reindex.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.batch.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.embedding-batches.test.ts test: fix memory-core host type import 2026-03-27 05:38:58 +00:00
manager.get-concurrency.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.mistral-provider.test.ts refactor: remove ollama legacy shims 2026-03-27 06:38:23 +00:00
manager.read-file.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.readonly-recovery.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.sync-errors-do-not-crash.test.ts
manager.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
manager.vector-dedupe.test.ts refactor: move memory host into sdk package 2026-03-27 04:12:04 +00:00
manager.watcher-config.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
mmr.test.ts fix(memory): add CJK/Kana/Hangul support to MMR tokenize() for diversity detection 2026-03-28 09:19:52 +05:30
mmr.ts fix(memory): add CJK/Kana/Hangul support to MMR tokenize() for diversity detection 2026-03-28 09:19:52 +05:30
provider-adapters.ts refactor: move bundled plugin policy into manifests 2026-03-27 16:40:27 +00:00
qmd-manager.test.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
qmd-manager.ts refactor: move memory host into sdk package 2026-03-27 04:12:04 +00:00
search-manager.test.ts Reduce lint suppressions in core tests and runtime 2026-03-27 02:11:26 -05:00
search-manager.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
temporal-decay.test.ts
temporal-decay.ts
test-embeddings-mock.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
test-manager-helpers.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
test-manager.ts refactor: remove memory-core engine barrel 2026-03-27 03:35:00 +00:00
test-runtime-mocks.ts