diff --git a/docs/concepts/compaction.md b/docs/concepts/compaction.md index 87ef2aaf9b1..2082a18e7ce 100644 --- a/docs/concepts/compaction.md +++ b/docs/concepts/compaction.md @@ -1,98 +1,141 @@ --- -summary: "Context window + compaction: how OpenClaw keeps sessions under model limits" +summary: "How OpenClaw compacts long sessions to stay within model context limits" read_when: - You want to understand auto-compaction and /compact - You are debugging long sessions hitting context limits + - You want to tune compaction behavior or use a custom context engine title: "Compaction" --- -# Context Window & Compaction +# Compaction -Every model has a **context window** (max tokens it can see). Long-running chats accumulate messages and tool results; once the window is tight, OpenClaw **compacts** older history to stay within limits. +Every model has a **context window** -- the maximum number of tokens it can see +at once. As a conversation grows, it eventually approaches that limit. OpenClaw +**compacts** older history into a summary so the session can continue without +losing important context. -## What compaction is +## How compaction works -Compaction **summarizes older conversation** into a compact summary entry and keeps recent messages intact. The summary is stored in the session history, so future requests use: +Compaction is a three-step process: -- The compaction summary -- Recent messages after the compaction point +1. **Summarize** older conversation turns into a compact summary. +2. **Persist** the summary as a `compaction` entry in the session transcript + (JSONL). +3. **Keep** recent messages after the compaction point intact. -Compaction **persists** in the session’s JSONL history. +After compaction, future turns see the summary plus all messages after the +compaction point. The on-disk transcript retains the full history -- compaction +only changes what gets loaded into the model context. 
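The effect of the compaction point can be sketched in a few lines. This is an illustrative model only, assuming a simplified transcript shape -- the field names (`type`, `summary`) are hypothetical, not OpenClaw's actual JSONL schema:

```python
# Illustrative model of the post-compaction view -- the field names below
# ("type", "summary") are hypothetical, not OpenClaw's actual JSONL schema.

def build_model_context(transcript):
    """Return what a future turn sees: the latest compaction summary,
    then every message recorded after the compaction point."""
    last = None
    for i, entry in enumerate(transcript):
        if entry.get("type") == "compaction":
            last = i
    if last is None:
        return transcript  # never compacted: full history
    summary = {"role": "system", "content": transcript[last]["summary"]}
    return [summary] + transcript[last + 1:]

transcript = [
    {"role": "user", "content": "old question"},
    {"role": "assistant", "content": "old answer"},
    {"type": "compaction", "summary": "User asked an old question; it was answered."},
    {"role": "user", "content": "new question"},
]
context = build_model_context(transcript)  # summary + one recent message
```

Everything before the last compaction entry collapses into one summary message; everything after it passes through untouched, and the full transcript stays on disk.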
-## Configuration +## Auto-compaction -Use the `agents.defaults.compaction` setting in your `openclaw.json` to configure compaction behavior (mode, target tokens, etc.). -Compaction summarization preserves opaque identifiers by default (`identifierPolicy: "strict"`). You can override this with `identifierPolicy: "off"` or provide custom text with `identifierPolicy: "custom"` and `identifierInstructions`. +Auto-compaction is **on by default**. It triggers in two situations: -You can optionally specify a different model for compaction summarization via `agents.defaults.compaction.model`. This is useful when your primary model is a local or small model and you want compaction summaries produced by a more capable model. The override accepts any `provider/model-id` string: +1. **Threshold maintenance** -- after a successful turn, when estimated context + usage exceeds `contextWindow - reserveTokens`. +2. **Overflow recovery** -- the model returns a context-overflow error. OpenClaw + compacts and retries the request. -```json -{ - "agents": { - "defaults": { - "compaction": { - "model": "openrouter/anthropic/claude-sonnet-4-6" - } - } - } -} -``` +When auto-compaction runs you will see: -This also works with local models, for example a second Ollama model dedicated to summarization or a fine-tuned compaction specialist: +- `Auto-compaction complete` in verbose mode +- `/status` showing `Compactions: ` -```json -{ - "agents": { - "defaults": { - "compaction": { - "model": "ollama/llama3.1:8b" - } - } - } -} -``` +### Pre-compaction memory flush -When unset, compaction uses the agent's primary model. - -## Auto-compaction (default on) - -When a session nears or exceeds the model’s context window, OpenClaw triggers auto-compaction and may retry the original request using the compacted context. 
- -You’ll see: - -- `🧹 Auto-compaction complete` in verbose mode -- `/status` showing `🧹 Compactions: ` - -Before compaction, OpenClaw can run a **silent memory flush** turn to store -durable notes to disk. See [Memory](/concepts/memory) for details and config. +Before compacting, OpenClaw can run a **silent turn** that reminds the model to +write durable notes to disk. This prevents important context from being lost in +the summary. The flush is controlled by `agents.defaults.compaction.memoryFlush` +and runs once per compaction cycle. See [Memory](/concepts/memory) for details. ## Manual compaction -Use `/compact` (optionally with instructions) to force a compaction pass: +Use `/compact` in any chat to force a compaction pass. You can optionally add +instructions to guide the summary: ``` /compact Focus on decisions and open questions ``` -## Context window source +## Configuration -Context window is model-specific. OpenClaw uses the model definition from the configured provider catalog to determine limits. +### Compaction model + +By default, compaction uses the agent's primary model. You can override this +with a different model for summarization -- useful when your primary model is +small or local and you want a more capable summarizer: + +```json5 +{ + agents: { + defaults: { + compaction: { + model: "openrouter/anthropic/claude-sonnet-4-6", + }, + }, + }, +} +``` + +### Reserve tokens and floor + +- `reserveTokens` -- headroom reserved for prompts and the next model output + (Pi runtime default: `16384`). +- `reserveTokensFloor` -- minimum reserve enforced by OpenClaw (default: + `20000`). Set to `0` to disable. +- `keepRecentTokens` -- how many tokens of recent conversation to preserve + during compaction (default: `20000`). + +### Identifier preservation + +Compaction summaries preserve opaque identifiers by default +(`identifierPolicy: "strict"`). Override with: + +- `"off"` -- no special identifier handling. 
+- `"custom"` -- provide your own instructions via `identifierInstructions`. + +### Memory flush + +```json5 +{ + agents: { + defaults: { + compaction: { + memoryFlush: { + enabled: true, // default + softThresholdTokens: 4000, + systemPrompt: "Session nearing compaction. Store durable memories now.", + prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.", + }, + }, + }, + }, +} +``` + +The flush triggers when context usage crosses +`contextWindow - reserveTokensFloor - softThresholdTokens`. It runs silently +(the user sees nothing) and is skipped when the workspace is read-only. ## Compaction vs pruning -- **Compaction**: summarises and **persists** in JSONL. -- **Session pruning**: trims old **tool results** only, **in-memory**, per request. +| | Compaction | Session pruning | +| ---------------- | ------------------------------ | -------------------------------- | +| **What it does** | Summarizes older conversation | Trims old tool results | +| **Persisted?** | Yes (in JSONL transcript) | No (in-memory only, per request) | +| **Scope** | Entire conversation history | Tool result messages only | +| **Frequency** | Once when threshold is reached | Every LLM call (when enabled) | -See [/concepts/session-pruning](/concepts/session-pruning) for pruning details. +See [Session Pruning](/concepts/session-pruning) for pruning details. ## OpenAI server-side compaction -OpenClaw also supports OpenAI Responses server-side compaction hints for -compatible direct OpenAI models. This is separate from local OpenClaw -compaction and can run alongside it. +OpenClaw also supports OpenAI Responses server-side compaction for compatible +direct OpenAI models. This is separate from local compaction and can run +alongside it: -- Local compaction: OpenClaw summarizes and persists into session JSONL. 
-- Server-side compaction: OpenAI compacts context on the provider side when +- **Local compaction** -- OpenClaw summarizes and persists into session JSONL. +- **Server-side compaction** -- OpenAI compacts context on the provider side when `store` + `context_management` are enabled. See [OpenAI provider](/providers/openai) for model params and overrides. @@ -100,24 +143,40 @@ See [OpenAI provider](/providers/openai) for model params and overrides. ## Custom context engines Compaction behavior is owned by the active -[context engine](/concepts/context-engine). The legacy engine uses the built-in +[context engine](/concepts/context-engine). The built-in engine uses the summarization described above. Plugin engines (selected via -`plugins.slots.contextEngine`) can implement any compaction strategy — DAG -summaries, vector retrieval, incremental condensation, etc. +`plugins.slots.contextEngine`) can implement any strategy -- DAG summaries, +vector retrieval, incremental condensation, etc. When a plugin engine sets `ownsCompaction: true`, OpenClaw delegates all compaction decisions to the engine and does not run built-in auto-compaction. -When `ownsCompaction` is `false` or unset, OpenClaw may still use Pi's -built-in in-attempt auto-compaction, but the active engine's `compact()` method -still handles `/compact` and overflow recovery. There is no automatic fallback -to the legacy engine's compaction path. - -If you are building a non-owning context engine, implement `compact()` by +When `ownsCompaction` is `false` or unset, the built-in auto-compaction still +runs, but the engine's `compact()` method handles `/compact` and overflow +recovery. If you are building a non-owning engine, implement `compact()` by calling `delegateCompactionToRuntime(...)` from `openclaw/plugin-sdk/core`. -## Tips +## Troubleshooting -- Use `/compact` when sessions feel stale or context is bloated. -- Large tool outputs are already truncated; pruning can further reduce tool-result buildup. 
-- If you need a fresh slate, `/new` or `/reset` starts a new session id. +**Compaction triggers too often?** + +- Check the model's context window -- small models compact more frequently. +- High `reserveTokens` relative to the context window can trigger early + compaction. +- Large tool outputs accumulate fast. Enable + [session pruning](/concepts/session-pruning) to reduce tool-result buildup. + +**Context feels stale after compaction?** + +- Use `/compact Focus on ` to guide the summary. +- Increase `keepRecentTokens` to preserve more recent conversation. +- Enable the [memory flush](/concepts/memory) so durable notes survive + compaction. + +**Need a fresh start?** + +- `/new` or `/reset` starts a new session ID without compacting. + +For the full internal lifecycle (store schema, transcript structure, Pi runtime +semantics), see +[Session Management Deep Dive](/reference/session-management-compaction). diff --git a/docs/concepts/memory-search.md b/docs/concepts/memory-search.md new file mode 100644 index 00000000000..7b1fb5a72e4 --- /dev/null +++ b/docs/concepts/memory-search.md @@ -0,0 +1,248 @@ +--- +title: "Memory Search" +summary: "How OpenClaw memory search works -- embedding providers, hybrid search, MMR, and temporal decay" +read_when: + - You want to understand how memory_search retrieves results + - You want to tune hybrid search, MMR, or temporal decay + - You want to choose an embedding provider +--- + +# Memory Search + +OpenClaw indexes workspace memory files (`MEMORY.md` and `memory/*.md`) into +chunks (~400 tokens, 80-token overlap) and searches them with `memory_search`. +This page explains how the search pipeline works and how to tune it. For the +file layout and memory basics, see [Memory](/concepts/memory). + +## Search pipeline + +``` +Query -> Embedding -> Vector Search ─┐ + ├─> Weighted Merge -> Temporal Decay -> MMR -> Top-K +Query -> Tokenize -> BM25 Search ──┘ +``` + +Both retrieval paths run in parallel when hybrid search is enabled. 
If either +path is unavailable (no embeddings or no FTS5), the other runs alone. + +## Embedding providers + +The default `memory-core` plugin ships built-in adapters for these providers: + +| Provider | Adapter ID | Auto-selected | Notes | +| ---------- | ---------- | -------------------- | ----------------------------------- | +| Local GGUF | `local` | Yes (first priority) | node-llama-cpp, ~0.6 GB model | +| OpenAI | `openai` | Yes | `text-embedding-3-small` default | +| Gemini | `gemini` | Yes | Supports multimodal (images, audio) | +| Voyage | `voyage` | Yes | | +| Mistral | `mistral` | Yes | | +| Ollama | `ollama` | No (explicit only) | Local/self-hosted | + +Auto-selection picks the first provider whose API key can be resolved. Set +`memorySearch.provider` explicitly to override. + +Remote embeddings require an API key for the embedding provider. OpenClaw +resolves keys from auth profiles, `models.providers.*.apiKey`, or environment +variables. Codex OAuth covers chat/completions only and does not satisfy +embedding requests. + +### Quick start + +Enable memory search with OpenAI embeddings: + +```json5 +{ + agents: { + defaults: { + memorySearch: { + provider: "openai", + model: "text-embedding-3-small", + }, + }, + }, +} +``` + +Or use local embeddings (no API key needed): + +```json5 +{ + agents: { + defaults: { + memorySearch: { + provider: "local", + }, + }, + }, +} +``` + +Local mode uses node-llama-cpp and may require `pnpm approve-builds` to build +the native addon. + +## Hybrid search (BM25 + vector) + +When both FTS5 and embeddings are available, OpenClaw combines two retrieval +signals: + +- **Vector similarity** -- semantic matching. Good at paraphrases ("Mac Studio + gateway host" vs "the machine running the gateway"). +- **BM25 keyword relevance** -- exact token matching. Good at IDs, code symbols, + error strings, and config keys. + +### How scores are merged + +1. 
Retrieve a candidate pool from each side (top + `maxResults x candidateMultiplier`). 2. Convert BM25 rank to a 0-1 score: `textScore = 1 / (1 + max(0, bm25Rank))`. 3. Union candidates by chunk ID and compute: `finalScore = vectorWeight x vectorScore + textWeight x textScore`. + +Weights are normalized to 1.0, so they behave as percentages. If either path is +unavailable, the other runs alone with no hard failure. + +### CJK support + +FTS5 uses configurable trigram tokenization with a short-substring fallback so +Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted +correctly during chunk-size estimation, and surrogate-pair characters are +preserved during fine splits. + +## Post-processing + +After merging scores, two optional stages refine the result list: + +### Temporal decay (recency boost) + +Daily notes accumulate over months. Without decay, a well-worded note from six +months ago can outrank yesterday's update on the same topic. + +Temporal decay applies an exponential multiplier based on age: + +``` +decayedScore = score x e^(-lambda x ageInDays) +``` + +With the default half-life of 30 days: + +| Age | Score retained | +| -------- | -------------- | +| Today | 100% | +| 7 days | ~85% | +| 30 days | 50% | +| 90 days | 12.5% | +| 180 days | ~1.6% | + +**Evergreen files are never decayed** -- `MEMORY.md` and non-dated files in +`memory/` (like `memory/projects.md`) always rank at full score. Dated daily +files use the date from the filename. + +**When to enable:** Your agent has months of daily notes and stale information +outranks recent context. + +### MMR re-ranking (diversity) + +When search returns results, multiple chunks may contain similar or overlapping +content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance +with diversity. + +How it works: + +1. Start with the highest-scoring result. +2. 
Iteratively select the next result that maximizes: + `lambda x relevance - (1 - lambda) x max_similarity_to_already_selected`. +3. Similarity is measured using Jaccard text similarity on tokenized content. + +The `lambda` parameter controls the trade-off: + +- `1.0` -- pure relevance (no diversity penalty). +- `0.0` -- maximum diversity (ignores relevance). +- Default: `0.7` (balanced, slight relevance bias). + +**When to enable:** `memory_search` returns redundant or near-duplicate +snippets, especially with daily notes that repeat similar information. + +## Configuration + +Both post-processing features and hybrid search weights are configured under +`memorySearch.query.hybrid`: + +```json5 +{ + agents: { + defaults: { + memorySearch: { + query: { + hybrid: { + enabled: true, + vectorWeight: 0.7, + textWeight: 0.3, + candidateMultiplier: 4, + mmr: { + enabled: true, // default: false + lambda: 0.7, + }, + temporalDecay: { + enabled: true, // default: false + halfLifeDays: 30, + }, + }, + }, + }, + }, + }, +} +``` + +You can enable either feature independently: + +- **MMR only** -- many similar notes but age does not matter. +- **Temporal decay only** -- recency matters but results are already diverse. +- **Both** -- recommended for agents with large, long-running daily note + histories. + +## Session memory search (experimental) + +You can optionally index session transcripts and surface them via +`memory_search`. This is gated behind an experimental flag: + +```json5 +{ + agents: { + defaults: { + memorySearch: { + experimental: { sessionMemory: true }, + sources: ["memory", "sessions"], + }, + }, + }, +} +``` + +Session indexing is opt-in and runs asynchronously. Results can be slightly stale +until background sync finishes. Session logs live on disk, so treat filesystem +access as the trust boundary. + +## Troubleshooting + +**`memory_search` returns nothing?** + +- Check `openclaw memory status` -- is the index populated? 
+- Verify an embedding provider is configured and has a valid key. +- Run `openclaw memory index --force` to trigger a full reindex. + +**Results are all keyword matches, no semantic results?** + +- Embeddings may not be configured. Check `openclaw memory status --deep`. +- If using `local`, ensure node-llama-cpp built successfully. + +**CJK text not found?** + +- FTS5 trigram tokenization handles CJK. If results are missing, run + `openclaw memory index --force` to rebuild the FTS index. + +## Further reading + +- [Memory](/concepts/memory) -- file layout, backends, tools +- [Memory configuration reference](/reference/memory-config) -- all config knobs + including QMD, batch indexing, embedding cache, sqlite-vec, and multimodal diff --git a/docs/concepts/memory.md b/docs/concepts/memory.md index 837343b1d1d..e555d86a30f 100644 --- a/docs/concepts/memory.md +++ b/docs/concepts/memory.md @@ -1,78 +1,187 @@ --- title: "Memory" -summary: "How OpenClaw memory works (workspace files + automatic memory flush)" +summary: "How OpenClaw memory works -- file layout, backends, search, and automatic flush" read_when: - You want the memory file layout and workflow + - You want to understand memory search and backends - You want to tune the automatic pre-compaction memory flush --- # Memory OpenClaw memory is **plain Markdown in the agent workspace**. The files are the -source of truth; the model only "remembers" what gets written to disk. +source of truth -- the model only "remembers" what gets written to disk. Memory search tools are provided by the active memory plugin (default: `memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`. -## Memory files (Markdown) +## File layout -The default workspace layout uses two memory layers: +The default workspace uses two memory layers: -- `memory/YYYY-MM-DD.md` - - Daily log (append-only). - - Read today + yesterday at session start. -- `MEMORY.md` (optional) - - Curated long-term memory. 
- - If both `MEMORY.md` and `memory.md` exist at the workspace root, OpenClaw loads both (deduplicated by realpath so symlinks pointing to the same file are not injected twice). - - **Only load in the main, private session** (never in group contexts). +| Path | Purpose | Loaded at session start | +| ---------------------- | ------------------------ | -------------------------- | +| `memory/YYYY-MM-DD.md` | Daily log (append-only) | Today + yesterday | +| `MEMORY.md` | Curated long-term memory | Yes (main DM session only) | -These files live under the workspace (`agents.defaults.workspace`, default -`~/.openclaw/workspace`). See [Agent workspace](/concepts/agent-workspace) for the full layout. +If both `MEMORY.md` and `memory.md` exist at the workspace root, OpenClaw loads +both (deduplicated by realpath so symlinks are not injected twice). `MEMORY.md` +is only loaded in the main, private session -- never in group contexts. -## Memory tools - -OpenClaw exposes two agent-facing tools for these Markdown files: - -- `memory_search` -- semantic recall over indexed snippets. -- `memory_get` -- targeted read of a specific Markdown file/line range. - -`memory_get` now **degrades gracefully when a file doesn't exist** (for example, -today's daily log before the first write). Both the builtin manager and the QMD -backend return `{ text: "", path }` instead of throwing `ENOENT`, so agents can -handle "nothing recorded yet" and continue their workflow without wrapping the -tool call in try/catch logic. +These files live under the agent workspace (`agents.defaults.workspace`, default +`~/.openclaw/workspace`). See [Agent workspace](/concepts/agent-workspace) for +the full layout. ## When to write memory -- Decisions, preferences, and durable facts go to `MEMORY.md`. -- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`. -- If someone says "remember this," write it down (do not keep it in RAM). -- This area is still evolving. 
It helps to remind the model to store memories; it will know what to do. +- **Decisions, preferences, and durable facts** go to `MEMORY.md`. +- **Day-to-day notes and running context** go to `memory/YYYY-MM-DD.md`. +- If someone says "remember this," **write it down** (do not keep it in RAM). - If you want something to stick, **ask the bot to write it** into memory. -## Automatic memory flush (pre-compaction ping) +## Memory tools -When a session is **close to auto-compaction**, OpenClaw triggers a **silent, -agentic turn** that reminds the model to write durable memory **before** the -context is compacted. The default prompts explicitly say the model _may reply_, -but usually `NO_REPLY` is the correct response so the user never sees this turn. -The active memory plugin owns the prompt/path policy for that flush; the -default `memory-core` plugin writes to the canonical daily file under -`memory/YYYY-MM-DD.md`. +OpenClaw exposes two agent-facing tools: -This is controlled by `agents.defaults.compaction.memoryFlush`: +- **`memory_search`** -- semantic recall over indexed snippets. Uses the active + memory backend's search pipeline (vector similarity, keyword matching, or + hybrid). +- **`memory_get`** -- targeted read of a specific Markdown file or line range. + Degrades gracefully when a file does not exist (returns empty text instead of + an error). + +## Memory backends + +OpenClaw supports two memory backends that control how `memory_search` indexes +and retrieves content: + +### Builtin (default) + +The builtin backend uses a per-agent SQLite database with optional extensions: + +- **FTS5 full-text search** for keyword matching (BM25 scoring). +- **sqlite-vec** for in-database vector similarity (falls back to in-process + cosine similarity when unavailable). +- **Hybrid search** combining BM25 + vector scores for best-of-both-worlds + retrieval. +- **CJK support** via configurable trigram tokenization with short-substring + fallback. 
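The weighted merge behind hybrid search can be sketched as follows. This is an illustrative reimplementation, not the actual OpenClaw code: BM25 rank is converted to a 0-1 score via `1 / (1 + rank)`, then blended with vector similarity using weights normalized to sum to 1:

```python
# Illustrative sketch of hybrid score merging -- not OpenClaw's actual code.
# Assumed inputs: vector_hits maps chunk ID -> cosine similarity (0-1),
# bm25_ranks maps chunk ID -> BM25 rank (0 = best keyword match).

def merge_scores(vector_hits, bm25_ranks, vector_weight=0.7, text_weight=0.3):
    total = vector_weight + text_weight
    vw, tw = vector_weight / total, text_weight / total  # normalize weights to 1.0
    merged = {}
    for chunk_id in set(vector_hits) | set(bm25_ranks):  # union candidates by chunk ID
        vec = vector_hits.get(chunk_id, 0.0)
        rank = bm25_ranks.get(chunk_id)
        text = 1.0 / (1.0 + max(0, rank)) if rank is not None else 0.0
        merged[chunk_id] = vw * vec + tw * text
    return merged

# "a" matched only semantically, "c" only by keyword, "b" by both.
scores = merge_scores({"a": 0.9, "b": 0.4}, {"b": 0, "c": 1})
```

Chunk `b` scores on both paths and nearly overtakes the purely semantic match `a`, while the keyword-only match `c` trails -- which is the point of blending the two signals.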
+ +The builtin backend works out of the box with no extra dependencies. For +embedding vectors, configure an embedding provider (OpenAI, Gemini, Voyage, +Mistral, Ollama, or local GGUF). Without an embedding provider, only keyword +search is available. + +Index location: `~/.openclaw/memory/.sqlite` + +### QMD (experimental) + +[QMD](https://github.com/tobi/qmd) is a local-first search sidecar that +combines BM25 + vectors + reranking in a single binary. Set +`memory.backend = "qmd"` to opt in. + +Key differences from the builtin backend: + +- Runs as a subprocess (Bun + node-llama-cpp), auto-downloads GGUF models. +- Supports advanced post-processing: reranking, query expansion. +- Can index extra directories beyond the workspace (`memory.qmd.paths`). +- Can optionally index session transcripts (`memory.qmd.sessions`). +- Falls back to the builtin backend if QMD is unavailable. + +QMD requires a separate install (`bun install -g https://github.com/tobi/qmd`) +and a SQLite build that allows extensions. See the +[Memory configuration reference](/reference/memory-config) for full setup. + +## Memory search + +When an embedding provider is configured, `memory_search` uses semantic vector +search to find relevant notes even when the wording differs from the query. +Hybrid search (BM25 + vector) is enabled by default when both FTS5 and +embeddings are available. + +For details on how search works -- embedding providers, hybrid scoring, MMR +diversity re-ranking, temporal decay, and tuning -- see +[Memory Search](/concepts/memory-search). + +### Embedding provider auto-selection + +If `memorySearch.provider` is not set, OpenClaw auto-selects the first available +provider in this order: + +1. `local` -- if `memorySearch.local.modelPath` is configured and exists. +2. `openai` -- if an OpenAI key can be resolved. +3. `gemini` -- if a Gemini key can be resolved. +4. `voyage` -- if a Voyage key can be resolved. +5. `mistral` -- if a Mistral key can be resolved. 
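The auto-selection above amounts to a first-match scan. A minimal sketch -- the function and its inputs are hypothetical, and real key resolution goes through auth profiles, `models.providers.*.apiKey`, and environment variables:

```python
# Sketch of embedding-provider auto-selection -- the function and its inputs
# are hypothetical. "available" stands in for real key resolution (auth
# profiles, models.providers.*.apiKey, environment variables).

AUTO_SELECT_ORDER = ["local", "openai", "gemini", "voyage", "mistral"]

def auto_select_provider(configured, available):
    if configured:
        return configured              # explicit memorySearch.provider wins
    for provider in AUTO_SELECT_ORDER:
        if provider in available:      # first resolvable provider is picked
            return provider
    return None                        # nothing resolvable: search stays disabled

picked = auto_select_provider(None, {"gemini", "mistral"})  # -> "gemini"
```

Setting `memorySearch.provider` bypasses the scan entirely, which is also the only way to select `ollama`, since it never appears in the auto-selection order.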
+ +If none can be resolved, memory search stays disabled until configured. Ollama +is supported but not auto-selected (set `memorySearch.provider = "ollama"` +explicitly). + +## Additional memory paths + +Index Markdown files outside the default workspace layout: + +```json5 +{ + agents: { + defaults: { + memorySearch: { + extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"], + }, + }, + }, +} +``` + +Paths can be absolute or workspace-relative. Directories are scanned +recursively for `.md` files. Symlinks are ignored. + +## Multimodal memory (Gemini) + +When using `gemini-embedding-2-preview`, OpenClaw can index image and audio +files from `memorySearch.extraPaths`: + +```json5 +{ + agents: { + defaults: { + memorySearch: { + provider: "gemini", + model: "gemini-embedding-2-preview", + extraPaths: ["assets/reference", "voice-notes"], + multimodal: { + enabled: true, + modalities: ["image", "audio"], + }, + }, + }, + }, +} +``` + +Search queries remain text, but Gemini can compare them against indexed +image/audio embeddings. `memory_get` still reads Markdown only. + +See the [Memory configuration reference](/reference/memory-config) for supported +formats and limitations. + +## Automatic memory flush + +When a session is close to auto-compaction, OpenClaw runs a **silent turn** that +reminds the model to write durable notes before the context is summarized. This +prevents important information from being lost during compaction. + +Controlled by `agents.defaults.compaction.memoryFlush`: ```json5 { agents: { defaults: { compaction: { - reserveTokensFloor: 20000, memoryFlush: { - enabled: true, - softThresholdTokens: 4000, - systemPrompt: "Session nearing compaction. 
Store durable memories now.", - prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.", + enabled: true, // default + softThresholdTokens: 4000, // how far below compaction threshold to trigger }, }, }, @@ -82,30 +191,38 @@ This is controlled by `agents.defaults.compaction.memoryFlush`: Details: -- **Soft threshold**: flush triggers when the session token estimate crosses +- **Triggers** when context usage crosses `contextWindow - reserveTokensFloor - softThresholdTokens`. -- **Silent** by default: prompts include `NO_REPLY` so nothing is delivered. -- **Two prompts**: a user prompt plus a system prompt append the reminder. -- **One flush per compaction cycle** (tracked in `sessions.json`). -- **Workspace must be writable**: if the session runs sandboxed with - `workspaceAccess: "ro"` or `"none"`, the flush is skipped. +- **Runs silently** -- prompts include `NO_REPLY` so nothing is delivered to the + user. +- **Once per compaction cycle** (tracked in `sessions.json`). +- **Skipped** when the workspace is read-only (`workspaceAccess: "ro"` or + `"none"`). +- The active memory plugin owns the flush prompt and path policy. The default + `memory-core` plugin writes to `memory/YYYY-MM-DD.md`. -For the full compaction lifecycle, see -[Session management + compaction](/reference/session-management-compaction). +For the full compaction lifecycle, see [Compaction](/concepts/compaction). -## Vector memory search +## CLI commands -OpenClaw can build a small vector index over `MEMORY.md` and `memory/*.md` so -semantic queries can find related notes even when wording differs. Hybrid search -(BM25 + vector) is available for combining semantic matching with exact keyword -lookups. 
+| Command | Description | +| -------------------------------- | ------------------------------------------ | +| `openclaw memory status` | Show memory index status and provider info | +| `openclaw memory search ` | Search memory from the command line | +| `openclaw memory index` | Force a reindex of memory files | -Memory search adapter ids come from the active memory plugin. The default -`memory-core` plugin ships built-ins for OpenAI, Gemini, Voyage, Mistral, -Ollama, and local GGUF models, plus an optional QMD sidecar backend for -advanced retrieval and post-processing features like MMR diversity re-ranking -and temporal decay. +Add `--agent ` to target a specific agent, `--deep` for extended +diagnostics, or `--json` for machine-readable output. -For the full configuration reference -- including embedding provider setup, QMD -backend, hybrid search tuning, multimodal memory, and all config knobs -- see -[Memory configuration reference](/reference/memory-config). +See [CLI: memory](/cli/memory) for the full command reference. + +## Further reading + +- [Memory Search](/concepts/memory-search) -- how search works, hybrid search, + MMR, temporal decay +- [Memory configuration reference](/reference/memory-config) -- all config knobs + for providers, QMD, hybrid search, batch indexing, and multimodal +- [Compaction](/concepts/compaction) -- how compaction interacts with memory + flush +- [Session Management Deep Dive](/reference/session-management-compaction) -- + internal session and compaction lifecycle diff --git a/docs/docs.json b/docs/docs.json index e0cb9ce84b7..472a140cc52 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -1033,6 +1033,7 @@ "concepts/session-pruning", "concepts/session-tool", "concepts/memory", + "concepts/memory-search", "concepts/compaction" ] },