docs: rewrite sessions/memory section -- compaction, memory, and new memory-search page

Vincent Koc 2026-03-30 07:09:40 +09:00
parent 6d9a7224aa
commit 3584a893e8
4 changed files with 561 additions and 136 deletions


@ -1,98 +1,141 @@
---
summary: "How OpenClaw compacts long sessions to stay within model context limits"
read_when:
- You want to understand auto-compaction and /compact
- You are debugging long sessions hitting context limits
- You want to tune compaction behavior or use a custom context engine
title: "Compaction"
---
# Compaction
Every model has a **context window** -- the maximum number of tokens it can see
at once. As a conversation grows, it eventually approaches that limit. OpenClaw
**compacts** older history into a summary so the session can continue without
losing important context.
## How compaction works
Compaction is a three-step process:
1. **Summarize** older conversation turns into a compact summary.
2. **Persist** the summary as a `compaction` entry in the session transcript
(JSONL).
3. **Keep** recent messages after the compaction point intact.
After compaction, future turns see the summary plus all messages after the
compaction point. The on-disk transcript retains the full history -- compaction
only changes what gets loaded into the model context.
## Auto-compaction
Auto-compaction is **on by default**. It triggers in two situations:
1. **Threshold maintenance** -- after a successful turn, when estimated context
usage exceeds `contextWindow - reserveTokens`.
2. **Overflow recovery** -- the model returns a context-overflow error. OpenClaw
compacts and retries the request.
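The threshold-maintenance check can be sketched like this (illustrative names and numbers, not OpenClaw's actual internals):

```python
def should_compact(estimated_tokens: int, context_window: int, reserve_tokens: int) -> bool:
    """Threshold maintenance: compact once usage eats into the reserved headroom."""
    return estimated_tokens > context_window - reserve_tokens

# Example: a 200k-token window with the 16384-token default reserve.
assert not should_compact(150_000, 200_000, 16_384)
assert should_compact(190_000, 200_000, 16_384)
```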
When auto-compaction runs you will see:
- `Auto-compaction complete` in verbose mode
- `/status` showing `Compactions: <count>`
### Pre-compaction memory flush
Before compacting, OpenClaw can run a **silent turn** that reminds the model to
write durable notes to disk. This prevents important context from being lost in
the summary. The flush is controlled by `agents.defaults.compaction.memoryFlush`
and runs once per compaction cycle. See [Memory](/concepts/memory) for details.
## Manual compaction
Use `/compact` in any chat to force a compaction pass. You can optionally add
instructions to guide the summary:
```
/compact Focus on decisions and open questions
```
## Configuration
### Compaction model
By default, compaction uses the agent's primary model. You can override this
with a different model for summarization -- useful when your primary model is
small or local and you want a more capable summarizer:
```json5
{
  agents: {
    defaults: {
      compaction: {
        model: "openrouter/anthropic/claude-sonnet-4-6",
      },
    },
  },
}
```
### Reserve tokens and floor
- `reserveTokens` -- headroom reserved for prompts and the next model output
(Pi runtime default: `16384`).
- `reserveTokensFloor` -- minimum reserve enforced by OpenClaw (default:
`20000`). Set to `0` to disable.
- `keepRecentTokens` -- how many tokens of recent conversation to preserve
during compaction (default: `20000`).
### Identifier preservation
Compaction summaries preserve opaque identifiers by default
(`identifierPolicy: "strict"`). Override with:
- `"off"` -- no special identifier handling.
- `"custom"` -- provide your own instructions via `identifierInstructions`.
### Memory flush
```json5
{
  agents: {
    defaults: {
      compaction: {
        memoryFlush: {
          enabled: true, // default
          softThresholdTokens: 4000,
          systemPrompt: "Session nearing compaction. Store durable memories now.",
          prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.",
        },
      },
    },
  },
}
```
The flush triggers when context usage crosses
`contextWindow - reserveTokensFloor - softThresholdTokens`. It runs silently
(the user sees nothing) and is skipped when the workspace is read-only.
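As a quick sketch of that arithmetic (the 200k window is just an example):

```python
def flush_threshold(context_window: int, reserve_tokens_floor: int = 20_000,
                    soft_threshold_tokens: int = 4_000) -> int:
    """Token count at which the pre-compaction memory flush fires."""
    return context_window - reserve_tokens_floor - soft_threshold_tokens

# A 200k-token context window with the defaults shown above.
assert flush_threshold(200_000) == 176_000
```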
## Compaction vs pruning
| | Compaction | Session pruning |
| ---------------- | ------------------------------ | -------------------------------- |
| **What it does** | Summarizes older conversation | Trims old tool results |
| **Persisted?** | Yes (in JSONL transcript) | No (in-memory only, per request) |
| **Scope** | Entire conversation history | Tool result messages only |
| **Frequency** | Once when threshold is reached | Every LLM call (when enabled) |
See [Session Pruning](/concepts/session-pruning) for pruning details.
## OpenAI server-side compaction
OpenClaw also supports OpenAI Responses server-side compaction for compatible
direct OpenAI models. This is separate from local compaction and can run
alongside it:
- **Local compaction** -- OpenClaw summarizes and persists into session JSONL.
- **Server-side compaction** -- OpenAI compacts context on the provider side when
`store` + `context_management` are enabled.
See [OpenAI provider](/providers/openai) for model params and overrides.
@ -100,24 +143,40 @@ See [OpenAI provider](/providers/openai) for model params and overrides.
## Custom context engines
Compaction behavior is owned by the active
[context engine](/concepts/context-engine). The built-in engine uses the
summarization described above. Plugin engines (selected via
`plugins.slots.contextEngine`) can implement any strategy -- DAG summaries,
vector retrieval, incremental condensation, etc.
When a plugin engine sets `ownsCompaction: true`, OpenClaw delegates all
compaction decisions to the engine and does not run built-in auto-compaction.
When `ownsCompaction` is `false` or unset, the built-in auto-compaction still
runs, but the engine's `compact()` method handles `/compact` and overflow
recovery. If you are building a non-owning engine, implement `compact()` by
calling `delegateCompactionToRuntime(...)` from `openclaw/plugin-sdk/core`.
## Troubleshooting
**Compaction triggers too often?**
- Check the model's context window -- small models compact more frequently.
- High `reserveTokens` relative to the context window can trigger early
compaction.
- Large tool outputs accumulate fast. Enable
[session pruning](/concepts/session-pruning) to reduce tool-result buildup.
**Context feels stale after compaction?**
- Use `/compact Focus on <topic>` to guide the summary.
- Increase `keepRecentTokens` to preserve more recent conversation.
- Enable the [memory flush](/concepts/memory) so durable notes survive
compaction.
**Need a fresh start?**
- `/new` or `/reset` starts a new session ID without compacting.
For the full internal lifecycle (store schema, transcript structure, Pi runtime
semantics), see
[Session Management Deep Dive](/reference/session-management-compaction).


@ -0,0 +1,248 @@
---
title: "Memory Search"
summary: "How OpenClaw memory search works -- embedding providers, hybrid search, MMR, and temporal decay"
read_when:
- You want to understand how memory_search retrieves results
- You want to tune hybrid search, MMR, or temporal decay
- You want to choose an embedding provider
---
# Memory Search
OpenClaw indexes workspace memory files (`MEMORY.md` and `memory/*.md`) into
chunks (~400 tokens, 80-token overlap) and searches them with `memory_search`.
This page explains how the search pipeline works and how to tune it. For the
file layout and memory basics, see [Memory](/concepts/memory).
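The chunking scheme above can be sketched as a sliding window. This is illustrative: token counting here is a plain list of strings, not the real tokenizer:

```python
def chunk_tokens(tokens: list[str], size: int = 400, overlap: int = 80) -> list[list[str]]:
    """Sliding-window chunker: each chunk shares `overlap` tokens with the previous one."""
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the window reached the end of the document
    return chunks

words = [f"w{i}" for i in range(1000)]
chunks = chunk_tokens(words)
assert len(chunks[0]) == 400
assert chunks[1][0] == chunks[0][320]  # 80-token overlap with the previous chunk
```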
## Search pipeline
```
Query -> Embedding -> Vector Search ─┐
                                     ├─> Weighted Merge -> Temporal Decay -> MMR -> Top-K
Query -> Tokenize -> BM25 Search ────┘
```
Both retrieval paths run in parallel when hybrid search is enabled. If either
path is unavailable (no embeddings or no FTS5), the other runs alone.
## Embedding providers
The default `memory-core` plugin ships built-in adapters for these providers:
| Provider | Adapter ID | Auto-selected | Notes |
| ---------- | ---------- | -------------------- | ----------------------------------- |
| Local GGUF | `local` | Yes (first priority) | node-llama-cpp, ~0.6 GB model |
| OpenAI | `openai` | Yes | `text-embedding-3-small` default |
| Gemini | `gemini` | Yes | Supports multimodal (images, audio) |
| Voyage | `voyage` | Yes | |
| Mistral | `mistral` | Yes | |
| Ollama | `ollama` | No (explicit only) | Local/self-hosted |
Auto-selection picks the first provider whose API key can be resolved. Set
`memorySearch.provider` explicitly to override.
Remote embeddings require an API key for the embedding provider. OpenClaw
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth covers chat/completions only and does not satisfy
embedding requests.
### Quick start
Enable memory search with OpenAI embeddings:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai",
        model: "text-embedding-3-small",
      },
    },
  },
}
```
Or use local embeddings (no API key needed):
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",
      },
    },
  },
}
```
Local mode uses node-llama-cpp and may require `pnpm approve-builds` to build
the native addon.
## Hybrid search (BM25 + vector)
When both FTS5 and embeddings are available, OpenClaw combines two retrieval
signals:
- **Vector similarity** -- semantic matching. Good at paraphrases ("Mac Studio
gateway host" vs "the machine running the gateway").
- **BM25 keyword relevance** -- exact token matching. Good at IDs, code symbols,
error strings, and config keys.
### How scores are merged
1. Retrieve a candidate pool from each side (top
`maxResults x candidateMultiplier`).
2. Convert BM25 rank to a 0-1 score: `textScore = 1 / (1 + max(0, bm25Rank))`.
3. Union candidates by chunk ID and compute:
`finalScore = vectorWeight x vectorScore + textWeight x textScore`.
Weights are normalized to 1.0, so they behave as percentages. If either path is
unavailable, the other runs alone with no hard failure.
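The three merge steps can be sketched as follows (hypothetical shapes, not OpenClaw's real data structures):

```python
def merge_scores(vector_hits: dict[str, float], bm25_ranks: dict[str, int],
                 vector_weight: float = 0.7, text_weight: float = 0.3) -> dict[str, float]:
    """Union candidates by chunk ID and blend normalized vector/text scores."""
    total = vector_weight + text_weight
    vw, tw = vector_weight / total, text_weight / total  # normalize weights to 1.0
    merged = {}
    for cid in set(vector_hits) | set(bm25_ranks):
        vector_score = vector_hits.get(cid, 0.0)
        # BM25 rank (0 = best) converted to a 0-1 score.
        text_score = 1 / (1 + max(0, bm25_ranks[cid])) if cid in bm25_ranks else 0.0
        merged[cid] = vw * vector_score + tw * text_score
    return merged

scores = merge_scores({"a": 0.9, "b": 0.4}, {"a": 0, "c": 2})
assert abs(scores["a"] - (0.7 * 0.9 + 0.3 * 1.0)) < 1e-9  # both signals
assert abs(scores["c"] - 0.3 * (1 / 3)) < 1e-9            # keyword-only hit
```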
### CJK support
FTS5 uses configurable trigram tokenization with a short-substring fallback so
Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted
correctly during chunk-size estimation, and surrogate-pair characters are
preserved during fine splits.
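A minimal trigram tokenizer illustrates why this works for CJK text without word segmentation (a sketch of the idea, not the FTS5 tokenizer itself):

```python
def trigrams(text: str) -> set[str]:
    """Character trigrams, with a short-substring fallback for inputs under 3 chars."""
    if len(text) < 3:
        return {text} if text else set()
    return {text[i:i + 3] for i in range(len(text) - 2)}

# A query for "東京都" shares the trigram "東京都" with the indexed text
# "東京都渋谷区", so keyword search matches without a word segmenter.
assert trigrams("東京都") & trigrams("東京都渋谷区")
assert trigrams("東京") == {"東京"}  # short-substring fallback
```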
## Post-processing
After merging scores, two optional stages refine the result list:
### Temporal decay (recency boost)
Daily notes accumulate over months. Without decay, a well-worded note from six
months ago can outrank yesterday's update on the same topic.
Temporal decay applies an exponential multiplier based on age:
```
decayedScore = score x e^(-lambda x ageInDays)
```
With the default half-life of 30 days:
| Age | Score retained |
| -------- | -------------- |
| Today | 100% |
| 7 days | ~84% |
| 30 days | 50% |
| 90 days | 12.5% |
| 180 days | ~1.6% |
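In code, with lambda derived from the half-life (an illustrative sketch, not the exact implementation):

```python
import math

def decayed(score: float, age_in_days: float, half_life_days: float = 30.0) -> float:
    """Exponential temporal decay: score halves every `half_life_days`."""
    lam = math.log(2) / half_life_days
    return score * math.exp(-lam * age_in_days)

assert abs(decayed(1.0, 30) - 0.5) < 1e-9    # one half-life  -> 50%
assert abs(decayed(1.0, 90) - 0.125) < 1e-9  # three half-lives -> 12.5%
assert abs(decayed(1.0, 7) - 0.85) < 0.01    # ~84-85% after a week
```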
**Evergreen files are never decayed** -- `MEMORY.md` and non-dated files in
`memory/` (like `memory/projects.md`) always rank at full score. Dated daily
files use the date from the filename.
**When to enable:** Your agent has months of daily notes and stale information
outranks recent context.
### MMR re-ranking (diversity)
When search returns results, multiple chunks may contain similar or overlapping
content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance
with diversity.
How it works:
1. Start with the highest-scoring result.
2. Iteratively select the next result that maximizes:
`lambda x relevance - (1 - lambda) x max_similarity_to_already_selected`.
3. Similarity is measured using Jaccard text similarity on tokenized content.
The `lambda` parameter controls the trade-off:
- `1.0` -- pure relevance (no diversity penalty).
- `0.0` -- maximum diversity (ignores relevance).
- Default: `0.7` (balanced, slight relevance bias).
**When to enable:** `memory_search` returns redundant or near-duplicate
snippets, especially with daily notes that repeat similar information.
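The greedy selection loop can be sketched like this (Jaccard similarity over whitespace tokens; a simplified illustration, not OpenClaw's implementation):

```python
def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def mmr(results: list[tuple[str, float]], lam: float = 0.7, k: int = 3) -> list[str]:
    """Greedy MMR: trade relevance against similarity to already-selected results."""
    tokens = {text: set(text.lower().split()) for text, _ in results}
    remaining = dict(results)  # text -> relevance score
    selected: list[str] = []
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda t: lam * remaining[t]
            - (1 - lam) * max((jaccard(tokens[t], tokens[s]) for s in selected), default=0.0),
        )
        selected.append(best)
        del remaining[best]
    return selected

docs = [("gateway runs on the mac studio", 0.9),
        ("gateway runs on the mac studio host", 0.85),
        ("voyage api key stored in auth profile", 0.6)]
picked = mmr(docs, lam=0.7, k=2)
# The near-duplicate second result is demoted in favor of a diverse one.
assert picked == ["gateway runs on the mac studio",
                  "voyage api key stored in auth profile"]
```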
## Configuration
Both post-processing features and hybrid search weights are configured under
`memorySearch.query.hybrid`:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        query: {
          hybrid: {
            enabled: true,
            vectorWeight: 0.7,
            textWeight: 0.3,
            candidateMultiplier: 4,
            mmr: {
              enabled: true, // default: false
              lambda: 0.7,
            },
            temporalDecay: {
              enabled: true, // default: false
              halfLifeDays: 30,
            },
          },
        },
      },
    },
  },
}
```
You can enable either feature independently:
- **MMR only** -- many similar notes but age does not matter.
- **Temporal decay only** -- recency matters but results are already diverse.
- **Both** -- recommended for agents with large, long-running daily note
histories.
## Session memory search (experimental)
You can optionally index session transcripts and surface them via
`memory_search`. This is gated behind an experimental flag:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"],
      },
    },
  },
}
```
Session indexing is opt-in and runs asynchronously. Results can be slightly stale
until background sync finishes. Session logs live on disk, so treat filesystem
access as the trust boundary.
## Troubleshooting
**`memory_search` returns nothing?**
- Check `openclaw memory status` -- is the index populated?
- Verify an embedding provider is configured and has a valid key.
- Run `openclaw memory index --force` to trigger a full reindex.
**Results are all keyword matches, no semantic results?**
- Embeddings may not be configured. Check `openclaw memory status --deep`.
- If using `local`, ensure node-llama-cpp built successfully.
**CJK text not found?**
- FTS5 trigram tokenization handles CJK. If results are missing, run
`openclaw memory index --force` to rebuild the FTS index.
## Further reading
- [Memory](/concepts/memory) -- file layout, backends, tools
- [Memory configuration reference](/reference/memory-config) -- all config knobs
including QMD, batch indexing, embedding cache, sqlite-vec, and multimodal


@ -1,78 +1,187 @@
---
title: "Memory"
summary: "How OpenClaw memory works -- file layout, backends, search, and automatic flush"
read_when:
- You want the memory file layout and workflow
- You want to understand memory search and backends
- You want to tune the automatic pre-compaction memory flush
---
# Memory
OpenClaw memory is **plain Markdown in the agent workspace**. The files are the
source of truth -- the model only "remembers" what gets written to disk.
Memory search tools are provided by the active memory plugin (default:
`memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`.
## File layout
The default workspace uses two memory layers:
| Path | Purpose | Loaded at session start |
| ---------------------- | ------------------------ | -------------------------- |
| `memory/YYYY-MM-DD.md` | Daily log (append-only) | Today + yesterday |
| `MEMORY.md` | Curated long-term memory | Yes (main DM session only) |
If both `MEMORY.md` and `memory.md` exist at the workspace root, OpenClaw loads
both (deduplicated by realpath so symlinks are not injected twice). `MEMORY.md`
is only loaded in the main, private session -- never in group contexts.
These files live under the agent workspace (`agents.defaults.workspace`, default
`~/.openclaw/workspace`). See [Agent workspace](/concepts/agent-workspace) for
the full layout.
## When to write memory
- **Decisions, preferences, and durable facts** go to `MEMORY.md`.
- **Day-to-day notes and running context** go to `memory/YYYY-MM-DD.md`.
- If someone says "remember this," **write it down** (do not keep it in RAM).
- If you want something to stick, **ask the bot to write it** into memory.
## Memory tools
OpenClaw exposes two agent-facing tools:
- **`memory_search`** -- semantic recall over indexed snippets. Uses the active
memory backend's search pipeline (vector similarity, keyword matching, or
hybrid).
- **`memory_get`** -- targeted read of a specific Markdown file or line range.
Degrades gracefully when a file does not exist (returns empty text instead of
an error).
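The graceful-degradation behavior of `memory_get` can be sketched as (hypothetical return shape, shown for illustration):

```python
from pathlib import Path

def memory_get(path: str) -> dict:
    """Return empty text instead of raising when the file doesn't exist yet."""
    p = Path(path)
    if not p.exists():
        # e.g. today's daily log before the first write
        return {"text": "", "path": path}
    return {"text": p.read_text(), "path": path}

result = memory_get("memory/2099-01-01.md")
assert result == {"text": "", "path": "memory/2099-01-01.md"}
```

Agents can therefore treat "nothing recorded yet" as a normal result rather than wrapping the call in error handling.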
## Memory backends
OpenClaw supports two memory backends that control how `memory_search` indexes
and retrieves content:
### Builtin (default)
The builtin backend uses a per-agent SQLite database with optional extensions:
- **FTS5 full-text search** for keyword matching (BM25 scoring).
- **sqlite-vec** for in-database vector similarity (falls back to in-process
cosine similarity when unavailable).
- **Hybrid search** combining BM25 + vector scores for best-of-both-worlds
retrieval.
- **CJK support** via configurable trigram tokenization with short-substring
fallback.
The builtin backend works out of the box with no extra dependencies. For
embedding vectors, configure an embedding provider (OpenAI, Gemini, Voyage,
Mistral, Ollama, or local GGUF). Without an embedding provider, only keyword
search is available.
Index location: `~/.openclaw/memory/<agentId>.sqlite`
### QMD (experimental)
[QMD](https://github.com/tobi/qmd) is a local-first search sidecar that
combines BM25 + vectors + reranking in a single binary. Set
`memory.backend = "qmd"` to opt in.
Key differences from the builtin backend:
- Runs as a subprocess (Bun + node-llama-cpp), auto-downloads GGUF models.
- Supports advanced post-processing: reranking, query expansion.
- Can index extra directories beyond the workspace (`memory.qmd.paths`).
- Can optionally index session transcripts (`memory.qmd.sessions`).
- Falls back to the builtin backend if QMD is unavailable.
QMD requires a separate install (`bun install -g https://github.com/tobi/qmd`)
and a SQLite build that allows extensions. See the
[Memory configuration reference](/reference/memory-config) for full setup.
## Memory search
When an embedding provider is configured, `memory_search` uses semantic vector
search to find relevant notes even when the wording differs from the query.
Hybrid search (BM25 + vector) is enabled by default when both FTS5 and
embeddings are available.
For details on how search works -- embedding providers, hybrid scoring, MMR
diversity re-ranking, temporal decay, and tuning -- see
[Memory Search](/concepts/memory-search).
### Embedding provider auto-selection
If `memorySearch.provider` is not set, OpenClaw auto-selects the first available
provider in this order:
1. `local` -- if `memorySearch.local.modelPath` is configured and exists.
2. `openai` -- if an OpenAI key can be resolved.
3. `gemini` -- if a Gemini key can be resolved.
4. `voyage` -- if a Voyage key can be resolved.
5. `mistral` -- if a Mistral key can be resolved.
If none can be resolved, memory search stays disabled until configured. Ollama
is supported but not auto-selected (set `memorySearch.provider = "ollama"`
explicitly).
## Additional memory paths
Index Markdown files outside the default workspace layout:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"],
      },
    },
  },
}
```
Paths can be absolute or workspace-relative. Directories are scanned
recursively for `.md` files. Symlinks are ignored.
## Multimodal memory (Gemini)
When using `gemini-embedding-2-preview`, OpenClaw can index image and audio
files from `memorySearch.extraPaths`:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "gemini",
        model: "gemini-embedding-2-preview",
        extraPaths: ["assets/reference", "voice-notes"],
        multimodal: {
          enabled: true,
          modalities: ["image", "audio"],
        },
      },
    },
  },
}
```
Search queries remain text, but Gemini can compare them against indexed
image/audio embeddings. `memory_get` still reads Markdown only.
See the [Memory configuration reference](/reference/memory-config) for supported
formats and limitations.
## Automatic memory flush
When a session is close to auto-compaction, OpenClaw runs a **silent turn** that
reminds the model to write durable notes before the context is summarized. This
prevents important information from being lost during compaction.
Controlled by `agents.defaults.compaction.memoryFlush`:
```json5
{
  agents: {
    defaults: {
      compaction: {
        memoryFlush: {
          enabled: true, // default
          softThresholdTokens: 4000, // how far below compaction threshold to trigger
        },
      },
    },
  },
}
```
@ -82,30 +191,38 @@ This is controlled by `agents.defaults.compaction.memoryFlush`:
Details:
- **Triggers** when context usage crosses
`contextWindow - reserveTokensFloor - softThresholdTokens`.
- **Runs silently** -- prompts include `NO_REPLY` so nothing is delivered to the
user.
- **Once per compaction cycle** (tracked in `sessions.json`).
- **Skipped** when the workspace is read-only (`workspaceAccess: "ro"` or
`"none"`).
- The active memory plugin owns the flush prompt and path policy. The default
`memory-core` plugin writes to `memory/YYYY-MM-DD.md`.
For the full compaction lifecycle, see [Compaction](/concepts/compaction).
## CLI commands
| Command | Description |
| -------------------------------- | ------------------------------------------ |
| `openclaw memory status` | Show memory index status and provider info |
| `openclaw memory search <query>` | Search memory from the command line |
| `openclaw memory index` | Force a reindex of memory files |
Add `--agent <id>` to target a specific agent, `--deep` for extended
diagnostics, or `--json` for machine-readable output.
See [CLI: memory](/cli/memory) for the full command reference.
## Further reading
- [Memory Search](/concepts/memory-search) -- how search works, hybrid search,
MMR, temporal decay
- [Memory configuration reference](/reference/memory-config) -- all config knobs
for providers, QMD, hybrid search, batch indexing, and multimodal
- [Compaction](/concepts/compaction) -- how compaction interacts with memory
flush
- [Session Management Deep Dive](/reference/session-management-compaction) --
internal session and compaction lifecycle


@ -1033,6 +1033,7 @@
"concepts/session-pruning",
"concepts/session-tool",
"concepts/memory",
"concepts/memory-search",
"concepts/compaction"
]
},