OpenClaw’s best automations only work if past decisions stay accessible. The OpenClaw docs are clear: memory is just Markdown that agents must read and write intentionally. This guide distills the latest official guidance on file layout, automatic memory flushes, vector indexing, and the new QMD backend so that OpenClaw memory management stops being guesswork.
TL;DR
- Memory lives in plain Markdown: `memory/YYYY-MM-DD.md` for daily logs and `MEMORY.md` for curated long-term facts. Agents should read today + yesterday on startup and only open `MEMORY.md` in private sessions.
- Before context compaction kicks in, OpenClaw triggers a silent memory flush turn (default `softThresholdTokens = 4000`, `reserveTokensFloor = 20000`) so agents persist durable notes with `NO_REPLY` noise suppression.
- Vector memory search selects an embedding provider automatically (local → OpenAI → Gemini → Voyage → Mistral) and can fall back when API keys are missing; you can keep everything local through `memorySearch.provider = "local"` or Ollama.
- Heavy-duty retention now has the QMD sidecar: OpenClaw can manage a Bun + node-llama-cpp process, index Markdown plus optional extra paths, and fall back to the built-in SQLite indexer if QMD is unavailable.
- Operational hygiene still matters: scope memory search to DM contexts, monitor flush logs via `openclaw status`, and keep workspaces writable so silent housekeeping can run.
How OpenClaw actually stores memory
OpenClaw’s memory model is intentionally simple: everything is Markdown inside the agent workspace (`~/.openclaw/workspace` by default).
- Daily context (`memory/YYYY-MM-DD.md`) is append-only and meant for raw transcripts, state, and to-dos. Read the current and previous day at the start of every session for continuity.
- Curated memory (`MEMORY.md`) is optional and should only open in private/main sessions to avoid leaking long-term facts in group contexts.
- The docs explicitly warn: if someone says “remember this,” do not trust the model’s latent state; write it down. This keeps memory consistent across restarts and compactions.
Automatic memory flush before compaction
OpenClaw watches token usage and, when a session approaches the compaction threshold, injects a silent turn so the agent can dump durable notes to disk before any truncation happens.
- Controlled by `agents.defaults.compaction.memoryFlush` (enabled by default).
- Default thresholds: `softThresholdTokens = 4000`, `reserveTokensFloor = 20000`. When `contextWindow - reserveTokensFloor - softThresholdTokens` is crossed, the flush fires once per compaction cycle.
- The flush turn contains both a system and user prompt instructing the agent to write to `memory/YYYY-MM-DD.md` and reply with `NO_REPLY` if nothing needs storing, so users never see the housekeeping.
- Flushes require the workspace to be writable; sandboxing with `workspaceAccess: "ro"` skips the event. Check `openclaw status` or `/status` if you suspect flushes aren’t running.
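The trigger condition above is simple arithmetic, and sketching it makes the defaults concrete. This is our paraphrase of the documented rule, not OpenClaw source; the function and argument names are illustrative:

```python
def flush_due(used_tokens: int,
              context_window: int,
              soft_threshold: int = 4000,    # softThresholdTokens default
              reserve_floor: int = 20000) -> bool:
    """Sketch of the documented trigger: the flush fires once usage
    crosses contextWindow - reserveTokensFloor - softThresholdTokens."""
    return used_tokens >= context_window - reserve_floor - soft_threshold

# With a 200_000-token window and default thresholds,
# the boundary sits at 176_000 tokens.
```

In other words, the flush reserves a 20k-token floor for the model and starts housekeeping 4k tokens before that floor is reached.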
Vector memory search and provider selection
OpenClaw’s memory plugin (`memory-core` by default) indexes Markdown so agents can run semantic recall.
- Provider auto-selection: the gateway tries `local` first (if a local model path exists), then `openai`, `gemini`, `voyage`, and `mistral`. If no credential resolves, memory search stays disabled until configured.
- Local-first options: set `memorySearch.provider = "local"` or `"ollama"` to avoid hosted APIs. Ollama mode hits `/api/embeddings` on your own node; a placeholder API key satisfies local policy if needed.
- Extra content: `memorySearch.extraPaths` lets you fold in Markdown (and, with Gemini embedding 2, select image/audio) outside the default `memory/**/*.md` tree.
- Scope control: memory search results only surface in sessions allowed by `memorySearch.scope`. The default is DM-only; deny group channels unless you intentionally need shared recall.
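Pulled together, a local-first setup might look like the following sketch. The key names (`provider`, `extraPaths`, `scope`) come from the docs above, but the exact nesting and the values shown for `extraPaths` and `scope` are illustrative placeholders; check your config schema before copying:

```json
{
  "memorySearch": {
    "provider": "local",
    "extraPaths": ["~/notes/runbooks"],
    "scope": "dm"
  }
}
```

With `provider` pinned to `local`, no hosted embedding key is consulted and indexing stays on-device.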
QMD sidecar for large memory archives
For teams that need richer ranking or multimodal recall, OpenClaw can hand off indexing to QMD, an open-source Bun sidecar that combines BM25, vector search, and reranking.
- Opt in with `memory.backend = "qmd"`. OpenClaw manages the sidecar under `~/.openclaw/agents/<id>/qmd/`, handles `qmd update` / `qmd embed`, and retries with the built-in SQLite indexer if QMD fails.
- Requirements: install the `qmd` CLI separately, make sure Bun and an SQLite build with extension support exist (Homebrew sqlite works), and run on macOS/Linux or WSL2.
- QMD collections can include `MEMORY.md`, `memory/**/*.md`, extra directories, and even sanitized session transcripts when `memory.qmd.sessions.enabled = true`.
- Searches run via `qmd search --json` (default) with fallback to `qmd query` when a build rejects flags. The first query may download GGUF models; OpenClaw sets `XDG_CONFIG_HOME`/`XDG_CACHE_HOME` automatically so caches stay agent-specific.
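A minimal opt-in sketch, using only the keys the docs name above (`memory.backend` and `memory.qmd.sessions.enabled`); any surrounding structure in your actual config file may differ:

```json
{
  "memory": {
    "backend": "qmd",
    "qmd": {
      "sessions": { "enabled": true }
    }
  }
}
```

Leaving `sessions.enabled` at its default keeps transcripts out of the index; set it to `true` only if you want sanitized session history searchable.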
Operator playbook
- Treat memory files like code: version them or back them up along with the workspace so you can recover from disk issues.
- Audit silent flushes: enable verbose mode or watch `openclaw status --all` so you know when compaction + flush cycles trigger.
- Keep embeddings affordable: if hosted embeddings are too expensive, pin `memorySearch.provider = "local"` or use Ollama; the docs emphasize that remote providers will bill against their respective keys.
- Limit exposure: tighten `memorySearch.scope` (deny group channels) and store long-term notes only in private sessions to prevent accidental disclosure.
- Warm QMD if enabled: run `qmd update && qmd embed` under the agent’s XDG dirs after upgrades so the first live query doesn’t block an automation.
Conclusion
The newest OpenClaw docs make memory management straightforward: keep Markdown files tidy, let the automatic flush fire before compaction, configure vector search with explicit providers, and graduate to QMD when you need richer recall. With those controls in place, long-running agents keep their context without leaking secrets or exhausting embedding budgets.
FAQ
How do I keep OpenClaw from losing context mid-project?
Ensure the workspace is writable so the pre-compaction flush can write to `memory/YYYY-MM-DD.md`, and remind agents to append important details rather than expecting latent recall. Compaction plus memory flush is the safety net, not a substitute for explicit notes.
Can I run memory search without paying for hosted embeddings?
Yes. Set `memorySearch.provider = "local"` (node-llama-cpp) or `"ollama"` to stay on-device. You can still add `extraPaths` for shared Markdown while avoiding OpenAI/Gemini billing.
When should I enable the QMD backend?
Use QMD when your memory set grows beyond simple Markdown embeddings: for example, tens of thousands of snippets, multi-GB archives, or when you need BM25 + reranking for higher precision. OpenClaw automatically falls back to the built-in indexer if QMD is unavailable, so it’s safe to experiment.






