
OpenClaw Memory Management: Keep Agents Sharp After Marathon Sessions

Written by Ivy Chen
Last updated: March 25, 2026 · Expert Verified

OpenClaw’s best automations only work if past decisions stay accessible. The OpenClaw docs are clear: memory is just Markdown that agents must read and write intentionally. This guide distills the latest official guidance on file layout, automatic memory flushes, vector indexing, and the new QMD backend so that OpenClaw memory management stops being guesswork.

TL;DR

  • Memory lives in plain Markdown: memory/YYYY-MM-DD.md for daily logs and MEMORY.md for curated long-term facts. Agents should read today + yesterday on startup and only open MEMORY.md in private sessions.
  • Before context compaction kicks in, OpenClaw triggers a silent memory-flush turn (default softThresholdTokens = 4000, reserveTokensFloor = 20000) so agents persist durable notes, replying NO_REPLY so the housekeeping never surfaces to users.
  • Vector memory search selects an embedding provider automatically (local → OpenAI → Gemini → Voyage → Mistral) and can fall back when API keys are missing; you can keep everything local through memorySearch.provider = "local" or Ollama.
  • Heavy-duty retention now has the QMD sidecar: OpenClaw can manage a Bun + node-llama-cpp process, index markdown plus optional extra paths, and fall back to the builtin SQLite indexer if QMD is unavailable.
  • Operational hygiene still matters: scope memory search to DM contexts, monitor flush logs via openclaw status, and keep workspaces writable so silent housekeeping can run.

How OpenClaw actually stores memory

OpenClaw’s memory model is intentionally simple—everything is Markdown inside the agent workspace (~/.openclaw/workspace by default).

  • Daily context (memory/YYYY-MM-DD.md) is append-only and meant for raw transcripts, state, and to-dos. Read the current and previous day at the start of every session for continuity.
  • Curated memory (MEMORY.md) is optional and should be opened only in private/main sessions to avoid leaking long-term facts into group contexts.
  • The docs explicitly warn: if someone says “remember this,” do not trust the model’s latent state—write it down. This keeps memory consistent across restarts and compactions.
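The layout described above can be pictured as a simple tree. The two memory paths come from the docs; the specific dated filenames are illustrative:

```
~/.openclaw/workspace/
├── MEMORY.md              # curated long-term facts — open in private/main sessions only
└── memory/
    ├── 2026-03-24.md      # yesterday's log — read on startup for continuity
    └── 2026-03-25.md      # today's append-only log — raw transcripts, state, to-dos
```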


Automatic memory flush before compaction

OpenClaw watches token usage and, when a session approaches the compaction threshold, injects a silent turn so the agent can dump durable notes to disk before any truncation happens.

  • Controlled by agents.defaults.compaction.memoryFlush (enabled by default).
  • Default thresholds: softThresholdTokens = 4000, reserveTokensFloor = 20000. The flush fires once per compaction cycle, when session token usage crosses contextWindow − reserveTokensFloor − softThresholdTokens.
  • The flush turn contains both a system and user prompt instructing the agent to write to memory/YYYY-MM-DD.md and reply with NO_REPLY if nothing needs storing, so users never see the housekeeping.
  • Flushes require the workspace to be writable; sandboxing with workspaceAccess: "ro" skips the event. Check openclaw status or /status if you suspect flushes aren’t running.
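Pulling the flush settings above into one place, a config sketch might look like the following. The key names (agents.defaults.compaction.memoryFlush, softThresholdTokens, reserveTokensFloor, workspaceAccess) are quoted from this article; the exact file location and nesting of the threshold keys under memoryFlush are assumptions:

```json5
// Hypothetical sketch of an OpenClaw config — nesting inferred from the dotted keys above
{
  agents: {
    defaults: {
      compaction: {
        memoryFlush: {
          enabled: true,              // on by default
          softThresholdTokens: 4000,  // flush this many tokens before the reserve floor
          reserveTokensFloor: 20000,  // tokens held back for compaction itself
        },
      },
      // workspaceAccess: "ro" would make the workspace read-only and skip flush events
    },
  },
}
```

With these defaults, a 200k-token context would trigger the flush once usage crosses 200000 − 20000 − 4000 = 176000 tokens.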

Vector memory search and provider selection

OpenClaw’s memory plugin (memory-core by default) indexes Markdown so agents can run semantic recall.

  • Provider auto-selection: the gateway tries local first (if a local model path exists), then openai, gemini, voyage, and mistral. If no credential resolves, memory search stays disabled until configured.
  • Local-first options: set memorySearch.provider = "local" or "ollama" to avoid hosted APIs. Ollama mode hits /api/embeddings on your own node; a placeholder API key satisfies local policy if needed.
  • Extra content: memorySearch.extraPaths lets you fold in Markdown (and, with Gemini embedding 2, select image/audio) outside the default memory/**/*.md tree.
  • Scope control: memory search results only surface in sessions allowed by memorySearch.scope. The default is DM-only; deny group channels unless you intentionally need shared recall.
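The memorySearch options above can be sketched together. Key names come from this article; the extraPaths value is a hypothetical example, and the exact shape of the scope setting is an assumption (the default behavior is DM-only):

```json5
// Hypothetical memorySearch sketch — structure assumed from the keys quoted above
{
  memorySearch: {
    provider: "local",                    // or "ollama", "openai", "gemini", "voyage", "mistral"
    extraPaths: ["~/notes/team"],         // hypothetical extra Markdown outside memory/**/*.md
    // scope: default is DM-only — deny group channels unless shared recall is intended
  },
}
```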

QMD sidecar for large memory archives

For teams who need richer ranking or multimodal recall, OpenClaw can hand off indexing to QMD, an open-source Bun sidecar combining BM25 + vectors + reranking.

  • Opt in with memory.backend = "qmd". OpenClaw manages the sidecar under ~/.openclaw/agents/<id>/qmd/, handles qmd update/qmd embed, and retries with the builtin SQLite indexer if QMD fails.
  • Requirements: install the qmd CLI separately, make sure Bun and an SQLite build with extension support exist (Homebrew sqlite works), and run on macOS/Linux or WSL2.
  • QMD collections can include MEMORY.md, memory/**/*.md, extra directories, and even sanitized session transcripts when memory.qmd.sessions.enabled = true.
  • Searches run via qmd search --json (default) with fallback to qmd query when a build rejects flags. First query may download GGUF models; OpenClaw sets XDG_CONFIG_HOME/XDG_CACHE_HOME automatically so caches stay agent-specific.
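Opting into QMD can likewise be sketched in config form. memory.backend and memory.qmd.sessions.enabled are quoted in this article; the surrounding nesting is an assumption:

```json5
// Hypothetical QMD opt-in sketch — keys from this article, nesting assumed
{
  memory: {
    backend: "qmd",      // hand indexing to the QMD sidecar; falls back to SQLite if unavailable
    qmd: {
      sessions: {
        enabled: true,   // also index sanitized session transcripts
      },
    },
  },
}
```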

Operator playbook

  1. Treat memory files like code: version them or back them up along with the workspace so you can recover from disk issues.
  2. Audit silent flushes: enable verbose mode or watch openclaw status --all so you know when compaction + flush cycles trigger.
  3. Keep embeddings affordable: if hosted embeddings are too expensive, pin memorySearch.provider = "local" or use Ollama; the docs emphasize that remote providers will bill against their respective keys.
  4. Limit exposure: tighten memorySearch.scope (deny group channels) and store long-term notes only in private sessions to prevent accidental disclosure.
  5. Warm QMD if enabled: run qmd update && qmd embed under the agent’s XDG dirs after upgrades so the first live query doesn’t block an automation.

Conclusion

The newest OpenClaw docs make memory management straightforward: keep Markdown files tidy, let the automatic flush fire before compaction, configure vector search with explicit providers, and graduate to QMD when you need richer recall. With those controls in place, long-running agents keep their context without leaking secrets or exhausting embedding budgets.

FAQ

How do I keep OpenClaw from losing context mid-project?

Ensure the workspace is writable so the pre-compaction flush can write to memory/YYYY-MM-DD.md, and remind agents to append important details rather than expecting latent recall. Compaction plus memory flush is the safety net, not a substitute for explicit notes.

Can I run memory search without paying for hosted embeddings?

Yes. Set memorySearch.provider = "local" (node-llama-cpp) or "ollama" to stay on-device. You can still add extraPaths for shared Markdown while avoiding OpenAI/Gemini billing.

When should I enable the QMD backend?

Use QMD when your memory set grows beyond simple Markdown embeddings—e.g., tens of thousands of snippets, multi-GB archives, or when you need BM25 + reranking for higher precision. OpenClaw automatically falls back to the builtin indexer if QMD is unavailable, so it’s safe to experiment.
