
Local-first memory · MCP-native · Apache 2.0

Your AI agent forgets every session. PMB gives it memory, on your disk.

Decisions, lessons and project facts live in one SQLite file you own, and are fed back to Claude Code, Cursor, Codex and Zed through MCP. Offline, no API keys, no cloud, recall in ~35 ms.

$ pip install pmb-ai
Works with your agent · one command
100% on your machine · No API keys · No cloud, no telemetry · Apache-2.0, open source
How it works

Memory that doesn't wait to be asked

Hooks inject the right memory before the model thinks and journal the agent's work afterward. There is no LLM call on the read path and no tool the agent has to remember to call.

1 · Any agent records
Claude Code · Cursor · Codex
writes to memory
PMB · one local memory
recalls what matters
2 · Surfaced before it answers
lesson 0.94 · file 0.88 · decision 0.81
01 · Read · 4–16 ms

Auto-recall on every prompt

Every message is classified in under a millisecond; the matching lessons, decisions and project overview are fetched for the agent before it reasons.

02 · Write · < 1 ms

Sub-millisecond async writes

The MCP tool returns instantly. SQLite first; the embed and LanceDB vector insert run on a background thread, never blocking the turn.
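
A minimal sketch of that write path, assuming nothing about PMB's internals. The class name `AsyncMemoryWriter`, the `events` table and `fake_embed` are hypothetical, and a plain dict stands in for the LanceDB insert. The durable SQLite insert runs on the caller's thread; the slow part is queued for a background thread.

```python
import queue
import sqlite3
import threading


def fake_embed(text):
    """Toy stand-in for a real embedding model (hypothetical)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class AsyncMemoryWriter:
    """SQLite first, then embed + vector insert on a worker thread."""

    def __init__(self, db_path=":memory:"):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        self._pending = queue.Queue()
        self.vectors = {}  # stand-in for a vector store such as LanceDB
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, text):
        """Persist the event and return at once; indexing is deferred."""
        cur = self._db.execute("INSERT INTO events (text) VALUES (?)", (text,))
        self._db.commit()
        self._pending.put((cur.lastrowid, text))
        return cur.lastrowid

    def _drain(self):
        # Runs on the background thread; never touches the SQLite connection.
        while True:
            event_id, text = self._pending.get()
            try:
                self.vectors[event_id] = fake_embed(text)
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued embed has been indexed (handy in tests)."""
        self._pending.join()


writer = AsyncMemoryWriter()
event_id = writer.write("use RRF over a learned weight")
writer.flush()
```

The design point is the ordering: the event is already durable before the embed starts, so a crash in the worker loses an index entry that can be rebuilt, not the memory itself.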

03 · Fuse · 94.6% recall@10

Hybrid recall, ranked

BM25 + dense vectors + entity graph + optional rerank, fused with Reciprocal Rank Fusion (RRF). One call returns the right thing, ranked.
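
For readers new to Reciprocal Rank Fusion, here is a small sketch of the general technique. It is not PMB's code: the function name, the sample IDs and the constant `k=60` (the value commonly used in the RRF literature) are assumptions. Each ranker contributes `1 / (k + rank)` per item, and items are sorted by the summed score.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of three retrievers for one query.
bm25 = ["lesson-1", "file-7", "decision-3"]
dense = ["decision-3", "lesson-1", "note-9"]
graph = ["lesson-1", "decision-3"]

fused = reciprocal_rank_fusion([bm25, dense, graph])
print(fused)  # ['lesson-1', 'decision-3', 'file-7', 'note-9']
```

RRF only looks at ranks, never raw scores, which is why it can combine BM25, vector similarity and graph hits without calibrating them against each other.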

04 · Learn · honest follow-rate

Lessons that earn their place

Every rule is scored by whether the agent actually follows it. Useful ones get starred; ignored ones are flagged dead, so you prune what doesn't help.

The Map · live entity graph

Your memory, as a graph you can explore

Every fact, decision, lesson, file and entity becomes a node, color-coded by type, sized by importance. Hover one to dim the rest, light up its neighbors, and read the full memory chunk.

The Timeline · git-graph journal

Every decision, lesson and commit, newest first

One lane per project, nodes color-coded by event type, connected by soft curves. The same journal that ships in the dashboard, written automatically as you work.

Not a mockup

This is the actual dashboard

A local web app served from your machine. The Map and Timeline above are live recreations. Here is the real thing, rendering one project's memory.

Screenshot · PMB dashboard, the Map: a force-directed galaxy of 5,229 memory entities across 149 clusters
Screenshot · PMB dashboard, the Timeline: a git-graph journal of decisions, lessons and commits, newest first

The Map · 65,005 connections across 149 clusters, color-coded by kind

Why it matters

What changes when your agent remembers

Outcomes, not features. This is what persistent memory actually does to your day.

Stop re-explaining your project

Every session starts already knowing your decisions, conventions and the bug you hit last Tuesday. No more pasting the same context into a fresh chat.

Switch tools without losing context

Claude Code, Cursor, Codex and Zed all read the same memory. Your context follows you, not your editor, so changing agents costs nothing.

Memory you can actually trust

PMB scores whether each lesson gets followed and flags the dead ones. It tells you when a memory isn't helping, so your context stays honest, not bloated.

Watch it compound · Session 1 of 5

Day one: a blank slate

The agent starts with nothing. You explain your project, your conventions, the bug you hit last Tuesday, and at the end of the session it all evaporates.

12 entities · 1 lesson
Quickstart

Seven commands, then just talk to your agent

No account, no keys, nothing leaves your machine. Inspect everything from the terminal, or open the dashboard.

zsh · ~/pmb
pmb recall
L  never lower NEGOTIATE/SKIP under 25%  ·  lesson · PMB  ·  0.94
F  verdict-policy.ts · line 142  ·  file · opened Tue 14:32  ·  0.88
D  use RRF over a learned weight  ·  decision · 4 days ago  ·  0.81
3 of 41 memories · fused in 35 ms · hybrid recall
Integrate

One command wires your agent to MCP

Everything runs over stdio: the server is a child process of your agent. No network, no port, no token.

Claude Code

  • Rules appended to your agent's config automatically
  • Point several agents at one shared workspace
  • Verify the wiring with pmb doctor
Any model · local or hosted

Bring your own model, or run it offline

PMB never calls an LLM on the read path. The optional summarize and graph-extract passes run on whatever you point them at, including a fully local Ollama. Your memory stays yours.

graph.extractor = llm:ollama · Run extraction and summaries on a local model: 100% offline, zero API keys.
Quickstart

Running in 60 seconds

Three commands, no account, no config. Then just work the way you already do.

1 · Install

One pip install. Pure Python, runs on macOS, Linux and Windows.

$ pip install pmb-ai

2 · Connect your agent

Wires PMB into your agent over MCP. Swap in cursor, codex, zed, and more.

$ pmb connect claude-code

3 · Just talk to it

Work as usual; PMB records and recalls automatically. Open the dashboard any time to explore.

$ pmb dashboard
Architecture

Files on your disk, all the way down

Every event lives in SQLite; vectors live in LanceDB next to it. Copy them anywhere with cp. No server to trust.

Your agent · MCP · stdio
MCP server · 29 tools · prepare()
Engine · hybrid router
Hybrid recall · BM25 + vector + graph · 35 ms
Async write · embed queue · sub-ms return
SQLite · LanceDB · on your disk
Benchmarks

Fast, local, and honest about it

Every number here is measured on PMB's own engine and reproducible from the repo. No cloud, no LLM in the read path, no per-query cost.

94.6% · recall@10 (LoCoMo)
88 ms · p50 recall at 2k memories
<1 ms · async write, never blocks
$0 · per recall, fully local
73.1% · answer accuracy (J-score, LoCoMo)

Retrieval quality (recall@k)

MRR 0.774 · nDCG@10 0.816
k=1: 68.4% · k=3: 85.0% · k=5: 89.4% · k=10: 94.6%

LoCoMo-10 · 997 questions · no LLM grader · cache off
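
As a reading aid for these metrics, here is a sketch of how recall@k and MRR are commonly computed. It uses the textbook definitions, not PMB's benchmark harness; the function names and toy data are hypothetical, and the repo's exact recall@k variant may differ.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Share of the relevant items that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Toy evaluation set: (ranked results, relevant items) per question.
questions = [
    (["a", "b", "c"], ["b"]),
    (["d", "e", "f"], ["d"]),
]
recall_1 = mean(recall_at_k(r, rel, 1) for r, rel in questions)
recall_3 = mean(recall_at_k(r, rel, 3) for r, rel in questions)
mrr = mean(reciprocal_rank(r, rel) for r, rel in questions)
print(recall_1, recall_3, mrr)  # 0.5 1.0 0.75
```

Neither metric needs an LLM grader: both compare returned IDs against a labeled answer set.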

Recall latency vs memory size (p50 / p95)

Chart · p50 and p95 recall latency (ms) at 10, 100, 500, 1k and 2k memories

Warm daemon, cache off, local CPU. Real ~100-memory workspace: p50 24 ms. Cached: ~0.15 ms.

Radically honest

It tells you when a memory isn't helping

Every lesson carries a surface_id. PMB tracks whether the agent actually followed it, confirmed or auto-detected from activity. Rules that get ignored are flagged dead. The ones that earn their place are starred. No vanity metrics.

useful · never lower NEGOTIATE/SKIP under 25%
useful · pmb warmup before first recall
unverified · prefer tree-sitter for TS indexing
dead · always rerun the full suite on edit
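
One way to picture follow-rate scoring is the sketch below. It only borrows the three status names shown on this page; the `LessonStats` class, the sample-size floor and both thresholds are hypothetical, not PMB's actual rules.

```python
from dataclasses import dataclass


@dataclass
class LessonStats:
    """Counts how often a lesson was surfaced and how often it was followed."""
    surfaced: int = 0
    followed: int = 0

    def record(self, was_followed):
        self.surfaced += 1
        if was_followed:
            self.followed += 1

    @property
    def follow_rate(self):
        # None means "no evidence yet", which is different from a 0% rate.
        return self.followed / self.surfaced if self.surfaced else None


def classify(stats, min_samples=5, useful_at=0.7, dead_at=0.2):
    """Map a lesson's follow-rate to useful / unverified / dead."""
    rate = stats.follow_rate
    if rate is None or stats.surfaced < min_samples:
        return "unverified"  # too little evidence to judge either way
    if rate >= useful_at:
        return "useful"
    if rate <= dead_at:
        return "dead"
    return "unverified"


lesson = LessonStats()
for outcome in [True, True, True, False, True, True]:
    lesson.record(outcome)
print(classify(lesson))  # useful
```

Keeping a minimum sample size before judging avoids flagging a rule dead after a single ignored suggestion.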
Under the hood

Built on boring, durable pieces

No exotic infrastructure. Local files and well-worn libraries, the kind you can still open in five years.

Memory hygiene

It tends itself

A year in, recall is still sharp. Memory decays, archives, and dedupes on its own, and never deletes anything behind your back.
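
To make decay, archiving and dedupe concrete, here is a toy sketch. Everything in it is an assumption for illustration: the half-life, the archive threshold, the dict fields and the text-normalization dedupe key are hypothetical and say nothing about how PMB implements hygiene. The one property it mirrors from the text is that memories are flagged as archived, never deleted.

```python
import math


def decayed_score(importance, age_days, half_life_days=30.0):
    """Exponential decay: the score halves every half_life_days."""
    return importance * math.pow(0.5, age_days / half_life_days)


def tend(memories, now_day, archive_below=0.1):
    """Flag stale or duplicate memories as archived; nothing is removed."""
    seen = set()
    for memory in memories:
        # Case- and whitespace-insensitive key for spotting exact duplicates.
        key = " ".join(memory["text"].lower().split())
        age = now_day - memory["last_used_day"]
        score = decayed_score(memory["importance"], age)
        if key in seen or score < archive_below:
            memory["archived"] = True
        else:
            memory["archived"] = False
            seen.add(key)
    return memories


memories = [
    {"text": "Use RRF", "importance": 1.0, "last_used_day": 100},
    {"text": "use  rrf", "importance": 1.0, "last_used_day": 100},
    {"text": "old note", "importance": 0.5, "last_used_day": 0},
]
tend(memories, now_day=100)
print([m["archived"] for m in memories])  # [False, True, True]
```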

Write · Active · Read · Decay · Compact · Archived
You · Daemon

Memory flows left to right and tends itself.

one SQLite file
FAQ

Straight answers

Does my code or data ever leave my machine?

No. Everything lives in a local SQLite file with vectors in LanceDB right next to it. There are no network calls on the read path, no account and no telemetry, ever. Unplug the internet and it still works.

How is this different from RAG or a vector database?

Two ways. Recall is hybrid: BM25 plus dense vectors plus an entity graph, fused and ranked. And it's automatic: the right memory is injected before the model thinks. You don't build a pipeline or hope the agent remembers to call a tool.

Will it slow my agent down?

No. Recall lands in about 35 ms and writes return in under a millisecond. The embedding and vector insert happen on a background thread, so the turn is never blocked.

Which agents and operating systems are supported?

Any MCP-aware agent: Claude Code, Cursor, Codex, Zed, Windsurf and more, wired in with one command. PMB is pure Python and tested on macOS, Linux and Windows.

What if a memory is wrong or unhelpful?

PMB scores whether each lesson actually gets followed and flags the dead ones so you can prune them. It's the rare tool that tells you when its own memory isn't earning its place.

Is it really free?

Yes. Apache-2.0, open source, free forever. No paid tier, no seats, no telemetry. You own the file and the code.

Apache 2.0 · 100% offline

Give your agent a memory it keeps.

$ pip install pmb-ai
Free & open source forever