Local-first memory · MCP-native · Apache 2.0

Your AI agent forgets every session. PMB gives it memory, on your disk.

Decisions, lessons and project facts live in one SQLite file you own, and are fed back to Claude Code, Cursor, Codex and Zed through MCP. Offline, no API keys, no cloud, recall in ~35 ms.

$ pip install pmb-ai

Works with your agent · one command

100% on your machine · No API keys · No cloud, no telemetry · Apache-2.0, open source

How it works

Memory that doesn't wait to be asked

Hooks inject the right memory before the model thinks and journal the agent's work afterward. There is no LLM call on the read path and no tool the agent has to remember to call.

1 · Any agent records

Claude Code

Cursor

Codex

writes to memory

PMB · one local memory

recalls what matters

2 · Surfaced before it answers

lesson · 0.94

file · 0.88

decision · 0.81

01 · Read · 4–16 ms

Auto-recall on every prompt

Every message is classified in under a millisecond; the matching lessons, decisions and project overview are then fetched for the agent before it reasons.

02 · Write · < 1 ms

Sub-millisecond async writes

The MCP tool returns instantly. The event is written to SQLite first; the embedding and the LanceDB vector insert run on a background thread, so they never block the turn.
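The write path described here can be sketched in a few lines of Python. This is an illustration of the pattern only, not PMB's code: `MemoryWriter`, `fake_embed` and the `events` table are hypothetical names, and a plain dict stands in for the LanceDB table.

```python
import queue
import sqlite3
import threading


def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class MemoryWriter:
    """Durable row first, slow vector work later (assumed design, for illustration)."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        self.vectors: dict[int, list[float]] = {}  # stand-in for a vector store
        self._jobs: queue.Queue = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def write(self, text: str) -> int:
        """Commit the event to SQLite, queue the embedding, return immediately."""
        cur = self.db.execute("INSERT INTO events (text) VALUES (?)", (text,))
        self.db.commit()
        self._jobs.put((cur.lastrowid, text))
        return cur.lastrowid

    def _worker(self) -> None:
        """Background thread: embed queued events and store their vectors."""
        while True:
            event_id, text = self._jobs.get()
            try:
                self.vectors[event_id] = fake_embed(text)
            finally:
                self._jobs.task_done()

    def flush(self) -> None:
        """Block until every queued embedding has been stored."""
        self._jobs.join()
```

The caller only ever waits for the SQLite commit. `flush()` exists for tests and shutdown; a normal turn never calls it.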

03 · Fuse · 94.5% recall@10

Hybrid recall, ranked

BM25 + dense vectors + entity graph + optional rerank, fused with Reciprocal Rank Fusion (RRF). One call returns the most relevant memories, ranked.
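For readers new to Reciprocal Rank Fusion, here is a minimal sketch of the standard formula: each retriever contributes 1 / (k + rank) for every item it returns, and the sums are sorted. The function name `rrf_fuse`, the sample IDs and the constant k = 60 (a common default in the literature) are assumptions for illustration, not PMB's actual settings.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from three retrievers.
bm25 = ["lesson-7", "file-3", "decision-2"]
dense = ["file-3", "lesson-7", "decision-9"]
graph = ["lesson-7", "decision-2"]

fused = rrf_fuse([bm25, dense, graph])
print([doc_id for doc_id, _ in fused])
# ['lesson-7', 'file-3', 'decision-2', 'decision-9']
```

RRF only looks at ranks, never raw scores, which is why BM25 scores and cosine similarities can be combined without normalizing them against each other.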

04 · Learn · honest follow-rate

Lessons that earn their place

Every rule is scored by whether the agent actually follows it. Useful ones get starred; ignored ones are flagged dead, so you prune what doesn't help.

The Map · live entity graph

Your memory, as a graph you can explore

Every fact, decision, lesson, file and entity becomes a node, color-coded by type, sized by importance. Hover one to dim the rest, light up its neighbors, and read the full memory chunk.

The Timeline · git-graph journal

Every decision, lesson and commit, newest first

One lane per project, nodes color-coded by event type, connected by soft curves. The same journal that ships in the dashboard, written automatically as you work.

Not a mockup

This is the actual dashboard

A local web app served from your machine. The Map and Timeline above are live recreations; here is the real thing, rendering one project's memory.

5,229 entities

PMB dashboard — the Map: a force-directed galaxy of 5,229 memory entities across 149 clusters

PMB dashboard — the Timeline: a git-graph journal of decisions, lessons and commits, newest first

The Map · 65,005 connections across 149 clusters, color-coded by kind

Why it matters

What changes when your agent remembers

Not features, outcomes. This is what persistent memory actually does to your day.

Stop re-explaining your project

Every session starts already knowing your decisions, conventions and the bug you hit last Tuesday. No more pasting the same context into a fresh chat.

Switch tools without losing context

Claude Code, Cursor, Codex and Zed all read the same memory. Your context follows you, not your editor, so changing agents costs nothing.

Memory you can actually trust

PMB scores whether each lesson gets followed and flags the dead ones. It tells you when a memory isn't helping, so your context stays honest, not bloated.

Watch it compound · Session 1 of 5

Day one: a blank slate

The agent starts with nothing. You explain your project, your conventions, the bug you hit last Tuesday, and at the end of the session it all evaporates.

Quickstart

Seven commands, then just talk to your agent

No account, no keys, nothing leaves your machine. Inspect everything from the terminal, or open the dashboard.

zsh · ~/pmb

pmb recall

fix the pricing bug in checkout⏎

never lower NEGOTIATE/SKIP under 25% · lesson · PMB · 0.94

verdict-policy.ts · line 142 · file · opened Tue 14:32 · 0.88

use RRF over a learned weight · decision · 4 days ago · 0.81

3 of 41 memories · fused in 35 ms · hybrid recall

Integrate

One command wires your agent to MCP

Everything runs over stdio: the server is a child process of your agent. No network, no port, no token.

Claude Code

Rules appended to your agent's config automatically
Point several agents at one shared workspace
Verify the wiring with pmb doctor
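To make "a child process over stdio" concrete, here is a toy sketch of the transport idea: the parent spawns a child and exchanges newline-delimited JSON-RPC messages through its stdin and stdout, which is the framing MCP's stdio transport uses. The child below is a throwaway echo script, not the PMB server, and the `ping` method is only an example.

```python
import json
import subprocess
import sys

# Toy child "server": reads one JSON-RPC request per line, answers on stdout.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": request["id"], "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(method: str) -> dict:
    """Spawn the toy child, send one request over its stdin, read one reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": 1, "method": method}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()  # EOF on stdin ends the child's read loop
        proc.wait(timeout=5)
        proc.stdout.close()


if __name__ == "__main__":
    print(call_over_stdio("ping"))
```

Nothing here opens a socket. The only channel is the pair of pipes the parent created, which is why there is no port to expose and no token to manage.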

Any model · local or hosted

Bring your own model, or run it offline

PMB never calls an LLM on the read path. The optional summarize and graph-extract passes run on whatever you point them at, including a fully local Ollama. Your memory stays yours.

graph.extractor = llm:ollama · Run extraction and summaries on a local model: 100% offline, zero API keys.

Quickstart

Running in 60 seconds

Three commands, no account, no config. Then just work the way you already do.

Install

One pip install. Pure Python, runs on macOS, Linux and Windows.

$ pip install pmb-ai

Connect your agent

Wires PMB into your agent over MCP. Swap in cursor, codex, zed, and more.

$ pmb connect claude-code

Just talk to it

Work as usual; PMB records and recalls automatically. Open the dashboard any time to explore.

$ pmb dashboard

Architecture

Files on your disk, all the way down

Every event lives in SQLite; vectors live in LanceDB next to it. Copy them anywhere with cp. No server to trust.
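Because the store is an ordinary SQLite file, any SQLite client can open it. The sketch below uses only Python's standard `sqlite3` module to take a consistent snapshot with `Connection.backup` and list the tables in the copy, which is a safe alternative to cp when another process might be writing. The file names and the `events` table are made up for the demo; PMB's real paths and schema may differ.

```python
import os
import sqlite3
import tempfile


def snapshot(src_path: str, dst_path: str) -> list[str]:
    """Copy a SQLite database and return the table names found in the copy."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # consistent copy, even if another connection is writing
        rows = dst.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        src.close()
        dst.close()


# Demo against a throwaway database (hypothetical file names and schema).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "memory.db")
    con = sqlite3.connect(src_path)
    con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, text TEXT)")
    con.commit()
    con.close()
    tables = snapshot(src_path, os.path.join(tmp, "memory-copy.db"))

print(tables)  # ['events']
```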

Benchmarks

Fast, local, and honest about it

Every number here is measured on PMB's own engine and reproducible from the repo. No cloud, no LLM in the read path, no per-query cost.

94.6% · recall@10 (LoCoMo)

88 ms · p50 recall at 2k memories

< 1 ms · async write, never blocks

$0 · per recall, fully local

73.1% · answer accuracy (J-score, LoCoMo)

Retrieval quality (recall@k)

MRR 0.774 · nDCG@10 0.816

LoCoMo-10 · 997 questions · no LLM grader · cache off
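For readers unfamiliar with these metrics, here is a minimal sketch of the textbook definitions of recall@k, reciprocal rank (averaged over questions to get MRR) and binary-relevance nDCG@k. The benchmark harness in the repo may define them slightly differently, for example by scoring recall@k as a hit rate; the function names and sample data are illustrative.

```python
import math


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance nDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0


# One hypothetical question: four results, two of them relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
print(recall_at_k(ranked, relevant, 2))   # 0.5
print(reciprocal_rank(ranked, relevant))  # 0.5
```

None of these needs an LLM grader: each compares retrieved IDs against a labeled set of relevant IDs.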

Recall latency vs memory size (p50 / p95)

Warm daemon, cache off, local CPU. Real ~100-memory workspace: p50 24 ms. Cached: ~0.15 ms.

Radically honest

It tells you when a memory isn't helping

Every lesson carries a surface_id. PMB tracks whether the agent actually followed it, either confirmed or auto-detected from activity. Rules that get ignored are flagged dead. The ones that earn their place are starred. No vanity metrics.

useful · never lower NEGOTIATE/SKIP under 25%

useful · pmb warmup before first recall

unverified · prefer tree-sitter for TS indexing

dead · always rerun the full suite on edit
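One way to picture how labels like these could be derived: count how often a lesson was surfaced and how often it was followed, then bucket the ratio. This is a guess at the shape of the idea, not PMB's scoring. `LessonStats`, `classify` and every threshold below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LessonStats:
    text: str
    surfaced: int  # times the lesson was injected into a prompt
    followed: int  # times the agent's behavior matched it


def classify(
    stats: LessonStats,
    min_surfaces: int = 5,    # assumed: below this there is too little evidence
    useful_at: float = 0.6,   # assumed follow-rate needed to earn a star
    dead_below: float = 0.2,  # assumed follow-rate under which a rule is dead
) -> str:
    """Bucket a lesson as 'useful', 'dead' or 'unverified' from its follow-rate."""
    if stats.surfaced < min_surfaces:
        return "unverified"
    rate = stats.followed / stats.surfaced
    if rate >= useful_at:
        return "useful"
    if rate < dead_below:
        return "dead"
    return "unverified"  # mixed evidence: keep watching


print(classify(LessonStats("never lower NEGOTIATE/SKIP under 25%", 10, 9)))   # useful
print(classify(LessonStats("always rerun the full suite on edit", 10, 1)))    # dead
print(classify(LessonStats("prefer tree-sitter for TS indexing", 2, 2)))      # unverified
```

The minimum-evidence guard matters: a rule surfaced twice and followed twice has a perfect rate but has not yet earned a star.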

Under the hood

Built on boring, durable pieces

No exotic infrastructure. Local files and well-worn libraries, the kind you can still open in five years.

Memory hygiene

It tends itself

A year in, recall is still sharp. Memory decays, archives, and dedupes on its own, and never deletes anything behind your back.
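A rough sketch of what "decays and archives, never deletes" can look like: a score that halves on a fixed schedule, and a state change instead of a DELETE once it drops under a floor. The half-life, the threshold and both function names are assumptions for illustration; PMB's actual policy may use different signals entirely.

```python
def decayed_score(importance: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the score halves every `half_life_days` (assumed value)."""
    return importance * 0.5 ** (age_days / half_life_days)


def next_state(importance: float, age_days: float, archive_below: float = 0.1) -> str:
    """Archive faded memories instead of deleting them (threshold is hypothetical)."""
    return "archived" if decayed_score(importance, age_days) < archive_below else "active"


print(decayed_score(1.0, 30))   # 0.5
print(next_state(1.0, 30))      # active
print(next_state(1.0, 120))     # archived (score 0.0625)
```

An archived memory is still a row on disk, so it can be searched or restored later.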

Write · Active · Read · Decay · Compact · Archived

You · Daemon

Memory flows left to right and tends itself, all in one SQLite file.

FAQ

Straight answers

Does my code or data ever leave my machine?

No. Everything lives in a local SQLite file with vectors in LanceDB right next to it. There are no network calls on the read path, no account and no telemetry, ever. Unplug the internet and it still works.

How is this different from RAG or a vector database?

Two ways. Recall is hybrid: BM25 plus dense vectors plus an entity graph, fused and ranked. And it's automatic: the right memory is injected before the model thinks. You don't build a pipeline or hope the agent remembers to call a tool.

Will it slow my agent down?

No. Recall lands in about 35 ms and writes return in under a millisecond; the embedding and vector insert happen on a background thread, so the turn is never blocked.

Which agents and operating systems are supported?

Any MCP-aware agent: Claude Code, Cursor, Codex, Zed, Windsurf and more, wired in with one command. PMB is pure Python and tested on macOS, Linux and Windows.

What if a memory is wrong or unhelpful?

PMB scores whether each lesson actually gets followed and flags the dead ones so you can prune them. It's the rare tool that tells you when its own memory isn't earning its place.

Is it really free?

Yes. Apache-2.0, open source, free forever. No paid tier, no seats, no telemetry. You own the file and the code.

Apache 2.0 · 100% offline

Give your agent a memory it keeps.

$ pip install pmb-ai

Free & open source forever · Star the repo · Open an issue · Contribute