
I tried every AI memory tool. Here's why I built another one.

AI tools call it 'memory' — but it's a flat vector store with no types, no audit, no decay, no conflict detection. Memento is memory built like infrastructure: typed, audited, decay-aware, local.

10 min read | launch, opinion, mcp

Memory in every AI tool I've used has the same essential shape: a list of entries, each with a vector. Some tools dress it up — Mem0 tags entries with a user_id and a run_id, Supermemory adds tags, Cursor's memories carried a project scope. Retrieval is cosine similarity over the vectors, possibly filtered by that light metadata. That's the model.

No timestamp weighting on retrieval, so a preference you stated 18 months ago outranks a decision you made yesterday whenever the older entry is lexically closer to your prompt. No notion of kind at retrieval time — preferences, one-time facts, project decisions, and todos all retrieve through the same heuristic. Rarely an event log you can interrogate — when the assistant surfaces a memory you don't remember writing, you usually can't trace it back to the conversation that produced it. Rarely a conflict detector that catches contradictions at write time — when you change your mind, both versions usually stay in the index at full strength, and which one wins next is a coin flip on embedding geometry. (Mem0's graph backend is the partial exception on a couple of these. None of them have all four.)
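To make the failure mode concrete, here is a minimal TypeScript sketch of the flat model described above. It is not any vendor's implementation: the `FlatEntry` type, the `rank` function, and the toy three-dimensional vectors are all hypothetical. The point to notice is that `createdAt` is stored but never consulted by the ranking.

```typescript
// Hypothetical flat memory entry: text plus an embedding vector.
type FlatEntry = { text: string; vector: number[]; createdAt: string };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank purely by similarity to the query; createdAt plays no part.
function rank(entries: FlatEntry[], query: number[]): FlatEntry[] {
  return [...entries].sort(
    (x, y) => cosine(y.vector, query) - cosine(x.vector, query),
  );
}

// Toy data: the older entry happens to sit closer to the query vector.
const entries: FlatEntry[] = [
  { text: "prefers tabs", vector: [0.9, 0.1, 0.0], createdAt: "2024-11-01" },
  { text: "decided: spaces", vector: [0.6, 0.4, 0.0], createdAt: "2026-05-01" },
];

const top = rank(entries, [1, 0, 0])[0];
console.log(top.text); // the 2024 entry wins on geometry alone
```

Nothing in this model can prefer yesterday's decision over the stale preference, because age and kind are not inputs to the score.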

For an ordinary database we'd never accept this. We'd demand types, indices, audit logs, constraints. We've somehow accepted vector blobs with the word "memory" on the marketing page.

That's what made me build another one.

I'm aware of the optics — "another AI memory server" in May 2026 reads like another React state library in 2017. The genre is saturated, the survivors are scrappy, and most new entrants are derivative. So before I tell you what I built, I want to be honest about what already exists — and where each of them keeps treating memory as a flat string with a vector when it should be something more.

The status quo, tool by tool

ChatGPT Memory. A vector store of facts ChatGPT extracts from your conversations, retrieved by similarity to whatever you're typing now. Zero-config, baked into the product; Anthropic's "Memory Import", launched in March, now lets you carry it across to Claude. The architecture beneath the marketing: a flat list with no kind distinction, no per-memory event log, no decay over time. When you correct something, the previous version often stays. The memory lives in OpenAI's account, not on your machine.

Cursor Memory. When it existed, it was a flat collection of facts surfaced to the model on each turn — scoped to project or global, no decay, no audit trail per entry. Cursor removed Memories in v2.1 (late 2025) and pointed everyone at Rules — static instruction files in a project. The migration threads on the Cursor forum are still open as I write this. Whatever the future of the feature, the lesson is durable: when memory lives inside one tool's product roadmap, you're at that roadmap's mercy.

CLAUDE.md, AGENTS.md, .cursorrules, copilot-instructions.md. These aren't memory. They're config files the AI reads on session start. The 2026 convention is "write AGENTS.md, symlink the others to it" — that gives you project-level cross-tool consistency. It also gives you no retrieval, no decay, no conflict detection, no notion of when an entry was last confirmed or last contradicted. It works for "we use pnpm" and "tests live in __tests__/". It does not work for "what we decided about the rate-limiter on Tuesday."

Mem0 (self-hosted), formerly OpenMemory MCP. The closest competitor on the capability axes that matter, and the most serious local-first cross-tool entry in the space. Same MCP delivery, similar client list (Claude Desktop, Cursor, Windsurf, Cline). The gap is real but narrower than for the others: Mem0 has metadata (user, session, tags), so retrieval can filter — that's a partial type system. It has a per-memory history API that records ADD / UPDATE / DELETE events with timestamps — a real audit log. Its graph backend has LLM-based conflict resolution at write time: a resolver model decides whether a new fact obsoletes an existing relationship. What it doesn't have: half-life decay on confidence, a rule-based deterministic conflict detector, or kinds like decision / preference / todo that retrieve with different weights. Architecturally it ships as docker-compose with Qdrant for vectors and Postgres for relational state — a real cloud-style stack running on your machine, which is a different commitment than a single SQLite file in your home directory. (OpenMemory launched as a standalone repo in May 2025 and is now sunset in favor of the unified Mem0 self-hosted server.)

Supermemory. Cloud-hosted MCP memory with tag-based metadata on top of a vector store. "One memory. Every AI tool." Closer in shape to ChatGPT Memory than to a typed, audited system. If the cloud is where you want your memory, it's a clean choice.

Cline's Memory Bank. A methodology, not a memory system. A set of markdown files (projectbrief.md, activeContext.md, …) that Cline reads on session start. Influential template — many homegrown "AI memory" approaches in the wild are reskins of it. Beautiful for a project; not the right shape for personal preferences that travel between projects, and not retrievable as structured records.

Anthropic's Managed Agents memory (April 2026, public beta). Container-scoped vector memory for cloud-hosted agents. Different problem entirely — agent state across runs, not your personal memory across tools.

I'm leaving out the dozen smaller MCP memory servers that have shown up since the protocol stabilised. Most are good first attempts; none have the combination of capabilities I wanted.

What I actually wanted

After cycling through the list, the shape of what I actually wanted got concrete. Five capability requirements:

  1. Typed. Memory should know whether an entry is a fact, a preference, a decision, a todo, or a snippet. Each has different retrieval semantics. Treating them uniformly is the same mistake as a database that stores every row as TEXT.
  2. Audited. Every write should produce an event in an append-only log. If I disagree with what memory holds, I want to see who wrote it, when, and in response to what — not just delete and hope.
  3. Decay-aware. A preference I stated 18 months ago shouldn't outrank a decision I made yesterday. Confidence in old memories should fade unless re-confirmed. The decay should be applied at query time — not baked into the store — so I can tune the half-life without rewriting data.
  4. Conflict-aware. When two memories disagree on the same topic, the store should notice at write time. Silently coexisting is the worst outcome — both versions retain full retrieval weight, and which one wins next is a coin flip.
  5. Durable. The store should survive sessions, vendor changes, machine moves, and time. Not pinned to one assistant, not pinned to one project, not at the mercy of someone's next product roadmap.
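One way to picture the typed, audited, and conflict-aware requirements working together is the sketch below. It is an illustration under stated assumptions, not Memento's actual schema: `MemoryRecord`, `LogEvent`, `Store`, and every field name are hypothetical, and the conflict rule (same kind, same topic, different value) is a deliberately simple stand-in.

```typescript
// Hypothetical shapes for illustration only; not Memento's real schema.
type Kind = "fact" | "preference" | "decision" | "todo" | "snippet";

type MemoryRecord = { id: number; kind: Kind; topic: string; value: string };

type LogEvent = {
  type: "created" | "conflict-flagged";
  memoryId: number;
  actor: string; // who wrote it
  at: string; // when, as an ISO timestamp
  note?: string;
};

class Store {
  private records: MemoryRecord[] = [];
  private log: LogEvent[] = []; // append-only: this class only ever pushes

  write(kind: Kind, topic: string, value: string, actor: string, at: string): MemoryRecord {
    const record: MemoryRecord = { id: this.records.length + 1, kind, topic, value };
    // Find earlier records that disagree before adding the new one.
    const rivals = this.records.filter(
      (r) => r.kind === kind && r.topic === topic && r.value !== value,
    );
    this.records.push(record);
    this.log.push({ type: "created", memoryId: record.id, actor, at });
    // The write succeeds either way; each disagreement is recorded as an event.
    for (const rival of rivals) {
      this.log.push({
        type: "conflict-flagged",
        memoryId: record.id,
        actor,
        at,
        note: `disagrees with memory ${rival.id}`,
      });
    }
    return record;
  }

  // Trace a memory back to the events that produced or flagged it.
  history(memoryId: number): LogEvent[] {
    return this.log.filter((e) => e.memoryId === memoryId);
  }
}

const store = new Store();
store.write("preference", "indentation", "tabs", "claude-code", "2024-11-01T10:00:00Z");
const second = store.write("decision", "package-manager", "pnpm", "cursor", "2026-05-01T09:00:00Z");
const third = store.write("preference", "indentation", "spaces", "cursor", "2026-05-02T09:00:00Z");
console.log(store.history(third.id).map((e) => e.type));
```

Because the kind is part of the record, a changed preference is compared only against other preferences on the same topic, and the log keeps both the write and the flag.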

And as the delivery mechanism — not the pitch — I wanted all of that running on my machine, in a file I could cp and grep, with no Docker stack, no Postgres, no Qdrant, no cloud account.

What Memento is

Memento is what I built from that list.

It's a typed memory store with an append-only audit log, configurable per-kind confidence decay, and write-time conflict detection. Every memory has a kind (fact, preference, decision, todo, or snippet) and a scope (global, per-repo, or per-session). Preferences and decisions start with a topic: value line so contradictions can be parsed. Every state-changing operation produces a MemoryEvent in an append-only log. Effective confidence is computed at query time as stored × decayFactor(now − lastConfirmedAt, halfLife), with the half-life configurable per kind. The conflict detector runs as a post-write hook on every write; it doesn't block, it flags.
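The query-time formula can be sketched as follows. The text gives the shape `stored × decayFactor(now − lastConfirmedAt, halfLife)` but not the body of `decayFactor`; the exponential form below is an assumption based on the usual meaning of a half-life, and the per-kind half-life values are invented for illustration rather than taken from Memento's defaults.

```typescript
type Kind = "fact" | "preference" | "decision" | "todo" | "snippet";

const DAY_MS = 24 * 60 * 60 * 1000;

// Illustrative half-lives in days; these numbers are made up, not Memento defaults.
const halfLifeDays: Record<Kind, number> = {
  fact: 365,
  preference: 180,
  decision: 90,
  todo: 14,
  snippet: 365,
};

// Assumed exponential decay: the factor halves for every half-life elapsed.
// Negative elapsed time (clock skew) is clamped so the factor never exceeds 1.
function decayFactor(elapsedMs: number, halfLifeMs: number): number {
  return Math.pow(0.5, Math.max(0, elapsedMs) / halfLifeMs);
}

// Effective confidence is derived at query time; the stored value is never rewritten.
function effectiveConfidence(
  stored: number,
  kind: Kind,
  lastConfirmedAt: Date,
  now: Date,
): number {
  const elapsedMs = now.getTime() - lastConfirmedAt.getTime();
  return stored * decayFactor(elapsedMs, halfLifeDays[kind] * DAY_MS);
}

const now = new Date("2026-05-01T00:00:00Z");
const lastConfirmed = new Date(now.getTime() - 180 * DAY_MS);

// A preference last confirmed exactly one (assumed) half-life ago.
console.log(effectiveConfidence(0.8, "preference", lastConfirmed, now));
```

Keeping the half-life in a lookup table rather than in the stored rows is what makes it tunable without rewriting data: changing `halfLifeDays.preference` changes every future score and touches nothing on disk.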

Then — and this is the delivery mechanism, not the pitch — Memento ships as a single MCP server (npx @psraghuveer/memento serve) that any MCP-capable assistant can connect to: Claude Code, Claude Desktop, Cursor, GitHub Copilot, Cline, OpenCode, Aider, custom agents. They all read and write the same store. The store is a single SQLite file under your home directory (~/.local/share/memento/memento.db by default); a built-in browser dashboard at npx @psraghuveer/memento dashboard lets you inspect, audit, and curate it. Vector retrieval runs against local embeddings (bge-base-en-v1.5 by default) — no cloud calls for embedding, either.

Total install footprint: one npx command, ~110 MB on disk for the embedding model on first use, no other dependencies. Apache-2.0. Full architecture walkthrough lives in ARCHITECTURE.md; each non-obvious decision is preserved as an ADR.

Honest comparison

The lens shifts: what matters is whether memory is treated as infrastructure, not where it ships.

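| Tool | Typed kinds | Audited | Decay-aware | Conflict-aware | Local + cross-tool |
|---|---|---|---|---|---|
| ChatGPT Memory | no | no | no | no | no (OpenAI only) |
| CLAUDE.md / AGENTS.md | no | no | no | no | yes (via symlinks) |
| Mem0 (self-hosted) | tags only | history API | no | LLM (graph) | yes (via Docker) |
| Supermemory | tags only | no | no | no | cross-tool, cloud |
| Memento | 5 kinds | event log | per-kind | rule-based | yes (SQLite) |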

ChatGPT Memory and Supermemory operate as vector stores with light metadata. Mem0 is the closest, with a real history API and graph-mode LLM-based conflict resolution; what it lacks is structured kinds with different retrieval semantics, half-life decay, and deterministic rule-based conflict detection. Memento is the only entry that's typed and audited and decay-aware and conflict-aware — and the difference compounds as memory grows past a few dozen entries and starts retrieving stale things.

A long-form per-axis comparison post is on the way; this table is the thumbnail.

Memory you can put in a vial

[Image: a hooded figure in a candlelit study draws a silvery glowing thread of memory from their temple with a wand and deposits it into a small glass vial held in their other hand, a metaphor for Memento's pack-installable, transferable typed memory.]

Dumbledore's pensieve. Stone basin in the headmaster's office. He pulls silvery strands of memory out of his head with a wand, drops them into the basin, and either re-enters them later or shares them with Harry. Each memory comes out as a discrete, labeled artifact — you can hand one to someone else and they get the actual experience, not a retelling of it.

That's what packs are.

A pack is a YAML bundle of typed, scoped memories. One command installs it:

That writes eleven memories distilled from John Maeda's Laws of Simplicity into your local store — typed as decision or preference, scoped global, each one tagged pack:engineering-simplicity:0.1.0 so its origin survives every future query. Four bundled packs ship: engineering-simplicity, pragmatic-programmer, twelve-factor-app, google-sre. Preview before installing, dry-run uninstalls, idempotent re-installs.

You can also author one from your own store with npx @psraghuveer/memento pack create. Filter by scope, kind, or tag; Memento bundles the matching memories into a YAML you can email, drop on a Gist, or PR into the community registry.
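As a rough sketch of what filtering and origin-stamping might look like in TypeScript: the `pack:<name>:<version>` tag format comes from the text above, but everything else here (the `Memory` shape, `selectForPack`, `installPack`, the placeholder contents, and using content equality as the idempotency key) is a hypothetical illustration, not Memento's pack format or CLI behaviour.

```typescript
// Hypothetical memory shape for illustration only.
type Memory = { kind: string; scope: string; content: string; tags: string[] };
type Filter = { kind?: string; scope?: string; tag?: string };

// Select memories matching every field the filter specifies.
function selectForPack(store: Memory[], f: Filter): Memory[] {
  return store.filter(
    (m) =>
      (f.kind === undefined || m.kind === f.kind) &&
      (f.scope === undefined || m.scope === f.scope) &&
      (f.tag === undefined || m.tags.includes(f.tag)),
  );
}

// Install a pack: stamp each memory with its origin tag, and skip any memory
// already present with that tag so that re-installing adds nothing.
function installPack(store: Memory[], name: string, version: string, pack: Memory[]): number {
  const origin = `pack:${name}:${version}`;
  let added = 0;
  for (const m of pack) {
    const exists = store.some((s) => s.content === m.content && s.tags.includes(origin));
    if (exists) continue;
    store.push({ ...m, tags: [...m.tags, origin] });
    added++;
  }
  return added;
}

// Placeholder pack contents; not the real engineering-simplicity pack.
const examplePack: Memory[] = [
  { kind: "decision", scope: "global", content: "placeholder-topic-a: first example", tags: [] },
  { kind: "preference", scope: "global", content: "placeholder-topic-b: second example", tags: [] },
];

const localStore: Memory[] = [];
const firstRun = installPack(localStore, "engineering-simplicity", "0.1.0", examplePack);
const secondRun = installPack(localStore, "engineering-simplicity", "0.1.0", examplePack);
console.log(firstRun, secondRun);
```

Because the origin travels as an ordinary tag, the same filter that builds a pack can also find everything a pack installed, which is what makes a dry-run uninstall straightforward to reason about.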

The interesting thing this unlocks isn't the four bundled packs. It's the YAMLs your team is going to write. The stack guides — pnpm + Vitest + Tailwind; Rust + Axum + sqlx; Python + uv + Pydantic. The team conventions — your squad's testing rules, your shop's deployment hygiene, the post-mortems you keep re-citing in chat. Each one is text. Each one is reviewable before it lands. Each one stamps its origin on every memory it installs. Memory you used to keep re-explaining is memory you can hand to someone else.

You might not need this

A short list of cases where Memento is the wrong answer:

  • You just want a scratchpad. If you don't care about typing, audit trails, decay, or conflict detection — you just want an assistant to remember anything at all — Mem0 or Supermemory are simpler choices. The whole point of Memento is the structure; if the structure isn't what you want, the install cost isn't justified.
  • You live in one AI tool. If you use Claude Code and only Claude Code, CLAUDE.md plus the auto-MEMORY.md is genuinely enough. The cost of adding another moving part isn't worth it.
  • You want cloud sync across machines. Memento is local-first. If your memory needs to follow you to a different laptop or a phone, Supermemory or Mem0's cloud variant is the right call. (I have ideas about sync; they are explicitly not v1.)
  • You hate SQLite. I will not litigate this.

The case for Memento is narrow on purpose. If you're outside it, please use what already works for you. I am explicitly not in the business of converting anyone who's happy.

Try it

If the case above sounds like yours:

That command creates the database, runs migrations, walks you through four one-keystroke setup questions, and prints copy-paste MCP-server snippets for every supported client. Restart your client, paste the snippet, and the next session starts with memory.

A reasonable first move once you're set up: install a pack to see typed memory in action immediately.

Then ask your assistant, in any tool you wired up, "What does Maeda's Laws of Simplicity say about hidden complexity?" — it should answer from the eleven memories the pack just installed, no re-explanation needed. That five-second round-trip is what the whole project is about.
