Memory MCP: how to give an AI agent a place to remember
TL;DR
- A memory MCP server is an external service that stores an agent's information and exposes write and read memory operations over the Model Context Protocol.
- The term does not refer to one product. It spans the official knowledge-graph reference server, local or file-backed community servers, and hosted or managed memory APIs, as seen across the official MCP servers repository and registries like mcpservers.org, mcp.so, and PulseMCP.
- Choose a server by five questions: where data lives, what memory operations it actually performs, how it scopes users and projects, what backing store it uses, and what licensing or operational model it requires.
- The best test is your own workload: write in one session, read in another, check dedup, check relevance, and verify tenant isolation. See /research/evaluation/evaluating-agent-memory.
An AI agent forgets everything when its session ends. A memory MCP server is a standard way to fix that.
MCP — the Model Context Protocol — is an open protocol that lets an AI application connect to external tools and data through servers. Source: modelcontextprotocol.io.
A memory MCP server is an MCP server whose job is to store information outside the current session and expose memory operations back to the model as tools. The practical result is simple: write a fact now, retrieve it later, and stop starting every session cold.
That definition matters because "memory MCP" gets used loosely. Sometimes people mean the official reference memory server. Sometimes they mean a local file-backed server. Sometimes they mean a hosted memory API exposed through MCP. The category is broader than any one implementation.
Why AI agents need persistent memory
An agent can only act on what is in its current context and what its tools can fetch. The current context is bounded. It also disappears at the end of the session. That creates a basic operational problem: anything worth keeping must be written somewhere durable outside the model.
A memory server is that external place.
The cleanest way to think about it:
- working memory lives in the current session
- long-term memory lives outside the session
- MCP is one way to let the agent reach that long-term layer
The distinction is not mystical. It is operational. For background, see /research/memory-science/kinds-of-memory and /research/memory/cognitive-models/episodic-semantic-memory.
It is worth being plain about the hype, too. Search demand for "memory MCP" is steady rather than spiking. That does not make the category unimportant; it means the term still lacks a settled definition. The cold-start problem is real whether or not the term is fashionable.
What counts as a memory MCP server
The MCP specification defines how a client connects to servers. It does not force one memory design. That is why the term has no single owner.
Today, the category clearly includes at least three buckets.
1) Official knowledge-graph reference server
The official MCP servers repository includes a reference "memory" server built around a local knowledge-graph model. It is the cleanest example of memory over MCP in the reference ecosystem, and it shows two things: memory is a first-class use case in MCP, and "memory" does not have to mean one storage pattern.
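To make the knowledge-graph pattern concrete, here is a minimal in-memory sketch of entities, typed relations, and free-text observations. The class and method names (`Entity`, `Relation`, `KnowledgeGraph`, `search`) are illustrative assumptions, not the reference server's schema or tool names.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    entity_type: str
    observations: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    relation_type: str


class KnowledgeGraph:
    """Toy graph: named entities, typed relations, free-text observations."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.relations: set[Relation] = set()

    def add_entity(self, name: str, entity_type: str) -> Entity:
        # Reuse an existing entity so repeated writes do not create duplicates.
        return self.entities.setdefault(name, Entity(name, entity_type))

    def add_observation(self, name: str, text: str) -> None:
        entity = self.entities[name]
        if text not in entity.observations:
            entity.observations.append(text)

    def relate(self, source: str, target: str, relation_type: str) -> None:
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must exist before linking")
        self.relations.add(Relation(source, target, relation_type))

    def search(self, query: str) -> list[Entity]:
        # Case-insensitive substring match over names, types, and observations.
        q = query.lower()
        return [
            e for e in self.entities.values()
            if q in e.name.lower()
            or q in e.entity_type.lower()
            or any(q in o.lower() for o in e.observations)
        ]


graph = KnowledgeGraph()
graph.add_entity("deploy-pipeline", "system")
graph.add_entity("Railway", "platform")
graph.add_observation("deploy-pipeline", "Never pushes to main on Fridays")
graph.relate("deploy-pipeline", "Railway", "deploys_to")
print([e.name for e in graph.search("fridays")])  # prints ['deploy-pipeline']
```

The explicit entity-and-relation structure is what separates this pattern from a flat list of notes: a fact attaches to a named thing, and links between things can be followed.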
2) Local or file-backed servers
Community servers often persist to local files or an embedded store. These fit teams that want simple local control, easy inspection, or no dependency on a hosted API.
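As a rough sketch of what file-backed persistence can look like, the example below appends one JSON record per line and searches by substring. `FileMemory` and its methods are hypothetical names for illustration; real community servers differ in format and features.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Append-only JSON Lines store: one memory record per line."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def write(self, text: str, tags: tuple[str, ...] = ()) -> None:
        record = {"text": text, "tags": list(tags)}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def search(self, query: str) -> list[dict]:
        if not self.path.exists():
            return []
        q = query.lower()
        with self.path.open(encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
        return [r for r in records if q in r["text"].lower()]


store_path = Path(tempfile.mkdtemp()) / "memory.jsonl"
FileMemory(store_path).write("Deploy target is Railway", tags=("infra",))

# A second instance stands in for a later session reading the same file.
print(FileMemory(store_path).search("railway"))
```

A plain file like this is easy to inspect and back up, which is much of the appeal of the local bucket. It does nothing about concurrency, dedup, or scoping.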
You can see the breadth of the ecosystem in public registries such as mcpservers.org, mcp.so, and PulseMCP. Use those as discovery surfaces, not as rankings.
3) Hosted or managed memory APIs
Some memory servers are adapters over a managed backend. The MCP server exposes tools to the client, while storage, retrieval, and tenant handling live in the hosted service. This bucket usually matters when you need multi-user scoping, centralized operations, or a shared memory layer across agents and sessions.
How to choose a memory MCP server
Do not start with vendor names. Start with five questions.
1) Where does your data live?
This is the first cut.
- Local or file-backed means data lives on your machine or in infrastructure you operate.
- Hosted or managed means the server or its backing service stores data remotely.
Neither is universally better. Local control may fit prototypes, personal workflows, or strict deployment constraints. Managed services may fit production systems that need shared access or operational simplicity.
2) What memory operations does it actually perform?
A lot of servers can store and search. That is the minimum. The more useful question is what else they do. A neutral vocabulary helps here. Useful operation names include:
- write-with-importance
- consolidate
- dedup
- link
- decay/forget
- read-with-scoring
- feedback
- multi-tenant
- cross-session
These are neutral capability labels, not a standard defined by the MCP spec. You do not need every operation. You do need to know which ones your workload requires.
- If repeated facts pile up, dedup matters.
- If retrieval quality matters, read-with-scoring matters.
- If information should fade or be retired, decay/forget matters.
- If one account must not leak into another, multi-tenant scoping matters.
If a server only does basic write and read, say that plainly — basic is fine for some workloads. And never assume a name on a feature list equals behavior. Inspect the tools the server actually exposes to its MCP client.
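To show how three of these labels can interact, here is a small sketch that combines write-with-importance, decay/forget, and read-with-scoring. The exponential half-life formula, the `min_score` floor, and all names are assumptions chosen for illustration; they are not defined by the MCP spec or by any particular server.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float  # 0.0 to 1.0, assigned at write time
    age_days: float    # time elapsed since the memory was written


def decayed_score(memory: Memory, half_life_days: float = 30.0) -> float:
    # Exponential decay: the score halves every `half_life_days`.
    return memory.importance * 0.5 ** (memory.age_days / half_life_days)


def read_with_scoring(memories, query, limit=3, min_score=0.05):
    q = query.lower()
    matches = [m for m in memories if q in m.text.lower()]
    # Memories that decayed below the floor are treated as forgotten.
    kept = [m for m in matches if decayed_score(m) >= min_score]
    return sorted(kept, key=decayed_score, reverse=True)[:limit]


memories = [
    Memory("Deploy target is Railway", importance=0.9, age_days=60),
    Memory("Deploy target was Heroku", importance=0.9, age_days=400),
    Memory("Deploy freeze on Fridays", importance=0.5, age_days=0),
]
for m in read_with_scoring(memories, "deploy"):
    print(round(decayed_score(m), 3), m.text)
```

With these inputs the stale Heroku fact falls below the floor and is not returned, while the fresh but less important freeze rule outranks the older Railway fact. Whether that ordering is right depends on the workload, which is why it helps to know how a server scores before relying on it.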
3) How does it scope users and projects?
This is where many memory demos break in practice.
- Can it separate one user from another?
- Can it separate one project from another?
- Can the same agent reach the right memory without crossing boundaries?
If the answer is vague, keep looking. Multi-tenant scoping is not an edge case. It is the difference between a toy and an operational system.
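One way to picture strict scoping is a store where every read and write must carry an explicit scope key. The sketch below assumes a `(user_id, project_id)` pair as that key; `ScopedMemory` is a hypothetical name, and real servers may scope by API key, namespace, or something else.

```python
from collections import defaultdict


class ScopedMemory:
    """Every read and write carries an explicit (user_id, project_id) scope."""

    def __init__(self) -> None:
        self._by_scope: dict[tuple[str, str], list[str]] = defaultdict(list)

    def write(self, user_id: str, project_id: str, text: str) -> None:
        self._by_scope[(user_id, project_id)].append(text)

    def read(self, user_id: str, project_id: str, query: str) -> list[str]:
        # Only the caller's own scope is searched; there is no global fallback.
        q = query.lower()
        scoped = self._by_scope.get((user_id, project_id), [])
        return [t for t in scoped if q in t.lower()]


memory = ScopedMemory()
memory.write("alice", "billing", "Stripe webhook secret rotates monthly")
memory.write("bob", "billing", "Invoices go out on the 1st")

print(memory.read("alice", "billing", "stripe"))  # alice sees her own fact
print(memory.read("bob", "billing", "stripe"))    # prints []
```

The design choice worth noting is that the scope is a required argument rather than an optional filter, so a forgotten parameter fails loudly instead of silently searching everything.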
4) What is the backing store and retrieval model?
Different servers use different internal representations. Common patterns include knowledge graph, vector retrieval, files, and managed-service abstractions. Knowledge graphs suit explicit entity-and-relation structure; vector retrieval suits fuzzy semantic recall; files suit simple local persistence. You do not need a universal winner — you need a fit for your retrieval pattern and operational constraints.
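For a feel of the vector-retrieval pattern, the sketch below ranks memories by cosine similarity. It deliberately swaps the embedding model for plain word-count vectors, so it only captures word overlap, not the fuzzy semantic recall a real embedding would give. The function names are illustrative assumptions.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for an embedding model: a sparse word-count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, memories: list[str], k: int = 2) -> list[tuple[float, str]]:
    q = embed(query)
    scored = [(cosine(q, embed(m)), m) for m in memories]
    return sorted(scored, reverse=True)[:k]


memories = [
    "The deploy target is Railway",
    "The team never pushes to main on Fridays",
    "Staging database runs on Postgres 16",
]
for score, text in top_k("where do we deploy", memories):
    print(f"{score:.2f}  {text}")
```

The ranking step is the same whatever produces the vectors: embed the query, compare against stored vectors, return the closest few. A knowledge graph answers a different kind of question, such as what a given entity is linked to.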
If you are running more than one MCP server, it helps to think about how memory fits into the larger server topology. For that, see /research/mcp/federated/architecture.
5) What is the licensing and operational model?
This question is practical, not legalistic.
- Is there public code? If yes, what is its license?
- Do you self-host it, or is it a managed service?
- What part is public, and what part is private?
That last point matters because "open" often gets used imprecisely. Treat each component separately.
Memory MCP quick start
If you already know you want memory over MCP, the fastest path is this:
- Pick a server using the five questions above.
- Add it to your MCP client config.
- Confirm the client exposes the memory tools.
- Write a fact in one session.
- Start a new session and read it back.
The exact config depends on the client and the server. A minimal MCP client entry has this shape:
{
  "mcpServers": {
    "memory": {
      "command": "your-memory-server-command",
      "args": ["your-server-args"]
    }
  }
}
The point of the snippet is the workflow, not the exact command: your client starts the server, the client sees the server's tools, and your agent can call those tools to write and retrieve memory. Once the tools appear, run the verification below before you trust anything else.
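Before restarting the client, it can help to check that the config entry is well-formed. The sketch below validates only the shape shown in the snippet above (an `mcpServers` object whose entries have a `command` string and an `args` list); `check_mcp_config` is a hypothetical helper, and individual clients may accept or require other keys.

```python
import json


def check_mcp_config(raw: str) -> list[str]:
    """Return problems found in an MCP client config; an empty list means none."""
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or not servers:
        return ['missing or empty "mcpServers" object']

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        command = entry.get("command")
        if not isinstance(command, str) or not command:
            problems.append(f'{name}: "command" must be a non-empty string')
        args = entry.get("args", [])
        if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
            problems.append(f'{name}: "args" must be a list of strings')
    return problems


example = """
{
  "mcpServers": {
    "memory": {
      "command": "your-memory-server-command",
      "args": ["your-server-args"]
    }
  }
}
"""
print(check_mcp_config(example))  # prints []
```

A clean result only means the file parses and has the expected shape. It does not prove the command exists or that the server exposes memory tools, so the cross-session test is still required.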
If you want a worked MCP server reference, see /api/mcp-server. For Claude-specific setup, use /api/claude.
How to know it works — without relying on benchmarks
Public benchmarks exist, but agent-memory evaluation is still under active methodology debate — prompt sensitivity, dataset contamination, and judge variance all move the scores. That is a good reason to test your own workload before you standardize on a server. A short, concrete test plan is enough.
Write, then read across sessions
Write a fact that is specific and easy to query later — for example, the deploy target is Railway and the team never pushes to main on Fridays. Start a fresh session. Ask for it back. Confirm the answer came from the memory tool, not from the active prompt. If this fails, nothing else matters.
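The write-then-read check can be scripted so it runs the same way every time. In the sketch below, `MemoryClient` is an assumed interface and `FakeFileClient` is a hypothetical stand-in for whatever client actually talks to your server; swap in a real one by passing a different `open_session` factory.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Protocol


class MemoryClient(Protocol):
    def write(self, text: str) -> None: ...
    def read(self, query: str) -> list[str]: ...


def survives_new_session(
    open_session: Callable[[], MemoryClient], fact: str, query: str
) -> bool:
    # Session 1: write the fact, then drop the client entirely.
    first = open_session()
    first.write(fact)
    del first

    # Session 2: a fresh client with no shared in-process state.
    second = open_session()
    return fact in second.read(query)


class FakeFileClient:
    """Hypothetical stand-in for a real memory client, backed by a JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> list[str]:
        return json.loads(self.path.read_text()) if self.path.exists() else []

    def write(self, text: str) -> None:
        self.path.write_text(json.dumps(self._load() + [text]))

    def read(self, query: str) -> list[str]:
        return [f for f in self._load() if query.lower() in f.lower()]


path = Path(tempfile.mkdtemp()) / "facts.json"
fact = "The deploy target is Railway and the team never pushes to main on Fridays"
print(survives_new_session(lambda: FakeFileClient(path), fact, "railway"))  # prints True
```

Because the second client is constructed from scratch, a pass here cannot come from leftover in-process state. Against a real agent you would also confirm in the tool-call log that the memory tool was actually invoked.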
Write near-duplicates and check dedup
Add similar facts with small wording changes. Then inspect what comes back. Does the server keep noisy copies, merge them, or return the most useful form?
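A quick way to inspect what comes back is to flag pairs of results that are nearly identical. This sketch uses `difflib.SequenceMatcher` from the Python standard library; the 0.85 threshold and the `near_duplicate_pairs` helper are assumptions to tune, not a standard.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial differences do not count.
    return " ".join(text.lower().split())


def near_duplicate_pairs(
    results: list[str], threshold: float = 0.85
) -> list[tuple[str, str]]:
    pairs = []
    for i, left in enumerate(results):
        for right in results[i + 1:]:
            ratio = SequenceMatcher(None, normalize(left), normalize(right)).ratio()
            if ratio >= threshold:
                pairs.append((left, right))
    return pairs


# Pretend these strings came back from one retrieval call.
returned = [
    "Deploy target is Railway.",
    "The deploy target is Railway",
    "Team never pushes to main on Fridays",
]
for left, right in near_duplicate_pairs(returned):
    print(f"near-duplicate: {left!r} ~ {right!r}")
```

If a retrieval returns several flagged pairs, the server is storing noisy copies rather than merging them. Character-level similarity misses paraphrases, so treat an empty result as weak evidence rather than proof of dedup.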
Check relevance scoring on a real query
Do not ask a toy query that exactly matches the written text. Ask the kind of question your agent will ask in production. Then inspect whether retrieval brings back the right memory.
Check tenant isolation
Create two separate identities, users, or projects. Write facts under each. Then confirm that one cannot read the other — even when both connect through the same MCP endpoint. That test is mandatory for any shared system.
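The isolation check is easy to automate with a canary value: write a unique string as one tenant and try to read it as another. In this sketch the `write` and `read` callables are assumed wrappers around your real client, and the two fake backends exist only to show the check catching a leak.

```python
import uuid
from typing import Callable


def leaks_across_tenants(
    write: Callable[[str, str], None],
    read: Callable[[str, str], list[str]],
    tenant_a: str,
    tenant_b: str,
) -> bool:
    """Write a unique canary as tenant A, then try to read it as tenant B."""
    canary = f"canary-{uuid.uuid4().hex}"
    write(tenant_a, canary)
    if canary not in read(tenant_a, canary):
        raise RuntimeError("canary not readable by its own tenant; test is inconclusive")
    return canary in read(tenant_b, canary)


# Hypothetical fakes sharing one list of (owner, text) rows.
rows: list[tuple[str, str]] = []


def fake_write(tenant: str, text: str) -> None:
    rows.append((tenant, text))


def leaky_read(tenant: str, query: str) -> list[str]:
    # Deliberately broken: ignores the tenant entirely.
    return [text for _, text in rows if query in text]


def isolated_read(tenant: str, query: str) -> list[str]:
    return [text for owner, text in rows if owner == tenant and query in text]


print(leaks_across_tenants(fake_write, leaky_read, "acme", "globex"))     # prints True
print(leaks_across_tenants(fake_write, isolated_read, "acme", "globex"))  # prints False
```

The own-tenant read guards against a false pass: if the canary cannot be found at all, an empty result for the other tenant proves nothing.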
For a broader evaluation frame, see /research/evaluation/evaluating-agent-memory.
Is Mnemoverse a memory MCP server?
Yes. Mnemoverse fits the hosted or managed memory API bucket — one option in the managed group, not the definition of the group.
The public MCP server client is @mnemoverse/mcp-memory-server, with source on GitHub under MIT. The core Mnemoverse engine is private and proprietary — not open-source or source-available.
As a concrete worked example, that client launches over npx; substitute any other server you picked with the rubric:
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@mnemoverse/mcp-memory-server@latest"],
      "env": { "MNEMOVERSE_API_KEY": "mk_live_YOUR_KEY" }
    }
  }
}
If you want the product-specific reference, see /api/mcp-server. For the broader memory category map around it, see /research/ai-memory-landscape-2026 and /research/market-landscape.
Common questions
What is a memory MCP server?
A memory MCP server is an external service that stores an agent's information and exposes memory operations to the model over the Model Context Protocol. In practice, the agent can write a fact in one session and read it back later instead of starting cold. See the MCP specification and the official reference servers.
How do I add memory to an AI agent over MCP?
Pick a memory server, add it to your MCP client configuration, confirm the client exposes the memory tools, write a fact in one session, then start a new session and read it back. MCP is the protocol that lets a client connect to external tools and data through servers: modelcontextprotocol.io.
What is the best memory MCP server?
There is no single best memory MCP server for every workload. Choose by five questions: where the data lives, what memory operations it performs, how it scopes users and projects, what backing store and retrieval model it uses, and what licensing and operational model it offers. The category includes the official knowledge-graph reference server, local or file-backed servers, and hosted or managed memory APIs.
What is the difference between working memory and long-term memory for an AI agent?
Working memory is the limited context available inside the current session. Long-term memory is durable state stored outside that session and retrieved later when needed. A memory MCP server is one way to provide that long-term layer. For background, see /research/memory-science/kinds-of-memory and /research/memory/cognitive-models/episodic-semantic-memory.
Where does a memory MCP server store data?
It depends on the server. Some store data locally in files or an embedded store. The official reference memory server in the MCP servers repository uses a local knowledge-graph approach. Others sit in front of hosted or managed APIs that handle storage and retrieval remotely. You can see the category breadth in registries such as mcpservers.org, mcp.so, and PulseMCP.
Is memory MCP just hype?
The term is broader than any one vendor, and search interest is steady rather than spiking. The underlying problem is still real: agents do not keep durable state across sessions, and context windows are bounded. A memory MCP server gives the agent an external place to write and retrieve information, which is why the category matters independent of the hype.
Related
- /research/ai-memory-landscape-2026
- /research/mcp/federated/architecture
- /api/mcp-server
- /api/claude
- /research/evaluation/evaluating-agent-memory
- /research/market-landscape
- /research/memory-science/kinds-of-memory
- /research/memory/cognitive-models/episodic-semantic-memory
This article is part of the Mnemoverse Library, a research and documentation surface for persistent memory in AI systems.
