
Graphiti MCP
Official
Temporal knowledge-graph memory for agents: add episodes and search facts over FalkorDB or Neo4j, from Zep.
Add to your client
Copy the config for your MCP client and paste it into its config file.
git clone https://github.com/getzep/graphiti.git && cd graphiti/mcp_server && docker compose up
Paste into ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "graphiti-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "http://localhost:8000/mcp/"
      ]
    }
  }
}
Claude Desktop connects to remote servers through the `mcp-remote` proxy (installed on first run via npx). Restart Claude Desktop after saving.
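If the config file already lists other servers, pasting the snippet wholesale would overwrite them. The following is a minimal Python sketch of merging the entry instead. The helper names `merge_server` and `update_config_file` are hypothetical and not part of Graphiti or Claude Desktop; only the entry itself is taken from the config above.

```python
import json
from pathlib import Path

# Entry copied from the config above; "graphiti-mcp" is the key it uses.
GRAPHITI_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "http://localhost:8000/mcp/"],
}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config (if present), add the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = merge_server(existing, "graphiti-mcp", GRAPHITI_ENTRY)
    path.write_text(json.dumps(updated, indent=2))
```

Calling `update_config_file` with the path to claude_desktop_config.json would leave any other entries in place; back the file up first if it matters.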
Before you start
- Docker and Docker Compose (default FalkorDB combined container)
- An LLM provider API key — OPENAI_API_KEY by default; Anthropic, Gemini, Groq, and Azure OpenAI are also supported
- Python 3.10+ and uv only if running the server standalone against an external database
- Neo4j 5.26+ if you choose it over the bundled FalkorDB
About Graphiti MCP
Graphiti's pitch is that agent memory should be a graph with a time axis, not a pile of embeddings. Each add_memory call ingests an episode (plain text, chat messages, or raw JSON); the server extracts entities and facts with an LLM, deduplicates them against the existing graph, and stamps everything with bi-temporal validity. Later, search_memory_facts can filter by valid_at/invalid_at date ranges — so 'what did the user prefer in March' is an actual query, not a hope.
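To make the date-range idea concrete, here is a small Python sketch of filtering facts by a validity interval. It is an illustration only, not Graphiti's implementation: the `Fact` class and `valid_on` function are hypothetical, and the sketch models only the valid-time axis (`valid_at`/`invalid_at`), not the second, record-keeping time axis that "bi-temporal" implies.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    text: str
    valid_at: datetime                      # when the fact became true
    invalid_at: Optional[datetime] = None   # when it stopped being true (None = still valid)


def valid_on(facts: list[Fact], when: datetime) -> list[Fact]:
    """Keep facts whose interval [valid_at, invalid_at) contains `when`."""
    return [
        f for f in facts
        if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
    ]


facts = [
    Fact("prefers dark mode", datetime(2024, 1, 1), datetime(2024, 4, 1)),
    Fact("prefers light mode", datetime(2024, 4, 1)),
]
march = valid_on(facts, datetime(2024, 3, 15))
```

Here `march` holds only the dark-mode fact, because the later preference did not become valid until April. An overwritten embedding would have lost that history.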
Operationally it is a self-hosted service. docker compose up from mcp_server/ starts a combined FalkorDB + server container with Streamable HTTP at http://localhost:8000/mcp/ (a prebuilt image exists as zepai/knowledge-graph-mcp); Neo4j 5.26+ is the alternative backend via a separate compose file. Configuration layers config.yaml, environment variables, and CLI flags, covering five LLM providers, four embedder providers, and built-in entity types like Preference, Requirement, and Procedure.
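Layered configuration can be pictured as a chain of lookups. The Python sketch below assumes the common convention that CLI flags override environment variables, which in turn override config.yaml; the text above does not state the precedence, so treat that order as an assumption. The keys and the `resolve_settings` helper are hypothetical, not Graphiti's actual option names.

```python
from collections import ChainMap


def resolve_settings(file_cfg: dict, env_cfg: dict, cli_cfg: dict) -> dict:
    """Merge three layers; earlier maps in the ChainMap win on key conflicts."""
    # Assumed precedence: CLI flags > environment variables > config.yaml.
    return dict(ChainMap(cli_cfg, env_cfg, file_cfg))


settings = resolve_settings(
    file_cfg={"database": "falkordb", "semaphore_limit": 10},  # hypothetical keys
    env_cfg={"semaphore_limit": 5},
    cli_cfg={"group_id": "project-a"},
)
```

In this sketch the environment value for `semaphore_limit` replaces the file value, while keys set in only one layer pass through unchanged.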
The honest trade-off is cost and model quality. Every episode triggers several LLM calls (extraction, dedup, summarization), concurrency is tuned via SEMAPHORE_LIMIT (default 10, sized for OpenAI Tier 3 — expect 429s if you crank it on a free tier), and the README warns that small local models routinely emit malformed JSON that surfaces as ingestion failures. Ollama works, but prefer the most capable model you can run.
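A concurrency cap like SEMAPHORE_LIMIT is typically enforced with a semaphore around each outbound request. The Python sketch below shows the general technique with `asyncio.Semaphore`; it is not Graphiti's code, and `run_bounded` and `fake_llm_call` are hypothetical stand-ins.

```python
import asyncio


async def run_bounded(jobs, limit: int):
    """Run coroutine factories with at most `limit` in flight; return (results, peak)."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:          # blocks once `limit` jobs are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_llm_call():
    # Stand-in for a real provider request.
    await asyncio.sleep(0.01)
    return "ok"


results, peak = asyncio.run(run_bounded([fake_llm_call] * 8, limit=3))
```

With eight jobs and a limit of three, no more than three calls are ever in flight at once. Raising the limit increases throughput only as long as the provider's rate limit tolerates it.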
Client-side, HTTP-native clients like Cursor and VS Code point straight at the endpoint, while Claude Desktop needs the npx mcp-remote http://localhost:8000/mcp/ shim since it only speaks STDIO. A --group-id flag namespaces graphs, so multiple projects or users can share one server without mixing memories.
Tools & capabilities (13)
- add_memory: Add an episode (text, JSON, or message format) to the graph; queued for async LLM extraction.
- add_triplet: Insert a single source-fact-target triplet directly, bypassing extraction.
- search_nodes: Search entity nodes with entity_types and center_node_uuid filters.
- search_memory_facts: Search facts (edges) with edge_types, center node, and valid_at/invalid_at date-range filters.
- summarize_saga: Generate or refresh the running summary of a saga's episodes.
- build_communities: Detect entity communities and produce higher-level community summaries.
- get_episode_entities: Trace provenance: which entities and facts specific episodes created.
- delete_entity_edge: Delete an entity edge from the graph.
- delete_episode: Delete an episode and cascade-delete the entities/facts it solely created.
- get_entity_edge: Get an entity edge by UUID.
- get_episodes: Get the most recent episodes for a group.
- clear_graph: Clear all data for the given group(s).
- get_status: Check server and database connection status.
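MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The Python sketch below builds such a request for `get_status`, which is called here with an empty arguments object as an assumption. The `tool_call` helper is hypothetical; in practice an MCP client library constructs and sends these messages after the session handshake.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call("get_status", {})
print(json.dumps(request, indent=2))
```

The other tools follow the same envelope; only `params.name` and `params.arguments` change.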
When to use it
- Persistent agent memory in Cursor: facts learned in one session are queryable in the next via search_memory_facts
- Track user preferences and requirements over time with real validity intervals instead of overwritten embeddings
- Ingest structured CRM or product JSON with add_memory(source='json') and auto-extract entities and relationships
- Audit what the agent 'knows' by tracing which episode created which facts with get_episode_entities
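For the structured-JSON use case, the record has to be packaged as arguments to add_memory. The Python sketch below shows one plausible shape. Only `source='json'` comes from the text above; the `name`, `episode_body`, and `group_id` parameter names, and passing the record as a JSON string, are assumptions to verify against the server's tool schema. The `json_episode_args` helper and the sample record are hypothetical.

```python
import json


def json_episode_args(name: str, record: dict, group_id: str) -> dict:
    """Build add_memory arguments for a structured record."""
    return {
        "name": name,                         # assumed parameter name
        "episode_body": json.dumps(record),   # assumed: JSON passed as a string
        "source": "json",                     # from the use case above
        "group_id": group_id,                 # assumed parameter name
    }


args = json_episode_args(
    "Acme account",
    {"company": "Acme", "plan": "enterprise", "seats": 40},  # made-up sample data
    "crm",
)
```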
Quick setup
1. git clone https://github.com/getzep/graphiti.git && cd graphiti/mcp_server
2. cp .env.example .env and set OPENAI_API_KEY (plus SEMAPHORE_LIMIT for your rate-limit tier)
3. docker compose up, which starts FalkorDB + the MCP server on http://localhost:8000/mcp/ (FalkorDB web UI on :3000)
4. Point HTTP-capable clients (Cursor, VS Code) at http://localhost:8000/mcp/; for Claude Desktop use `npx mcp-remote http://localhost:8000/mcp/`
5. Optional: `docker compose -f docker/docker-compose-neo4j.yml up` to run on Neo4j instead of FalkorDB
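After the containers start, it can help to wait until the server port accepts connections before pointing a client at it. The Python sketch below polls a TCP port. `wait_for_port` is a hypothetical helper, and it only confirms that something is listening on the port, not that the MCP endpoint is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            time.sleep(interval)
    return False
```

For the default setup this would be called as `wait_for_port("localhost", 8000)`; the get_status tool is the place to check the database connection itself.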
Security notes
Self-hosted: graph data stays in your FalkorDB/Neo4j instance, but every episode's content is sent to the configured LLM provider (OpenAI by default) for entity extraction. API keys live in a server-side .env file, and anonymous telemetry is on unless you set GRAPHITI_TELEMETRY_ENABLED=false.
Graphiti MCP FAQ
Is the Graphiti MCP server free?
Yes — Apache-2.0 and fully self-hosted, with FalkorDB bundled in the default container. Your real cost is LLM usage: each episode triggers multiple extraction/dedup/summarization calls to your configured provider, which is also why the SEMAPHORE_LIMIT concurrency setting matters.
How is this different from a vector-store memory server?
Graphiti builds an entity/relationship graph with bi-temporal validity rather than storing chunks. You get structured facts with valid_at/invalid_at history, date-range queries, community summaries, and provenance tracing — at the price of running a graph database and paying for extraction LLM calls.
Why am I seeing 429 rate-limit errors during ingestion?
Your SEMAPHORE_LIMIT is too high for your LLM tier. The default of 10 assumes roughly OpenAI Tier 3 (500 RPM); the README suggests 1-2 for free tiers and up to 20-50 for Tier 4. Each episode fans out into several concurrent LLM requests, so lower it until 429s stop.
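The "lower it until 429s stop" advice can be sketched as a simple step-down loop in Python. This is an illustration of the tuning procedure, not a Graphiti feature: `tune_semaphore_limit` is hypothetical, and `saw_429` stands in for whatever trial ingestion run you use to observe rate-limit errors at a given setting.

```python
from typing import Callable


def tune_semaphore_limit(start: int, saw_429: Callable[[int], bool]) -> int:
    """Halve the limit until a trial run reports no 429s (never below 1)."""
    limit = max(1, start)
    while limit > 1 and saw_429(limit):
        limit = max(1, limit // 2)
    return limit


# Pretend the provider tolerates at most 3 concurrent requests.
chosen = tune_semaphore_limit(10, lambda n: n > 3)
```

Starting from the default of 10, the loop tries 10, then 5, then settles on 2, the first value at which the simulated provider stops returning 429s.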
Alternatives to Graphiti MCP
- Official MCP server providing persistent, file-backed knowledge-graph memory across sessions.
- Structured step-by-step reasoning tool for breaking problems into revisable thought sequences.
- Mem0's local-first memory layer: a Dockerized MCP server plus dashboard that keeps agent memories on your machine.