Graphiti MCP

Official

Temporal knowledge-graph memory for agents: add episodes and search facts over FalkorDB or Neo4j, from Zep.

Unverified

HTTP (remote)

API key

Python

28k

Add to your client

Copy the config for your MCP client and paste it into its config file.

Install / run

git clone https://github.com/getzep/graphiti.git && cd graphiti/mcp_server && docker compose up

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "graphiti-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "http://localhost:8000/mcp/"
      ]
    }
  }
}

Claude Desktop connects to remote servers through the `mcp-remote` proxy (installed on first run via npx). Restart Claude Desktop after saving.
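
If you would rather script the edit than paste by hand, a minimal Python sketch along these lines could merge the entry into an existing config without overwriting other servers. The helper names `add_graphiti_entry` and `update_config_file` are hypothetical; the entry itself is copied from the config above.

```python
import json
from pathlib import Path

# Entry copied from the config shown above.
GRAPHITI_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "http://localhost:8000/mcp/"],
}


def add_graphiti_entry(config: dict) -> dict:
    """Return a copy of `config` with the graphiti-mcp server added.

    Hypothetical helper: any servers already configured are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["graphiti-mcp"] = GRAPHITI_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at `path` (if it exists), merge the entry, and rewrite it."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_graphiti_entry(current), indent=2) + "\n")
```

Calling `update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")` would apply it to the macOS path named above; back up the file first, then restart Claude Desktop.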


Before you start

Docker and Docker Compose (default FalkorDB combined container)
An LLM provider API key — OPENAI_API_KEY by default; Anthropic, Gemini, Groq, and Azure OpenAI are also supported
Python 3.10+ and uv only if running the server standalone against an external database
Neo4j 5.26+ if you choose it over the bundled FalkorDB

About Graphiti MCP

Graphiti's pitch is that agent memory should be a graph with a time axis, not a pile of embeddings. Each add_memory call ingests an episode (plain text, chat messages, or raw JSON); the server extracts entities and facts with an LLM, deduplicates them against the existing graph, and stamps everything with bi-temporal validity. Later, search_memory_facts can filter by valid_at/invalid_at date ranges — so 'what did the user prefer in March' is an actual query, not a hope.
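
To make the date-range idea concrete, here is a small self-contained Python sketch of point-in-time filtering over facts that carry valid_at/invalid_at stamps. It illustrates the concept only: the `Fact` class and `facts_valid_on` helper are hypothetical and are not Graphiti's data model or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    text: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # None means it is still true


def facts_valid_on(facts: list[Fact], when: datetime) -> list[Fact]:
    """Return facts whose interval [valid_at, invalid_at) contains `when`."""
    return [
        f for f in facts
        if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
    ]


# Illustrative data: a preference that changed in April.
history = [
    Fact("user prefers dark mode", datetime(2025, 1, 10), datetime(2025, 4, 2)),
    Fact("user prefers light mode", datetime(2025, 4, 2)),
]

march = facts_valid_on(history, datetime(2025, 3, 15))
print([f.text for f in march])  # ['user prefers dark mode']
```

The superseded fact is kept with a closed interval rather than overwritten, which is what lets a past-dated query still return it.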

Operationally it is a self-hosted service. docker compose up from mcp_server/ starts a combined FalkorDB + server container with Streamable HTTP at http://localhost:8000/mcp/ (a prebuilt image exists as zepai/knowledge-graph-mcp); Neo4j 5.26+ is the alternative backend via a separate compose file. Configuration layers config.yaml, environment variables, and CLI flags, covering five LLM providers, four embedder providers, and built-in entity types like Preference, Requirement, and Procedure.
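
Layered configuration of this kind is often resolved with a simple precedence chain. The sketch below assumes CLI flags override environment variables, which override config.yaml; that ordering and every key name are assumptions for illustration, so check the server's README for the actual rules.

```python
from collections import ChainMap


def resolve_settings(cli: dict, env: dict, file: dict) -> ChainMap:
    """Layer three sources; earlier maps win on key collisions.

    Assumed precedence (not confirmed by the source): CLI > env > file.
    """
    # Drop unset (None) CLI values so they do not mask lower layers.
    cli_set = {k: v for k, v in cli.items() if v is not None}
    return ChainMap(cli_set, env, file)


# Hypothetical keys and values, for illustration only.
settings = resolve_settings(
    cli={"group_id": "project-a", "model": None},
    env={"model": "model-from-env"},
    file={"model": "model-from-yaml", "database": "falkordb", "group_id": "main"},
)
print(settings["group_id"], settings["model"], settings["database"])
# project-a model-from-env falkordb
```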

The honest trade-off is cost and model quality. Every episode triggers several LLM calls (extraction, dedup, summarization), concurrency is tuned via SEMAPHORE_LIMIT (default 10, sized for OpenAI Tier 3 — expect 429s if you crank it on a free tier), and the README warns that small local models routinely emit malformed JSON that surfaces as ingestion failures. Ollama works, but prefer the most capable model you can run.
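
The effect of a concurrency cap like SEMAPHORE_LIMIT can be sketched with `asyncio.Semaphore`. This is a generic illustration of bounded concurrency, not Graphiti's code: `run_bounded` and `fake_llm_call` are hypothetical names, and the sleep stands in for a real provider request.

```python
import asyncio


async def run_bounded(n_tasks: int, limit: int) -> int:
    """Run n_tasks simulated LLM calls, at most `limit` at once.

    Returns the peak number of calls that were in flight together.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_llm_call() -> None:
        nonlocal in_flight, peak
        async with semaphore:          # waits here once `limit` calls are active
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a network round trip
            in_flight -= 1

    await asyncio.gather(*(fake_llm_call() for _ in range(n_tasks)))
    return peak


peak = asyncio.run(run_bounded(n_tasks=25, limit=10))
print(peak)  # never exceeds the limit of 10
```

Lowering the limit trades ingestion throughput for fewer simultaneous requests against the provider's rate limit.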

Client-side, HTTP-native clients like Cursor and VS Code point straight at the endpoint, while Claude Desktop needs the npx mcp-remote http://localhost:8000/mcp/ shim since it only speaks STDIO. A --group-id flag namespaces graphs, so multiple projects or users can share one server without mixing memories.

Tools & capabilities (13)

add_memory

Add an episode (text, JSON, or message format) to the graph; queued for async LLM extraction.

add_triplet

Insert a single source-fact-target triplet directly, bypassing extraction.

search_nodes

Search entity nodes with entity_types and center_node_uuid filters.

search_memory_facts

Search facts (edges) with edge_types, center node, and valid_at/invalid_at date-range filters.

summarize_saga

Generate or refresh the running summary of a saga's episodes.

build_communities

Detect entity communities and produce higher-level community summaries.

get_episode_entities

Trace provenance: which entities and facts specific episodes created.

delete_entity_edge

Delete an entity edge from the graph.

delete_episode

Delete an episode and cascade-delete the entities/facts it solely created.

get_entity_edge

Get an entity edge by UUID.

get_episodes

Get the most recent episodes for a group.

clear_graph

Clear all data for the given group(s).

get_status

Check server and database connection status.
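
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request body for add_memory. The `tool_call` helper is hypothetical, and the argument names other than `source` (here `name` and `episode_body`) are assumptions; list the server's tool schemas with `tools/list` to confirm them.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator


def tool_call(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tool_call(
    "add_memory",
    {
        # Assumed argument names; only `source` appears in this page's text.
        "name": "crm-sync",
        "episode_body": json.dumps({"customer": "Acme", "plan": "enterprise"}),
        "source": "json",
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP client performs the initialize handshake and sends this for you; the payload is shown only to make the tool interface tangible.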

What this server can do

Graphiti MCP provides tools for these capabilities:

Vector & semantic search

When to use it

Persistent agent memory in Cursor: facts learned in one session are queryable in the next via search_memory_facts
Track user preferences and requirements over time with real validity intervals instead of overwritten embeddings
Ingest structured CRM or product JSON with add_memory(source='json') and auto-extract entities and relationships
Audit what the agent 'knows' by tracing which episode created which facts with get_episode_entities

Quick setup

1. git clone https://github.com/getzep/graphiti.git && cd graphiti/mcp_server
2. cp .env.example .env and set OPENAI_API_KEY (plus SEMAPHORE_LIMIT for your rate-limit tier)
3. docker compose up: starts FalkorDB + the MCP server on http://localhost:8000/mcp/ (FalkorDB web UI on :3000)
4. Point HTTP-capable clients (Cursor, VS Code) at http://localhost:8000/mcp/; for Claude Desktop use `npx mcp-remote http://localhost:8000/mcp/`
5. Optional: `docker compose -f docker/docker-compose-neo4j.yml up` to run on Neo4j instead of FalkorDB
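
After starting the containers, it can help to wait until the server port accepts connections before pointing a client at it. A minimal sketch, assuming the default localhost port 8000 from the setup above; `wait_for_port` is a hypothetical helper, and an open port only shows that something is listening, not that the graph database is ready (the get_status tool covers that).

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Connection succeeded: the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait and retry until the deadline.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8000)` could gate a script that launches an MCP client.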

Security notes

Self-hosted: graph data stays in your FalkorDB/Neo4j instance, but every episode's content is sent to the configured LLM provider (OpenAI by default) for entity extraction. API keys live in a server-side .env file, and anonymous telemetry is on unless you set GRAPHITI_TELEMETRY_ENABLED=false.
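
A quick pre-flight check of the .env file can catch the two settings mentioned here. The sketch below uses a deliberately simple KEY=VALUE parser, and both helpers (`parse_dotenv`, `audit_env`) are hypothetical. It treats only the exact value `false` as an opt-out, matching the setting quoted above; whether the server accepts other spellings is not something this page confirms.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip("'\"")
    return env


def audit_env(env: dict[str, str]) -> list[str]:
    """Return human-readable findings for the settings discussed above."""
    findings = []
    if env.get("GRAPHITI_TELEMETRY_ENABLED", "") != "false":
        findings.append(
            "telemetry is on: set GRAPHITI_TELEMETRY_ENABLED=false to opt out"
        )
    if not env.get("OPENAI_API_KEY"):
        findings.append("OPENAI_API_KEY is not set (needed for the default provider)")
    return findings


sample = "OPENAI_API_KEY=sk-test\n# comment\nGRAPHITI_TELEMETRY_ENABLED=false\n"
print(audit_env(parse_dotenv(sample)))  # []
```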

Graphiti MCP FAQ

Is the Graphiti MCP server free?

Yes — Apache-2.0 and fully self-hosted, with FalkorDB bundled in the default container. Your real cost is LLM usage: each episode triggers multiple extraction/dedup/summarization calls to your configured provider, which is also why the SEMAPHORE_LIMIT concurrency setting matters.

How is this different from a vector-store memory server?

Graphiti builds an entity/relationship graph with bi-temporal validity rather than storing chunks. You get structured facts with valid_at/invalid_at history, date-range queries, community summaries, and provenance tracing — at the price of running a graph database and paying for extraction LLM calls.

Why am I seeing 429 rate-limit errors during ingestion?

Your SEMAPHORE_LIMIT is too high for your LLM tier. The default of 10 assumes roughly OpenAI Tier 3 (500 RPM); the README suggests 1-2 for free tiers and up to 20-50 for Tier 4. Each episode fans out into several concurrent LLM requests, so lower it until 429s stop.

#knowledge-graph #memory #neo4j #falkordb #agents

Alternatives to Graphiti MCP


Memory (Knowledge Graph)

AI, Data & Knowledge

74k

Official MCP server providing persistent, file-backed knowledge-graph memory across sessions.

Verified

stdio (local)

No auth

TypeScript

9 tools

Updated 5 months ago

Sequential Thinking

AI, Data & Knowledge

62k

Structured step-by-step reasoning tool for breaking problems into revisable thought sequences.

Verified

stdio (local)

No auth

TypeScript

1 tool

Updated 6 months ago

OpenMemory MCP

AI, Data & Knowledge

60k

Mem0's local-first memory layer: a Dockerized MCP server plus dashboard that keeps agent memories on your machine.

Unverified

SSE (remote)

API key

Python

5 tools

Updated 1 day ago
