MCP servers for vector & semantic search
Store and query embeddings for RAG and long-term agent memory.
4 servers · Last updated June 17, 2026
TL;DR: These servers give your agent a vector store: they upsert embeddings and run similarity search for RAG and persistent memory. They differ on whether they're a dedicated vector DB, a general database with vector support, or an in-process memory store. This is the retrieval half of any RAG stack.
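To make "upserting embeddings and running similarity search" concrete, here is a minimal sketch of what a vector store does conceptually. It assumes tiny hand-written vectors and cosine similarity; `TinyVectorStore` and its methods are hypothetical names for illustration, not the API of any server listed here.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Toy in-memory store (hypothetical): upsert by id, search by cosine similarity."""

    def __init__(self):
        self._items = {}  # id -> (vector, payload)

    def upsert(self, item_id, vector, payload=None):
        # Insert a new record, or overwrite the one already stored under item_id.
        self._items[item_id] = (list(vector), payload)

    def search(self, query, top_k=3):
        # Score every stored vector against the query and keep the best top_k.
        scored = [
            (cosine_similarity(query, vector), item_id, payload)
            for item_id, (vector, payload) in self._items.items()
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_k]


store = TinyVectorStore()
store.upsert("doc-1", [1.0, 0.0, 0.0], {"text": "about cats"})
store.upsert("doc-2", [0.0, 1.0, 0.0], {"text": "about databases"})
# Upserting an existing id replaces the old vector and payload.
store.upsert("doc-1", [0.9, 0.1, 0.0], {"text": "about cats, revised"})

best_score, best_id, best_payload = store.search([1.0, 0.0, 0.0], top_k=1)[0]
```

A real vector database replaces the linear scan with an approximate nearest-neighbour index and gets its vectors from an embedding model, but the upsert/search contract is roughly the same shape.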
Bottom line: if you only try one, Memory (Knowledge Graph) is the most popular verified option in this category (74,000★). The other three are compared below.
Compare 4 servers
| Server | Transport | Auth | Stars | Relevant tools |
|---|---|---|---|---|
| Memory (Knowledge Graph) | Local (stdio) | No auth | 74,000 | create_entities, delete_entities, read_graph +2 |
| Qdrant MCP Server | Local (stdio) | API key | 1,100 | qdrant-store, qdrant-find |
| Chroma MCP Server | Local (stdio) | API key | 600 | chroma_list_collections, chroma_create_collection, chroma_peek_collection +4 |
| Pinecone Developer MCP Server | Local (stdio) | API key | 500 | upsert-records, search-records |
The servers
Memory (Knowledge Graph): Official MCP server providing persistent, file-backed knowledge-graph memory across sessions.
Qdrant MCP Server: Official Qdrant server using a vector collection as semantic memory: store and find embeddings.
Chroma MCP Server: Official Chroma server: create collections and run vector, full-text, and metadata search.
Pinecone Developer MCP Server: Official Pinecone server: manage indexes, upsert/search records, rerank, and search Pinecone docs.
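The "persistent, file-backed knowledge-graph memory" idea differs from a vector store in that it keeps named entities rather than embeddings. The sketch below illustrates that idea only: the method names echo the tool names in the comparison table, but the signatures, the JSON layout, and the `FileBackedGraph` class are assumptions, not the Memory server's actual implementation or storage format.

```python
import json
import os
import tempfile


class FileBackedGraph:
    """Hypothetical entity store that reloads its state from a JSON file on every call."""

    def __init__(self, path):
        self.path = path

    def read_graph(self):
        # A missing file simply means nothing has been remembered yet.
        if not os.path.exists(self.path):
            return {"entities": {}}
        with open(self.path, "r", encoding="utf-8") as fh:
            return json.load(fh)

    def _write(self, graph):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(graph, fh, indent=2)

    def create_entities(self, entities):
        graph = self.read_graph()
        for entity in entities:
            # setdefault keeps an existing entity untouched instead of overwriting it.
            graph["entities"].setdefault(
                entity["name"],
                {
                    "type": entity["type"],
                    "observations": list(entity.get("observations", [])),
                },
            )
        self._write(graph)

    def delete_entities(self, names):
        graph = self.read_graph()
        for name in names:
            graph["entities"].pop(name, None)
        self._write(graph)


path = os.path.join(tempfile.mkdtemp(), "memory.json")

# "Session one" writes two entities to disk.
session_one = FileBackedGraph(path)
session_one.create_entities([
    {"name": "alice", "type": "person", "observations": ["prefers Python"]},
    {"name": "acme", "type": "company"},
])
session_one.delete_entities(["acme"])

# "Session two" is a fresh object that only shares the file path.
session_two = FileBackedGraph(path)
remembered = session_two.read_graph()
```

Because lookups here are by exact name rather than by similarity, this style of memory suits small, structured context; fuzzy recall over large corpora is where an embedding-based store fits better.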
FAQ
Dedicated vector DB vs memory server?
Use a dedicated vector store (Chroma, Qdrant, Pinecone) for scale and persistence; a lightweight memory server is fine for small, single-agent context.
What pairs with a vector server for RAG?
A scraping/search server to gather content, plus an embeddings step. See the RAG agent stack for a ready-made combination.
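As a rough sketch of how those pieces fit together: gather text, chunk it, embed each chunk, then retrieve the closest chunk for a question and place it in the prompt. Everything here is a stand-in chosen for illustration: `embed` is a toy bag-of-words counter rather than a real embedding model, the `gathered` list plays the role of a scraping/search server's output, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=20):
    # Split text into pieces of at most max_words words.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    # Toy stand-in for an embedding model: a sparse bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity over sparse count vectors; Counter returns 0 for missing tokens.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(index, question, top_k=1):
    # Rank indexed chunks by similarity to the question and return their text.
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]


# Pretend output of a scraping/search step.
gathered = [
    "Qdrant stores vectors in collections and finds the nearest neighbours of a query.",
    "Sourdough bread needs a starter, flour, water, salt, and patience.",
]

# Embeddings step: one (vector, text) pair per chunk.
index = [(embed(piece), piece) for page in gathered for piece in chunk(page)]

context = retrieve(index, "How does Qdrant find nearest neighbours?")
prompt = "Answer using this context:\n" + "\n".join(context)
```

In a real stack the `index` list would live in one of the vector servers above, and `embed` would call an embedding model so that paraphrases match even when no words overlap.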