MCP Directory

How does MCP actually work? The request flow in plain English

Strip away the buzzwords and MCP is almost boringly simple: three parts, one connection, four steps. Once you see the flow, every 'why won't it connect' problem becomes obvious.

Hua·June 27, 2026·5 min read·Updated June 27, 2026

MCP works like this: an AI host (Claude, Cursor) runs a client that opens a connection to one or more separate server programs; on connect it asks each server "what tools do you have?", and from then on the model can call those tools and get results back — all over JSON-RPC, a plain request/response format. That's the whole thing. Everything below is just detail.
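To make "a plain request/response format" concrete, here is a minimal sketch of a JSON-RPC 2.0 request and the response that answers it. The helper names (make_request, make_response) and the sample tool list_tables are hypothetical, chosen only for illustration.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def make_response(request_id, result):
    """Build the JSON-RPC 2.0 success response for a given request id."""
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


request = make_request(1, "tools/list")

# Hypothetical tool entry, for illustration only.
response = make_response(1, {
    "tools": [{
        "name": "list_tables",
        "description": "List the tables in the database",
        "inputSchema": {"type": "object", "properties": {}},
    }]
})

# The shared id is what ties a response back to its request.
print(json.dumps(request))
print(json.dumps(response, indent=2))
```

Every exchange in the rest of the flow has this same shape: a method name plus optional params going out, a result (or an error) coming back under the same id.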

The 30-second version

Three parts, and people constantly conflate them:

  • The host is the app + the model (Claude Desktop, Cursor). It decides when to use a tool.
  • The client lives inside the host and speaks the protocol. It's the plumbing.
  • The server is the separate program exposing tools — the thing you install.

One host, many servers. The host talks to each server through its client. That's the entire topology.
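As a rough sketch of that topology, the host can be modeled as an object that keeps one client per server connection, which is how the MCP specification describes the relationship. The class names and the two server names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """The plumbing: one client instance per server connection."""
    server_name: str


@dataclass
class Host:
    """The app: owns a client for each server it is configured to use."""
    clients: dict = field(default_factory=dict)

    def connect(self, server_name):
        client = Client(server_name)
        self.clients[server_name] = client
        return client


host = Host()
for name in ["database", "filesystem"]:  # hypothetical server names
    host.connect(name)

print(sorted(host.clients))
```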

Step by step: what happens when you ask

Say you ask Claude "what tables are in my database?". Here's the real sequence:

  1. Discovery (on startup). When the client connects to a server, the two first exchange a short initialization handshake, then the client sends a tools/list request. The server replies with its tools, each with a name, a description, and an input schema. This is why a server that connects but shows "no tools" is a different problem from one that won't connect at all: the first fails at discovery, the second fails before it.
  2. The model decides. Claude reads those tool descriptions and decides a database tool fits. It doesn't run anything yet — it proposes a tool call with arguments.
  3. The tool call. The client sends a tools/call request to the right server with those arguments. The server does the work (queries the DB) and returns the result.
  4. The model answers. The result goes back to the model, which turns it into your answer.
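The four steps above can be sketched end to end with an in-memory stand-in for the server. This is a simplified illustration, not a real MCP implementation: FakeServer, pick_tool, the list_tables tool, and the canned result "users, orders" are all hypothetical, and the "model" is a keyword-matching stub.

```python
class FakeServer:
    """Stand-in for an MCP server that answers two JSON-RPC methods."""

    def handle(self, request):
        if request["method"] == "tools/list":
            result = {"tools": [{
                "name": "list_tables",
                "description": "List the tables in the database",
                "inputSchema": {"type": "object", "properties": {}},
            }]}
        elif request["method"] == "tools/call":
            # A real server would query the database here.
            result = {"content": [{"type": "text", "text": "users, orders"}]}
        else:
            return {"jsonrpc": "2.0", "id": request["id"],
                    "error": {"code": -32601, "message": "Method not found"}}
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def pick_tool(question, tools):
    """Stub for the model: pick the first tool whose description shares a
    word with the question, and propose a call with empty arguments."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    for tool in tools:
        if words & set(tool["description"].lower().split()):
            return {"name": tool["name"], "arguments": {}}
    return None


server = FakeServer()
question = "what tables are in my database?"

# Step 1: discovery.
listing = server.handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
tools = listing["result"]["tools"]

# Step 2: the model proposes a call; nothing has run yet.
proposal = pick_tool(question, tools)

# Step 3: the client sends the proposed call to the server.
call = server.handle({"jsonrpc": "2.0", "id": 2,
                      "method": "tools/call", "params": proposal})

# Step 4: the result goes back to the model, which phrases the answer.
answer = "Your tables: " + call["result"]["content"][0]["text"]
print(answer)
```

Note that step 2 produces only a proposal (a tool name plus arguments). The server is not touched again until step 3 sends it.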

The key insight: the model picks tools by reading their descriptions, and every connected server adds more descriptions to read. That's why piling on servers backfires: more similar-looking tools mean worse choices. It's the whole reason the tool budget exists.

Why it (mostly) runs locally

About 90% of servers use stdio transport: the server runs as a local process and the client talks to it over standard input/output. The connection itself involves no network, although a local server can still call external APIs on your behalf. The remaining ~10% are remote (HTTP/SSE) servers you reach by URL.
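Here is a minimal sketch of what "talks to it over standard input/output" looks like in practice. It assumes one JSON message per line and uses a toy server (the SERVER_CODE string, hypothetical) that always reports an empty tool list. A real MCP server does considerably more.

```python
import json
import subprocess
import sys

# Toy server: reads one JSON-RPC message per line on stdin and answers on
# stdout. Hypothetical stand-in for a real MCP server.
SERVER_CODE = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"tools": []}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
'''

# The "connection" is just a child process with piped stdin and stdout.
proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()

reply = json.loads(proc.stdout.readline())
print(reply)

proc.stdin.close()    # closing stdin ends the server's read loop
proc.wait(timeout=5)
```

No socket, port, or URL appears anywhere in the exchange, which is why network troubleshooting does not apply to this kind of connection.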

This matters more than it sounds: for a local stdio server, a connection failure is never a network problem; it's the process failing to start or crashing. Which leads to the practical payoff.

What this means when something breaks

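Because the flow is this simple, every failure maps to a step. A server that won't connect at all failed before discovery, which for a local stdio server means the process crashed. A server that connects but shows "no tools" failed at discovery. A tool that is listed but never used points at the model's decision, and a tool that is called but returns an error points at the tool call itself.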

Understand the four steps and you stop guessing. The log tells you which step failed; the fix follows from there.
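For the case of a local stdio server that won't connect, the evidence is usually the process's exit code and whatever it wrote to stderr. The sketch below launches a deliberately broken stand-in server and reads both. The BROKEN_SERVER command and its "missing API key" message are hypothetical.

```python
import subprocess
import sys

# Hypothetical broken server: exits immediately with an error, much as a
# server with a missing dependency or bad config would.
BROKEN_SERVER = "import sys; sys.stderr.write('missing API key\\n'); sys.exit(1)"

proc = subprocess.Popen(
    [sys.executable, "-c", BROKEN_SERVER],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# communicate() waits for the process to exit and collects its output.
stdout, stderr = proc.communicate(timeout=10)

if proc.returncode != 0:
    diagnosis = f"server process exited with code {proc.returncode}: {stderr.strip()}"
else:
    diagnosis = "server process exited cleanly"

print(diagnosis)
```

A non-zero exit code before any tools/list reply places the failure ahead of step 1, so the fix lies in the server's command, dependencies, or configuration rather than in the protocol.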

Want the vocabulary nailed down? The MCP glossary defines every term used here in one sentence each.

FAQ

How does MCP work in simple terms?

An AI host (like Claude or Cursor) connects to separate server programs, asks each one what tools it has, and then lets the model call those tools and use the results — all over a simple JSON-RPC request/response connection.

What protocol does MCP use under the hood?

JSON-RPC 2.0. That's why MCP error codes look the way they do: -32601 (method not found) is a standard JSON-RPC code, and -32000 (connection closed) falls in the range JSON-RPC reserves for implementation-defined server errors.
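As an illustration, a JSON-RPC error response carries a numeric code and a message, and a small helper can sort codes into the categories the JSON-RPC 2.0 specification defines. The describe_error function is a hypothetical helper, not part of any MCP library.

```python
# Error codes predefined by the JSON-RPC 2.0 specification.
STANDARD_CODES = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}


def describe_error(response):
    """Summarize the error in a JSON-RPC response, or return None if absent."""
    error = response.get("error")
    if error is None:
        return None
    code = error["code"]
    if code in STANDARD_CODES:
        return f"{code}: {STANDARD_CODES[code]} (standard JSON-RPC)"
    if -32099 <= code <= -32000:
        # Range reserved for implementation-defined server errors.
        return f"{code}: {error['message']} (implementation-defined)"
    return f"{code}: {error['message']} (application-defined)"


not_found = {"jsonrpc": "2.0", "id": 7,
             "error": {"code": -32601, "message": "Method not found"}}
closed = {"jsonrpc": "2.0", "id": 8,
          "error": {"code": -32000, "message": "Connection closed"}}

print(describe_error(not_found))
print(describe_error(closed))
```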

Does an MCP server run locally or in the cloud?

Usually locally. About 90% of servers use stdio transport and run as a process on your own machine; only ~10% are remote HTTP/SSE servers you connect to by URL.
