How to Test an MCP Server Before You Ship
The Inspector, real tool calls, and a CI smoke test — the three checks that catch what a demo won't.

To test an MCP server, run it under the MCP Inspector first, then call each tool by hand and confirm the output matches the schema you advertised. That two-step check catches most breakage before a single client ever connects. The rest of this guide covers the full process: the Inspector, manual tool calls, wiring the server into a real client, and a CI check so it stays green.
Most servers you'll test run the same way. Roughly 90% of MCP servers run locally over stdio — a subprocess your client spawns and talks to over stdin/stdout — so the testing loop is fast and offline. Remote servers over HTTP/SSE add a network hop and auth, which I'll flag where it matters.
Start with the MCP Inspector
The MCP Inspector is the fastest way to test an MCP server, and it's the first thing to reach for. It's the official web UI that launches your server, lists its tools/resources/prompts, and lets you fire calls with arbitrary arguments — no client, no config file, no LLM in the loop.
Run it against any stdio server with npx:
npx @modelcontextprotocol/inspector npx -y @modelcontextprotocol/server-filesystem /tmp
That command boots the official Filesystem reference server scoped to /tmp and opens the Inspector in your browser. Click List Tools. If the list is empty or the server dies on startup, you have a wiring bug, not a logic bug — check your entry point and that dependencies actually install. Getting a clean tool list is the real "hello world" here.
For a remote server, point the Inspector at the URL and transport instead of a command, and paste in whatever token the server expects. If auth is wrong you'll see the connection fail before any tool shows up — a useful, early signal.
Call every tool by hand
A green tool list means the server loaded, not that it works. Next, call each tool with real arguments and read the raw result. This is where the actual bugs live.
For every tool, check three things:
- The happy path returns the shape you promised. If your inputSchema says a field is a number, pass a number and confirm the response isn't an error.
- Bad input fails cleanly. Pass a missing required arg or a wrong type. You want a clear MCP error, not a stack trace or a silent empty result, since the model has to be able to recover.
- Descriptions are honest. The tool's description is the only thing the model reads to decide when to call it. Vague or wrong descriptions cause the model to misfire, and that reads as a "broken" server even when the code is fine.
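When reading raw results for the first two checks, it can help to sort each response into a small set of outcomes. The sketch below is one way to do that in Python, assuming you have the JSON-RPC response to a tools/call request as a dict. The function name and the outcome labels are illustrative, not part of any SDK. It relies on the MCP convention that a failed tool run comes back as a normal result with isError set, while a protocol-level failure comes back as a JSON-RPC error object.

```python
from typing import Any


def classify_response(response: dict[str, Any]) -> str:
    """Sort a tools/call response into one of four outcomes.

    The labels ("protocol_error", "tool_error", "empty", "ok") are
    illustrative names chosen for this sketch.
    """
    if "error" in response:
        # JSON-RPC level failure, for example a malformed request.
        return "protocol_error"
    result = response.get("result", {})
    if result.get("isError"):
        # The tool ran and reported a failure the model can read and react to.
        return "tool_error"
    if not result.get("content") and "structuredContent" not in result:
        # No error flag and nothing returned: the silent empty result to avoid.
        return "empty"
    return "ok"


# Hypothetical responses, shaped like what the Inspector shows in its raw view.
bad_input = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [{"type": "text", "text": "path is required"}],
        "isError": True,
    },
}
print(classify_response(bad_input))
```

For a bad-input call, "tool_error" or "protocol_error" is a clean failure; "empty" or "ok" means the server accepted input it should have rejected.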
Do the same pass for resources and prompts if your server exposes them. Testing the GitHub MCP Server is a good example of why this matters — it ships dozens of tools across repos, issues, PRs, and Actions, and you want to exercise the ones you actually depend on, not assume they all behave.
Test it inside a real client
The Inspector proves the protocol works. It can't prove a model will use your server correctly — for that, load it into an actual client. This step catches the class of problems the Inspector is blind to.
The big one is the tool budget. Most coding clients degrade once the total tool count climbs past roughly 40 — Cursor, for instance, only sends about 40 tools to the model — so a server that exposes 30 tools eats most of that budget on its own and starves everything else. See Cursor's tool-limit math for the details. Test with your server enabled alongside the others you actually run, not in isolation.
Add the server to your client config the normal way (the walkthrough in how to add an MCP server covers each client), restart, and confirm:
| Check | What good looks like |
|---|---|
| Server connects | No red status; tools appear in the client's tool list |
| Model discovers tools | It calls your tool unprompted when the task fits |
| Descriptions steer correctly | It picks the right tool, not a neighbor |
| Budget is sane | Total tools stay near or under ~40 |
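The budget row is simple arithmetic, and a few lines of Python make it easy to rerun whenever you add a server. This is a minimal sketch: the limit of 40 is the approximate figure quoted above rather than a protocol constant, and the server names and tool counts are made up for illustration.

```python
TOOL_BUDGET = 40  # approximate figure from the text, not a hard protocol limit


def budget_report(tools_per_server: dict[str, int], budget: int = TOOL_BUDGET) -> dict:
    """Sum the tool counts across enabled servers and compare to the budget."""
    total = sum(tools_per_server.values())
    return {
        "total": total,
        "over_budget": total > budget,
        # Fraction of the budget each server consumes on its own.
        "share": {
            name: round(count / budget, 2)
            for name, count in tools_per_server.items()
        },
    }


# Hypothetical setup: the server under test plus two others you already run.
report = budget_report({"my-server": 30, "filesystem": 10, "docs": 2})
print(report["total"], report["over_budget"], report["share"]["my-server"])
```

With these illustrative numbers the 30-tool server alone takes three quarters of the budget, and the combined total lands over it.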
A clean way to sanity-check discovery: ask the model to do something only your server can do. If it reaches for a doc-lookup server like Context7 instead of yours, your tool descriptions are losing the competition for attention.
Wire up a CI smoke test
To keep a passing server passing, add a CI check that starts the server and lists its tools on every commit. A dependency bump or a refactor can break startup silently; a smoke test turns that into a red build instead of a bug report.
The check doesn't need to be clever. Start the server, send an initialize request, then tools/list, and assert the tool names you expect are present. Because stdio servers are just subprocesses, this runs in a plain GitHub Actions job with no services to spin up.
What to skip: don't try to assert on live model behavior in CI — it's slow, flaky, and non-deterministic. Keep CI to the protocol contract (server boots, tools list, schemas hold) and leave "does the model pick it" to manual client testing. Pin your MCP SDK version too, so a transitive update doesn't silently change your surface.
Security checks you shouldn't skip
Before you ship, test what your server refuses to do — not just what it does. A server that reads files or hits an API is a new attack surface, and the failure mode is quiet: it does exactly what it was asked, just for the wrong caller.
Two concrete tests. First, confirm scope enforcement: the Filesystem server should reject a path outside its allowed directories, so try one and watch it fail. Second, confirm secrets aren't echoed back in tool results or logs. If you're evaluating third-party servers, prefer official ones over community forks and read what actually matters for MCP security before granting write access. When a server misbehaves in a client, the troubleshooting guide is a faster path than guessing.
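The second test is easy to automate once you have tool results or log lines captured as Python data. The sketch below is an illustration, not a complete scanner: it walks a result and reports which known secret values appear verbatim in any string. The function names, the secret name, and the sample values are all hypothetical. It will not catch a secret that has been encoded, truncated, or otherwise transformed.

```python
from typing import Any, Iterator


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found in a nested structure of dicts and lists."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from iter_strings(key)
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def leaked_secrets(payload: Any, secrets: dict[str, str]) -> list[str]:
    """Return the names of secrets whose exact values appear in the payload."""
    strings = list(iter_strings(payload))
    return sorted(
        name
        for name, secret in secrets.items()
        if secret and any(secret in text for text in strings)
    )


# Hypothetical tool result that echoes a token back in an error message.
tool_result = {
    "content": [{"type": "text", "text": "401 from API using token tok_example_123"}],
    "isError": True,
}
print(leaked_secrets(tool_result, {"API_TOKEN": "tok_example_123"}))
```

Running the same function over captured server log lines covers the logging half of the check.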
Once the Inspector, manual calls, real client, and CI checks all pass, and the security tests above hold, the server is genuinely ready. Browse the directory to see how the tools you tested stack up against the alternatives.