MCP Server Architecture Diagram – From Spaghetti Integrations to Lego Bricks
Ever caught yourself writing the 37th custom Slack-to-LLM glue script? Same auth dance, same pagination headache, same prompt hacking to explain the schema to the model. You’re not alone—and there’s a better way.
MCP (Model Context Protocol) turns that spaghetti into Lego bricks. One protocol, any data source, swap pieces at will. Instead of writing N custom integrations with N different auth layers and N different schema mappers, you write one standardized MCP server and plug it into any MCP-compatible host.
This guide walks through the architecture that makes it possible, with real code you can run today.
The MCP Server Architecture Diagram Explained
The MCP architecture has three layers: host apps (like Claude Desktop) embed MCP clients, and each client speaks JSON-RPC 2.0 over stdio or HTTP/SSE to an isolated MCP server. The LLM only sees tool names, descriptions, and JSON schemas, never implementation details.
Stack Walk-Through in 90 Words
The architecture has three layers:
- Host (Claude Desktop, Cursor, etc.) holds the LLM and orchestrates conversations.
- MCP Client lives inside the host (one per server) and handles lifecycle, auth retries, and notifications.
- MCP Server is a tiny service you write; it exposes named tools (functions), resources (read-only data), and prompts (templates).
Transport is pluggable: local stdio for zero-config desktop add-ons, HTTP/SSE for remote SaaS deployments. The LLM never sees implementation details—just a tool name, description, and JSON-Schema arguments.
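To make that concrete, here is roughly the shape of a `tools/list` result for the Grist tool built below. This JSON is everything the model ever learns about the tool (field layout follows the MCP spec; the payload is trimmed for brevity):

```json
{
  "tools": [{
    "name": "get_grist_records",
    "description": "Fetch all records from a Grist table",
    "inputSchema": {"type": "object", "properties": {}}
  }]
}
```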
Live Example – Grist Integration in Under 40 Lines
Let’s build a working MCP server that fetches records from a Grist database. Save this as mcp_server.py:
```python
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import asyncio, httpx, os

DOC_ID, TABLE_ID = "myDoc", "myTable"
GRIST_KEY = os.getenv("GRIST_KEY")

server = Server("mymcpshelf")

@server.list_tools()
async def list_tools() -> list[Tool]:
    return [Tool(
        name="get_grist_records",
        description="Fetch all records from a Grist table",
        inputSchema={"type": "object", "properties": {}},
    )]

@server.call_tool()
async def call_tool(name: str, args: dict) -> list[TextContent]:
    if name != "get_grist_records":
        raise ValueError(f"Unknown tool: {name}")
    async with httpx.AsyncClient() as c:
        r = await c.get(
            f"https://docs.getgrist.com/api/docs/{DOC_ID}/tables/{TABLE_ID}/records",
            headers={"Authorization": f"Bearer {GRIST_KEY}"},
        )
        r.raise_for_status()
        # Raw JSON payload: { "records": [ … ] }
        return [TextContent(type="text", text=r.text)]

async def main():
    # stdio transport: the client spawns this process and pipes stdin/stdout
    async with stdio_server() as (read, write):
        await server.run(read, write, server.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())
```
Install the dependencies, export your API key, and run:

```bash
pip install mcp httpx
export GRIST_KEY="your-api-key-here"
python mcp_server.py
```
That’s it. Under forty lines, and you have a working MCP server; run through the hardening checklist below before calling it production-ready.
Wire It to Claude Desktop
Open ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows) and add:
```json
{
  "mcpServers": {
    "grist": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_server.py"],
      "env": {
        "GRIST_KEY": "your-api-key-here"
      }
    }
  }
}
```

Use an absolute path to the script; Claude Desktop’s working directory is not your project folder.
Restart Claude Desktop. You’ll instantly see a new tool icon in the interface. Ask “List my Grist records” and the model calls your server without any prompt hacking or schema explanation.
MCP Server Architecture Diagram: Sequence Flow
Here’s what happens under the hood:
- Client → MCP Server: `initialize` request via stdio
- MCP Server → Client: `initialize` response (capabilities handshake)
- Client → MCP Server: `initialized` notification, then `tools/call "get_grist_records"`
- MCP Server → Grist API: `GET /records` with bearer token
- Grist API → MCP Server: JSON payload with records
- MCP Server → Client: `{"result": …}` back through stdio
The beauty of stdio: no ports, no firewall rules, no network configuration. The client spawns your Python process, pipes stdin/stdout, and that’s the entire transport layer.
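Because the transport is just newline-delimited JSON-RPC over those pipes, you can inspect the traffic directly. The opening frame a client writes looks roughly like this (the `protocolVersion` and `clientInfo` values here are illustrative):

```json
{"jsonrpc": "2.0", "id": 0, "method": "initialize",
 "params": {"protocolVersion": "2024-11-05", "capabilities": {},
            "clientInfo": {"name": "example-host", "version": "1.0"}}}
```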
HTTP/SSE Pattern (Remote MCP Server)
Stdio works great for local tools, but what if you need remote access or want to serve multiple users? Swap stdio for HTTP with Server-Sent Events.
Add this to your mcp_server.py, replacing the stdio entry point at the bottom:
```python
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.responses import Response
from starlette.routing import Mount, Route

sse = SseServerTransport("/messages/")

async def handle_sse(request):
    # Events stream out on /sse; the client POSTs its JSON-RPC to /messages/
    async with sse.connect_sse(request.scope, request.receive, request._send) as (read, write):
        await server.run(read, write, server.create_initialization_options())
    return Response()

app = Starlette(routes=[
    Route("/sse", endpoint=handle_sse),
    Mount("/messages/", app=sse.handle_post_message),
])

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)
```
Run it:

```bash
pip install starlette uvicorn
uvicorn mcp_server:app --host 0.0.0.0 --port 8080
```
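To sanity-check the server, open the event stream directly; the transport should answer with an endpoint event telling clients where to POST their JSON-RPC messages:

```bash
# -N disables buffering so SSE events print as they arrive
curl -N http://localhost:8080/sse
```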
Claude Desktop config for HTTP/SSE:
```json
{
  "mcpServers": {
    "grist-remote": {
      "url": "https://my-mcp-server.example.com/sse",
      "transport": "sse"
    }
  }
}
```
When to Use HTTP/SSE
- Multi-tenant SaaS deployments – One server, many users
- OAuth2 flows – Need browser redirects for authorization
- Centralized logging – Track usage across all clients
- Cross-network access – Dev machine → cloud database
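For the multi-tenant case, the SSE handler is the natural choke point for per-user auth: reject the connection before it ever reaches the MCP server. A minimal sketch, assuming clients send a bearer token (`verify_token` is a hypothetical stand-in for your user store):

```python
from starlette.responses import JSONResponse

def verify_token(token: str) -> bool:
    # Hypothetical check; replace with a real lookup against your user store
    return bool(token)

async def handle_sse_authed(request):
    token = (request.headers.get("authorization") or "").removeprefix("Bearer ")
    if not verify_token(token):
        return JSONResponse({"error": "unauthorized"}, status_code=401)
    return await handle_sse(request)  # hand the authenticated connection to MCP
```

Register `handle_sse_authed` on the `/sse` route in place of `handle_sse`, and every connection gets checked before the MCP handshake starts.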
Real-World HTTP/SSE Examples
- Stripe MCP Server – Handles OAuth2 for payment processing, serves multiple merchants
- Notion MCP Server – Multi-workspace, multi-user access with centralized rate limiting
- AWS Bedrock Knowledge Bases – Cloud-native, IAM-secured document retrieval
Multi-Tool vs Single-Purpose Servers
One of the biggest architecture decisions: should your MCP server do one thing or many things?
| Single-Purpose Server | Multi-Tool Server |
|---|---|
| github-issues-server (1 tool: create_issue) | github-server (8 tools: repos, issues, PRs, releases…) |
| ✅ Least privilege (narrow token scopes) | ✅ Fewer config entries |
| ✅ Easy to audit & test | ✅ Share auth tokens & rate-limits |
| ✅ Swap implementations without breakage | ⚠️ Over-permissioned tokens |
| ⚠️ Config file bloat (20+ entries) | ⚠️ Tight coupling (tool changes = reboot) |
The Right Approach
Start single-purpose. If you hit 5+ servers for one service (Slack channels, Slack messages, Slack users…), consolidate into a bounded multi-tool server.
The key word is bounded—all tools should relate to the same service or domain. Don’t mix GitHub + Slack + Stripe in one server just because you can.
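In code, bounded just means every tool registered on the server belongs to the same domain. A sketch with hypothetical GitHub-scoped tools, following the same SDK pattern as the Grist example (tool names and schemas are illustrative):

```python
from mcp.server import Server
from mcp.types import Tool

github = Server("github")  # one bounded server for one service

@github.list_tools()
async def list_github_tools() -> list[Tool]:
    return [
        Tool(name="create_issue",
             description="Open an issue in a repository",
             inputSchema={"type": "object",
                          "properties": {"repo": {"type": "string"},
                                         "title": {"type": "string"}},
                          "required": ["repo", "title"]}),
        Tool(name="list_pull_requests",
             description="List open pull requests for a repository",
             inputSchema={"type": "object",
                          "properties": {"repo": {"type": "string"}},
                          "required": ["repo"]}),
        # ...more GitHub-scoped tools, but never Slack or Stripe ones
    ]
```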
Examples from MyMCPShelf
- Single-purpose done right: SQLite MCP – One job: query SQLite databases. No feature creep.
- Multi-tool done right: GitHub Official MCP – 20+ tools, but all GitHub-scoped (repos, issues, PRs, actions).
- Avoid: “Swiss Army knife” servers that mix unrelated APIs and require dozens of permission scopes.
Production Hardening Checklist
Before you deploy an MCP server beyond your laptop, lock down these eight areas (a code sketch covering a few of them follows the list):

- One job per tool – Use action verbs (`create_issue`, `fetch_records`), not nouns. Each tool should do exactly one thing.
- Lock the schema – Use JSON-Schema draft 2020-12 with strict types, enums for allowed values, and `required` arrays. No free-form strings where enums would work.
- Idempotency keys – Same input → same output. Essential for retry logic. Hash the input args and cache responses for 5 minutes.
- Validate & sanitize – Never pass raw user strings into SQL queries or shell commands. Use parameterized queries and escape user input.
- Rate-limit & paginate – Filter at the source, not in the prompt. Return max 100 items and provide pagination tokens for more.
- Return human summaries – The LLM quotes your response. Return structured data plus a natural-language summary to reduce hallucinations.
- Observability – Log every tool call with timestamp, user ID, tool name, latency, and success/failure. Use structured logging (JSON lines).
- Least-privilege tokens – Your Grist bearer token should have read-only access to one table, not admin access to all docs.
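Here is a minimal sketch of three of those items (locked schema, idempotency cache, structured logging), assuming an in-memory cache; `FETCH_SCHEMA`, `_cache`, and `TTL` are illustrative names:

```python
import hashlib, json, logging, time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("mcp.tools")

# Locked schema: enums and required fields instead of free-form strings
FETCH_SCHEMA = {
    "type": "object",
    "properties": {
        "table": {"type": "string", "enum": ["myTable"]},
        "limit": {"type": "integer", "minimum": 1, "maximum": 100},
    },
    "required": ["table"],
}

# Idempotency: hash the args, cache responses for 5 minutes
_cache: dict[str, tuple[float, str]] = {}
TTL = 300  # seconds

async def cached_call(name: str, args: dict, fetch) -> str:
    key = hashlib.sha256(f"{name}:{json.dumps(args, sort_keys=True)}".encode()).hexdigest()
    if key in _cache and time.time() - _cache[key][0] < TTL:
        return _cache[key][1]
    start = time.time()
    result = await fetch(args)
    # Observability: one structured JSON line per tool call
    log.info(json.dumps({"ts": start, "tool": name, "ok": True,
                         "latency_ms": round((time.time() - start) * 1000)}))
    _cache[key] = (time.time(), result)
    return result
```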
Production Examples
- Security auditing: Semgrep MCP Server scans code for vulnerabilities before execution
- Database safety: PostgreSQL MCP uses read-only connection strings and parameterized queries
Why This Beats “Yet Another API Wrapper”
| Spaghetti Integration | Lego Brick (MCP) |
|---|---|
| N auth layers (Slack OAuth, GitHub token, Stripe key…) | 1 client, pluggable transports |
| N schema mappers (custom JSON → LLM prompt text) | 1 JSON-Schema per tool |
| Prompt hacking (“here’s the API response format…”) | Zero-shot tool use |
| Tightly coupled (host knows about every API) | Language-agnostic micro-service |
| Hard to audit (prompts scattered across codebase) | Built-in request/response tracing |
The difference: separation of concerns. The LLM host doesn’t need to know how to authenticate with Grist, paginate results, or handle rate limits. It just knows there’s a tool called get_grist_records that returns records.
When Grist changes their API, you update one file (mcp_server.py). When you swap Grist for Airtable, you swap one server. The host configuration stays identical.
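For example, pointing the host at a hypothetical Airtable server is just a different command in the same config shape:

```json
{
  "mcpServers": {
    "records": {
      "command": "python",
      "args": ["/absolute/path/to/airtable_mcp_server.py"]
    }
  }
}
```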
TL;DR
- MCP = USB-C for AI integrations. One protocol, any data source.
- Three layers: Host ↔ MCP Client ↔ MCP Server (JSON-RPC 2.0).
- Write a few dozen lines, drop it into Claude Desktop, done.
- Swap transports (stdio ↔ HTTP/SSE) without touching business logic.
- Your 38th integration is just another Lego brick—no more spaghetti.
Next Steps
Ready to build your own MCP servers or explore what’s already available?
- ⭐ Star the official MCP repo to follow protocol updates
- 🚀 Submit your server to awesome-mcp once you’ve built something useful
- 🔍 Browse 180+ ready-made servers organized by category:
  - Database Tools – SQLite, PostgreSQL, MongoDB, MySQL
  - Communication – Slack, Discord, Telegram, WhatsApp
  - File Management – Google Drive, AWS S3, local filesystem
  - Developer Tools – GitHub, GitLab, Docker, Kubernetes
  - All Servers →
Happy building—keep stacking bricks, not spaghetti. 🧱