MCP Server & Agent Architecture¶
What's an MCP Server?¶
MCP (Model Context Protocol) is how Claude Code gets extra capabilities beyond reading and writing files. An MCP server gives Claude new tools it can call — in our case, the ability to speak and listen through the ClawMux.
Each agent in the sidebar has its own MCP server running in the background. It's a thin bridge between Claude and the hub — Claude calls a tool like `converse("Hello!")`, the MCP server forwards it to the hub, and the hub handles all the audio (TTS, playback, recording, STT).
How Agents Connect¶
When you click a voice card in the sidebar:
- The hub creates a working directory and starts Claude Code in a tmux session
- Claude loads the MCP server, which connects to the hub via WebSocket
- The hub sends the `/voice-hub` skill to Claude, activating voice mode
- Claude sets its project status in the sidebar
- Claude greets you and starts listening
Each agent runs independently — they have their own terminal, their own MCP connection, and their own conversation history.
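The first two steps above can be sketched roughly as follows. This is a sketch under stated assumptions, not the hub's actual code: `SESSIONS_DIR`, `tmux_command`, and `start_agent` are illustrative names, and the `voice-{voice_id}` session naming is an invented convention.

```python
import pathlib
import subprocess

# Path taken from the "Agent Identity" section below.
SESSIONS_DIR = pathlib.Path("/tmp/voice-hub-sessions")

def tmux_command(voice_id: str, workdir: pathlib.Path) -> list[str]:
    """Build the tmux invocation: a detached session running Claude Code."""
    return [
        "tmux", "new-session",
        "-d",                       # detached: runs in the background
        "-s", f"voice-{voice_id}",  # session name derived from the agent (assumed)
        "-c", str(workdir),         # start in the agent's working directory
        "claude",
    ]

def start_agent(voice_id: str) -> list[str]:
    workdir = SESSIONS_DIR / voice_id
    workdir.mkdir(parents=True, exist_ok=True)  # step 1: working directory
    cmd = tmux_command(voice_id, workdir)
    subprocess.run(cmd, check=True)             # step 2: Claude Code in tmux
    return cmd
```

Once Claude is running inside the tmux session, it loads the MCP server from its config, and that server opens the WebSocket back to the hub.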
Available Tools¶
Agents currently have three tools:
- `converse` — Speak a message to the user. Can optionally wait for a spoken reply. This is the main tool agents use for all voice interaction.
- `set_project_status` — Update the sidebar to show what project and area the agent is working on (e.g. "voice-hub · frontend"). Agents call this on startup and whenever their context changes.
- `voice_chat_status` — Check if a browser is connected. Agents call this on startup to make sure someone is listening.
The /voice-hub Skill¶
The `/voice-hub` skill is a Claude Code slash command that activates voice chat mode. It's sent to each agent automatically when they start up. The skill tells Claude to:
- Check browser connection with `voice_chat_status`
- Set project status with `set_project_status`
- Greet the user via `converse`
- Process spoken requests and respond via `converse`
- Keep the conversation going until the user says goodbye
See the raw skill: /voice-hub skill
Agent Identity (CLAUDE.md)¶
Every agent gets a CLAUDE.md file in its working directory when it's created. This is how Claude knows its name, personality, and behavior — it's the first thing Claude reads when it starts.
The template is generated by `session_manager.py` and written to `/tmp/voice-hub-sessions/{voice_id}/CLAUDE.md`.
See the raw template: CLAUDE.md Template
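A minimal sketch of how such a template might be rendered. The template text and the `render_claude_md` helper are invented for illustration, not taken from `session_manager.py`.

```python
from string import Template

# Hypothetical template body; the real one lives in session_manager.py.
CLAUDE_MD = Template(
    "# $name\n\n"
    "You are $name, a voice agent. Personality: $personality.\n"
    "Speak to the user through the `converse` tool and keep replies short.\n"
)

def render_claude_md(name: str, personality: str) -> str:
    """Fill in the identity fields; the result is written to the agent's workdir."""
    return CLAUDE_MD.substitute(name=name, personality=personality)
```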
Adding New Tools¶
The MCP server is defined in hub_mcp_server.py. To add a new tool:
- Define a new function with `@mcp.tool` in `hub_mcp_server.py` — this is what Claude sees and can call
- Handle the message in `hub.py` — the hub receives it via WebSocket and does the actual work
- If the tool needs to update the browser UI, the hub broadcasts a message to the frontend, and `hub.html` handles it in the `handleMessage` function
The pattern is always the same: Claude calls tool → MCP server forwards to hub → hub does work → hub notifies browser.
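That relay can be sketched with plain queues standing in for the two WebSocket hops. The message shapes here (`"type"`, `"play_audio"`) are assumptions for illustration, not the hub's actual protocol.

```python
import json
import queue

to_hub = queue.Queue()      # MCP server -> hub (stand-in for a WebSocket)
to_browser = queue.Queue()  # hub -> frontend (hub.html)

def mcp_tool_converse(message: str) -> None:
    """What an @mcp.tool function does: serialize the call and forward it."""
    to_hub.put(json.dumps({"type": "converse", "message": message}))

def hub_step() -> None:
    """The hub's side: do the actual work, then notify the browser."""
    msg = json.loads(to_hub.get())
    if msg["type"] == "converse":
        audio = f"<tts:{msg['message']}>"  # stand-in for real speech synthesis
        to_browser.put(json.dumps({"type": "play_audio", "audio": audio}))
```

A new tool follows the same shape: a small forwarding function on the MCP side, a matching handler branch in the hub, and (if the UI changes) a case in `handleMessage`.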
Session Lifecycle¶
Starting: Click a voice card → hub creates tmux + working directory + MCP config + CLAUDE.md → Claude starts → MCP server connects → `/voice-hub` skill activates → agent sets project status → agent greets you
Running: You speak → hub transcribes → Claude thinks and works → Claude calls `converse()` → hub synthesizes speech → you hear it
Ending: Say "goodbye" → agent calls `converse(goodbye=true)` → hub terminates the session. You can also right-click a card to kill it, or the session ends after an idle timeout.
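The three phases above can be summarized as a toy state machine. The state and event names are paraphrased from this section, not taken from `hub.py`.

```python
from enum import Enum, auto

class SessionState(Enum):
    STARTING = auto()  # tmux + workdir created, MCP connecting, skill loading
    RUNNING = auto()   # speak -> transcribe -> converse() -> synthesize loop
    ENDED = auto()     # session terminated

def transition(state: SessionState, event: str) -> SessionState:
    """Advance the session on an event; unknown events leave the state unchanged."""
    if state is SessionState.STARTING and event == "greeted":
        return SessionState.RUNNING
    if state is SessionState.RUNNING and event in ("goodbye", "killed", "idle_timeout"):
        return SessionState.ENDED
    return state
```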
Key Files¶
| File | What it does |
|---|---|
| `hub.py` | The main server — handles WebSockets, audio routing, session management |
| `hub_mcp_server.py` | The MCP server that gives Claude voice tools |
| `session_manager.py` | Creates and manages agent sessions (tmux, working dirs, CLAUDE.md) |
| `hub_config.py` | Configuration (ports, timeouts, model settings) |
| `static/hub.html` | The browser interface (HTML + CSS + JS, all in one file) |
| `.claude/commands/voice-hub.md` | The `/voice-hub` skill that activates voice mode |