Deployment Modes Brief¶
Created: 2026-03-03 8:45 PM EST
How ClawMux will run on different devices and network configurations.
The Problem¶
ClawMux currently runs on a single workstation (RTX 3090, Linux) with everything co-located: the hub, agents, TTS (Kokoro), and STT (Whisper). To make ClawMux a public app that works on any device, we need flexible deployment modes that adapt to what hardware and network the user has.
Three Deployment Modes¶
Mode 1: Thin Client (Terminal Mode)¶
The user's device (phone, tablet, laptop) acts purely as a terminal. Everything runs on a remote server.
┌──────────────┐ ┌────────────────────────────┐
│ Phone/iPad │◄───────►│ Remote Server │
│ (browser) │ HTTPS │ Hub + Agents + TTS + STT │
│ audio only │ │ (GPU required) │
└──────────────┘ └────────────────────────────┘
- Device does nothing except stream audio and display the UI
- All compute on the server
- Best for: phones, low-power devices, using ClawMux on the go
- Requires: network connection to server
Mode 2: Split Architecture (Recommended)¶
Hub and agents run locally. TTS and STT run on a remote GPU server.
┌─────────────────────────────┐ ┌──────────────────┐
│ Local Machine │ │ GPU Server │
│ Hub + Agents (Claude Code) │◄────►│ Kokoro TTS │
│ Browser UI │ HTTP │ Whisper STT │
│ No GPU needed │ │ (RTX 3070+) │
└─────────────────────────────┘ └──────────────────┘
- Hub runs on Mac/Linux/Windows, agents run locally
- TTS/STT calls go to remote server over Tailscale or HTTPS
- Just change
WHISPER_URLandKOKORO_URLin config - Best for: users with a Mac and a separate GPU machine
- Requires: network connection to GPU server
Mode 3: Fully Local¶
Everything runs on the user's machine. No network dependency.
┌───────────────────────────────────────┐
│ User's Machine │
│ Hub + Agents + TTS + STT │
│ (Apple Silicon or NVIDIA GPU) │
│ Kokoro via mlx-audio (Mac) │
│ or kokoro-fastapi (NVIDIA) │
│ Whisper via whisper.cpp (Mac) │
│ or faster-whisper (NVIDIA) │
└───────────────────────────────────────┘
- Everything on one machine, works offline
- Mac: mlx-audio for TTS (~3x realtime on M4+), whisper.cpp with CoreML for STT
- NVIDIA: kokoro-fastapi for TTS, faster-whisper for STT
- Best for: privacy-focused users, offline use, powerful laptops
- Requires: Apple Silicon (M3 Pro+) or NVIDIA GPU (RTX 3070+ with 8GB+ VRAM)
Hardware Requirements¶
| Mode | Device | GPU | RAM | Network |
|---|---|---|---|---|
| Thin Client | Any (browser) | None | 2GB | Required |
| Split | Mac/Linux/Windows | None locally | 8GB+ | Required (to GPU server) |
| Fully Local (Mac) | M3 Pro or better | Integrated | 16GB+ | None |
| Fully Local (NVIDIA) | Any with GPU | RTX 3070+ (8GB) | 16GB+ | None |
Quality Modes¶
Users should be able to choose a quality/performance tradeoff, especially on lower-end hardware:
| Quality | Whisper Model | Kokoro Speed | Latency | VRAM |
|---|---|---|---|---|
| High | large-v3-turbo | Full quality | ~200ms | 6GB+ |
| Medium | medium | Slightly faster | ~150ms | 3GB+ |
| Low | small | Fast, lower accuracy | ~100ms | 1.5GB+ |
| Tiny | tiny | Fastest | ~50ms | <1GB |
Quality mode should be configurable per-user in the settings UI and stored in the project config. This allows users with weaker hardware (like an 8GB 3070 Ti) to drop to medium quality to free up VRAM for other tasks, or users on CPU-only to use tiny for acceptable latency.
3070 Ti Compatibility¶
An RTX 3070 Ti with 8GB VRAM can run both Kokoro TTS and Whisper STT, but not simultaneously with large models. Recommended setup:
- Whisper: large-v3-turbo (fits in 8GB) or medium for headroom
- Kokoro: runs alongside Whisper with room to spare
- Performance: comparable to RTX 3090, slightly lower throughput
- Use Medium quality mode if running other GPU workloads simultaneously
Installer¶
For public release, ClawMux needs a clean installer that:
- Detects the user's hardware (Apple Silicon, NVIDIA, CPU-only)
- Installs the right TTS/STT backends automatically
- Sets up the hub and default project
- Configures the deployment mode based on available hardware
- Provides a single command to start everything:
clawmux start
Renaming¶
The codebase needs cleanup:
- Rename the git repo folder from
voice-chattoclawmux - Rename the MCP server from
voice-hubtoclawmux - Update all references in CLAUDE.md files, skills, and documentation
Session Transfer Between Instances¶
Users need to seamlessly transfer between two ClawMux instances. Example: start a conversation on workstation, continue on Mac. This requires:
- History sync: Conversation history needs to be portable between instances (already stored as JSON files)
- Agent state transfer: Agent roles, assignments, and context need to move with the session
- Config portability: Project settings, voice assignments, deployment config should be shareable
- Real-time handoff: Ideally, one instance can "hand off" to another without losing the conversation flow
Research needed on the best approach: file sync (rsync/git), real-time replication, or export/import mechanism.
Current Status¶
The hub is fully functional on workstation (Mode 1/2 already work). Mode 3 needs mlx-audio and whisper.cpp integration. The installer is not started. Renaming is a straightforward find-and-replace task.