v0.2.0 - Voice Modes

Features and improvements released in v0.2.0.

Features

MCP server + WebSocket bridge — Single-process server with FastMCP stdio transport and embedded FastAPI + WebSocket for browser audio.
Browser client — Vanilla JavaScript UI with tap-to-record mic, connection status indicator, and auto-reconnect.
Whisper STT integration — OpenAI-compatible speech-to-text via whisper.cpp on GPU.
Kokoro TTS integration — OpenAI-compatible text-to-speech via kokoro-fastapi on GPU.
Tailscale remote access — HTTPS + WSS proxy via tailscale serve for access from any device on the tailnet.
No recording timeout — Removed 120-second and 60-second timeouts on recording and playback waits.
Hardware requirements documentation — VRAM and RAM usage table in the getting started guide.
Roadmap and project documentation — Structured docs site with Zensical, folder hierarchy, and navigation tabs.
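
As an illustration of what "OpenAI-compatible" means for the Kokoro TTS integration above, the following is a minimal sketch of building a text-to-speech request in the OpenAI audio API shape (`POST /v1/audio/speech` with a JSON body). The base URL, port, model name, and voice name are assumptions chosen for illustration and are not taken from this project's configuration; the helper `build_speech_request` is hypothetical. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: the TTS server listens here. Adjust to match your own setup.
TTS_BASE_URL = "http://localhost:8880/v1"


def build_speech_request(text, voice="af_bella", base_url=TTS_BASE_URL):
    """Build (but do not send) an OpenAI-style text-to-speech request.

    The model and voice names are illustrative placeholders.
    """
    payload = {
        "model": "kokoro",         # hypothetical model identifier
        "input": text,             # text to synthesize
        "voice": voice,            # hypothetical voice identifier
        "response_format": "mp3",  # audio container requested from the server
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_speech_request("Hello from voice mode.")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(req)` would return the audio bytes if a compatible server is running at the assumed address. The speech-to-text side follows the same pattern with a multipart upload to `/v1/audio/transcriptions`.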