v0.2.0 - Voice Modes
Features and improvements released in v0.2.0.
Features
- MCP server + WebSocket bridge — Single-process server with FastMCP stdio transport and embedded FastAPI + WebSocket for browser audio.
- Browser client — Vanilla JavaScript UI with tap-to-record mic, connection status indicator, and auto-reconnect.
- Whisper STT integration — OpenAI-compatible speech-to-text via whisper.cpp on GPU.
- Kokoro TTS integration — OpenAI-compatible text-to-speech via kokoro-fastapi on GPU.
- Tailscale remote access — HTTPS + WSS proxy via `tailscale serve` for access from any device on the tailnet.
- No recording timeout — Removed 120-second and 60-second timeouts on recording and playback waits.
- Hardware requirements documentation — VRAM and RAM usage table in the getting started guide.
- Roadmap and project documentation — Structured docs site with Zensical, folder hierarchy, and navigation tabs.
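
The STT and TTS integrations above are described as OpenAI-compatible, so a client can talk to them with the same request shape the OpenAI audio API uses. The following is a minimal sketch of building a text-to-speech request, not the project's actual client code. The base URL, port, model name, and voice name are all assumptions chosen for illustration, and the helper `build_speech_request` is hypothetical. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: placeholder base URL for a local OpenAI-compatible TTS server.
# The release notes do not state a host or port.
TTS_BASE_URL = "http://localhost:8880/v1"


def build_speech_request(
    text: str,
    voice: str = "af_heart",  # assumption: illustrative voice name
    model: str = "kokoro",    # assumption: illustrative model name
) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI /audio/speech shape."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{TTS_BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_speech_request("Hello from the voice server")
    # Show what would be sent; pass req to urllib.request.urlopen to send it.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Because only the base URL changes between an OpenAI-hosted endpoint and a local one, the same helper could in principle be pointed at any server that implements this request shape.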