v0.2.0 - Voice Modes

Released features and improvements.

Features

  • MCP server + WebSocket bridge — Single-process server with FastMCP stdio transport and embedded FastAPI + WebSocket for browser audio.
  • Browser client — Vanilla JavaScript UI with tap-to-record mic, connection status indicator, and auto-reconnect.
  • Whisper STT integration — OpenAI-compatible speech-to-text via whisper.cpp on GPU.
  • Kokoro TTS integration — OpenAI-compatible text-to-speech via kokoro-fastapi on GPU.
  • Tailscale remote access — HTTPS + WSS proxy via tailscale serve for access from any device on the tailnet.
  • No recording timeout — Removed 120-second and 60-second timeouts on recording and playback waits.
  • Hardware requirements documentation — VRAM and RAM usage table in the getting started guide.
  • Roadmap and project documentation — Structured docs site with Zensical, folder hierarchy, and navigation tabs.
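
As a rough illustration of what "OpenAI-compatible" means for the Kokoro TTS integration listed above, the sketch below builds a text-to-speech request in the shape of the OpenAI audio speech endpoint (`/v1/audio/speech`). The base URL, port, model name, and voice are assumptions chosen for illustration and are not taken from this project's configuration. The helper names `build_speech_request` and `synthesize` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder address for a local OpenAI-compatible TTS server.
TTS_BASE_URL = "http://localhost:8880/v1"


def build_speech_request(
    text: str,
    voice: str = "af_bella",  # assumption: example voice name
    model: str = "kokoro",    # assumption: example model name
) -> urllib.request.Request:
    """Build a POST request in the OpenAI audio speech format (hypothetical helper)."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url=f"{TTS_BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str) -> bytes:
    """Send the request and return the raw audio bytes (needs a running server)."""
    with urllib.request.urlopen(build_speech_request(text)) as resp:
        return resp.read()
```

Calling `synthesize("Hello from the voice bridge")` would return encoded audio only if a compatible server is listening at the assumed address. Building the request separately from sending it keeps the payload shape easy to inspect without a GPU service running.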