v0.4.0 - iOS App

iOS companion app, browser UI overhaul, and hub reliability fixes.

iOS Companion App

  • Native Swift app — Full iOS client with WebSocket connection to the hub, audio playback, recording, and conversation display.
  • Three input modes — Auto voice (hands-free), Push-to-Talk, and Typing mode with per-mode settings.
  • Typing mode — Text input with no TTS/STT. Hub text_mode flag skips audio processing.
  • PTT mode — Hold-to-talk with ZStack-centered mic button.
  • Background voice mode — Silence keepalive loop with VAD auto-stop for hands-free recording when app is backgrounded.
  • Live Activity — Mode-aware Dynamic Island and Lock Screen display with voicehub://mic deep link.
  • Local notifications — Sent on assistant_text when app is backgrounded.
  • Settings page — Configurable model, server URL, per-mode sound/haptic toggles, VAD tuning.
  • Thinking indicator — Shown immediately on thinking message from hub.
  • Audio cues — Thinking sounds, listening cue, processing cue, session ready chime.
  • Spotify compatibility — Audio session interruption observer pauses silence loop.

Browser UI

  • Visual overhaul — Dark/light mode with CSS custom properties, iOS-style design language, safe area support.
  • Voice grid landing page — Grid of voice cards with live session status, spawning feedback, and click-to-connect.
  • Home tab — Navigate back to voice grid without losing sessions.
  • Settings panel — Model selection, auto-record, auto-end, auto-interrupt toggles.
  • Debug panel — tmux sessions table, hub log viewer, Kill All Sessions button.
  • Text input mode — Type messages to the agent instead of speaking.
  • Thinking sounds — Double-tick audio pattern while agent processes.
  • Chat persistence — Messages saved to localStorage and restored on reload.

Hub Improvements

  • Goodbye parameter — converse() accepts goodbye=true to explicitly end sessions. Prevents premature session_ended on wait_for_response=false calls.
  • Thinking signal — Hub sends thinking message at start of handle_converse so clients show immediate feedback.
  • Session timeout — Increased to 120 minutes.
  • Resilient converse flow — Includes the v0.3.1 improvements to error handling and faster session spawning.
  • Multi-client broadcast fix — send_to_browser iterates over list(browser_clients) to avoid set-changed-during-iteration errors.
  • No-cache headers — Index route sends Cache-Control: no-cache so browsers always get fresh HTML.
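
The multi-client broadcast fix above can be illustrated with a minimal sketch. Only the names send_to_browser and browser_clients come from these notes; the FakeClient class, the JSON message shape, and the choice to drop clients on ConnectionError are assumptions made for the example, not the hub's actual code.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a connected browser WebSocket."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.received = []

    async def send(self, message):
        if self.fail:
            raise ConnectionError(f"{self.name} disconnected")
        self.received.append(message)


# Live set of connected clients; other coroutines may add or remove entries.
browser_clients = set()


async def send_to_browser(message):
    """Broadcast a message to every client, dropping clients whose send fails."""
    # Iterate over a snapshot. Mutating the live set while looping over it
    # (here, or from another coroutine during an await) can raise
    # "RuntimeError: Set changed size during iteration".
    for client in list(browser_clients):
        try:
            await client.send(message)
        except ConnectionError:
            # Safe: we are looping over the snapshot, not the set itself.
            browser_clients.discard(client)


async def demo():
    alive = FakeClient("alive")
    dead = FakeClient("dead", fail=True)
    browser_clients.update({alive, dead})
    await send_to_browser('{"type": "thinking"}')  # message shape is assumed
    return alive, dead
```

Calling asyncio.run(demo()) delivers the message to the healthy client and removes the failed one from browser_clients without interrupting the loop. The snapshot costs one small list copy per broadcast, which is usually negligible next to the network sends.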

Audio Fixes

  • Safari autoplay — Switched from new Audio() to the Web Audio API (audioCtx.decodeAudioData) so the second and subsequent voice responses play on iOS Safari.
  • Repeated listening cue — Added pendingListenSessionId guard to ignore re-sent listening messages.
  • Listening cue cooldown — 2-second cooldown prevents rapid-fire cue sounds.
  • VAD grace period — 0.8s delay to ignore audio cue bleedthrough on mic open.
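
The three timing guards above (repeated-cue guard, 2-second cooldown, 0.8s VAD grace period) can be sketched as a small state machine. The real logic lives in the browser and iOS clients; this Python version mirrors the described behavior only. The CueGate class, its method names, and the injectable clock are hypothetical, and only the two durations and the idea of a pending session id come from these notes.

```python
import time

LISTEN_CUE_COOLDOWN_S = 2.0  # cooldown between listening cues (from the notes)
VAD_GRACE_PERIOD_S = 0.8     # delay before VAD reacts after mic open (from the notes)


class CueGate:
    """Hypothetical helper deciding when to play the cue and when VAD may run."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so the timing is testable
        self._pending_session_id = None  # plays the role of pendingListenSessionId
        self._last_cue_at = None
        self._mic_opened_at = None

    def should_play_listening_cue(self, session_id):
        now = self._clock()
        # Guard 1: ignore a re-sent listening message for the same session.
        if session_id == self._pending_session_id:
            return False
        # Guard 2: suppress cues that arrive inside the cooldown window.
        if (self._last_cue_at is not None
                and now - self._last_cue_at < LISTEN_CUE_COOLDOWN_S):
            return False
        self._pending_session_id = session_id
        self._last_cue_at = now
        return True

    def recording_finished(self):
        # Clear the guard so the next listening message for this session counts.
        self._pending_session_id = None

    def mic_opened(self):
        self._mic_opened_at = self._clock()

    def vad_active(self):
        # VAD ignores audio until the grace period has elapsed, so the cue
        # sound bleeding into the mic is not mistaken for speech.
        if self._mic_opened_at is None:
            return False
        return self._clock() - self._mic_opened_at >= VAD_GRACE_PERIOD_S
```

Passing a fake clock (for example, lambda: t[0] over a mutable list) lets the cooldown and grace period be exercised without real sleeps.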

Documentation

  • CLAUDE.md — Added project instructions warning about hub.html syntax fragility.
  • Conversation dynamics — New doc explaining the converse cycle, message flow, and who controls what.
  • Agent reference docs — WebSocket protocol, UI behavior reference, iOS dev and web dev guides.