Cora AI
About this project
A real-time AI voice assistant that orchestrates speech-to-text, LLM inference, and text-to-speech into a single low-latency pipeline for fluid voice conversations. Features wake-word activation, three conversation modes (interruptible, patient, and text), and a model-agnostic architecture supporting Claude, GPT, Gemini, and local models.
Architecture Overview
Microphone → Wake Word Engine (Picovoice, runs locally) → Streaming STT (Deepgram Nova-2) → LLM with streaming inference (Claude/GPT/Gemini) → Streaming TTS (ElevenLabs) → Speaker. A Turn Manager enforces user-first priority across all stages, with conversation mode logic controlling interrupt behavior.
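The stage chain above can be sketched as a streaming function composition. This is a minimal illustration, not Cora's actual code; all names (run_pipeline, the stub stages) are assumptions, and the stubs stand in for the real Deepgram/LLM/ElevenLabs clients so the sketch runs offline.

```python
# Minimal sketch of the STT -> LLM -> TTS stage chain.
# All names are illustrative, not the project's real API.
def run_pipeline(audio_frames, stt, llm, tts):
    """Stream captured audio through the three stages, yielding audio out."""
    transcript = stt(audio_frames)         # audio frames -> text
    for text_chunk in llm(transcript):     # text -> streamed response chunks
        yield tts(text_chunk)              # each chunk -> synthesized audio

# Stub stages so the sketch runs without any real services or API keys.
stt = lambda frames: "hello cora"
llm = lambda text: iter(["Hi", " there"])
tts = lambda chunk: chunk.encode()

out = list(run_pipeline([b"..."], stt, llm, tts))
```

The key design point is that downstream stages consume partial output as soon as it exists, rather than waiting for the previous stage to finish.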
Key Features
Voice Pipeline
End-to-end streaming pipeline — Deepgram Nova-2 for speech-to-text, Claude/GPT for inference, ElevenLabs for text-to-speech — targeting sub-1-second total latency.
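A sub-1-second target only works if each stage is measured by time-to-first-output, not time-to-completion. The per-stage numbers below are illustrative assumptions for such a budget, not measured figures from the project.

```python
# Illustrative time-to-first-output budget per stage (assumed numbers,
# not measurements). Wake-word detection is local and always-on, so it
# does not sit on the response path.
budget_ms = {
    "stt_first_partial": 200,   # first partial transcript from STT
    "llm_first_token": 400,     # first streamed token from the LLM
    "tts_first_audio": 250,     # first audio chunk from TTS
}
total_ms = sum(budget_ms.values())
```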
Interruptible Conversations
Speak mid-response and Cora immediately stops, listens, and responds with full context of what it already said — mimicking natural human conversation.
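One way to keep "full context of what it already said" is to commit only the spoken prefix to history when an interrupt lands, discarding the unplayed remainder of the planned reply. The class below is a hypothetical sketch of that bookkeeping, not the project's actual TurnManager.

```python
# Sketch of interrupt handling: on user barge-in, stop playback and
# record only the text that was actually spoken. Names are illustrative.
class TurnManager:
    def __init__(self):
        self.history = []
        self.spoken = ""          # response text actually played so far

    def on_tts_chunk(self, text):
        self.spoken += text       # called as each TTS chunk is played

    def on_user_interrupt(self, user_text):
        # Keep what Cora really said, not the full planned reply,
        # so follow-up turns see an accurate conversation state.
        if self.spoken:
            self.history.append(("assistant", self.spoken))
        self.spoken = ""
        self.history.append(("user", user_text))

tm = TurnManager()
tm.on_tts_chunk("The capital of France")
tm.on_user_interrupt("wait, what about Spain?")
```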
Model Agnostic
Swap the underlying LLM (Claude, GPT-4o, Gemini, or local models via Ollama) without rebuilding the pipeline. The architecture is designed to be provider-independent.
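Provider independence usually comes down to one narrow streaming interface that every backend implements. The interface and the EchoProvider below are hypothetical stand-ins to show the shape of such an abstraction; the project's real one may differ.

```python
# Hypothetical provider abstraction: swap backends without touching
# the pipeline. EchoProvider is a stub so the sketch runs offline.
from abc import ABC, abstractmethod
from typing import Iterator

class LLMProvider(ABC):
    @abstractmethod
    def stream(self, messages: list[dict]) -> Iterator[str]:
        """Yield response chunks for a chat-style message list."""

class EchoProvider(LLMProvider):
    """Stand-in backend: echoes the last user message word by word."""
    def stream(self, messages):
        yield from messages[-1]["content"].split()

def reply(provider: LLMProvider, messages):
    # The pipeline only ever sees the interface, never a vendor SDK.
    return " ".join(provider.stream(messages))

out = reply(EchoProvider(), [{"role": "user", "content": "hello world"}])
```

A Claude, GPT, Gemini, or Ollama adapter would each implement stream() over its own SDK, leaving the rest of the pipeline untouched.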
Wake Word Activation
Always-on local wake word detection via Picovoice Porcupine. No audio leaves the device until activation — privacy by design.
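The privacy property can be expressed as a gate in front of the network: frames are dropped on-device until the detector fires. The sketch below uses a plain callable where Porcupine's per-frame keyword check would sit; everything here is illustrative.

```python
# Sketch of privacy gating: no audio is forwarded until the local
# wake-word detector fires. detect() stands in for the real
# frame-level keyword check (e.g. Porcupine's process loop).
def gate(frames, detect):
    active = False
    for frame in frames:
        if not active and detect(frame):
            active = True
            continue            # the wake frame itself is not forwarded
        if active:
            yield frame         # only post-activation audio leaves the device

frames = ["noise", "noise", "hey cora", "what's the weather"]
sent = list(gate(frames, detect=lambda f: f == "hey cora"))
```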
Three Conversation Modes
Interruptible mode for fast-paced Q&A, Patient mode for long instructions, and Text mode for traditional chat — all sharing one unified conversation history.
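The three modes differ mainly in their barge-in policy while sharing one history. The mode names mirror the feature list; the policy mapping below is an assumption about how each mode treats user speech mid-response.

```python
# Illustrative mode table: names match the feature list; the interrupt
# policy per mode is an assumption, not confirmed project behavior.
from enum import Enum

class Mode(Enum):
    INTERRUPTIBLE = "interruptible"  # user speech cancels TTS immediately
    PATIENT = "patient"              # waits out long pauses before replying
    TEXT = "text"                    # typed chat over the same history

def allows_barge_in(mode: Mode) -> bool:
    """Only interruptible mode lets user speech cut off playback."""
    return mode is Mode.INTERRUPTIBLE
```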
Cross-Platform
Desktop app (Electron), mobile app, and headless CLI mode. Minimal always-on-top floating window with waveform visualizer and status indicators.