Cora AI

TypeScript, Node.js, Electron, AI, Voice, Real-Time

About this project

A real-time AI voice assistant that orchestrates speech-to-text, LLM inference, and text-to-speech into a single low-latency pipeline for fluid voice conversations. It features wake word activation, interruptible and patient conversation modes, and a model-agnostic architecture supporting Claude, GPT, Gemini, and local models.

Architecture Overview

Microphone → Wake Word Engine (Picovoice, runs locally) → Streaming STT (Deepgram Nova-2) → LLM with streaming inference (Claude/GPT/Gemini) → Streaming TTS (ElevenLabs) → Speaker. A Turn Manager enforces user-first priority across all stages, with conversation mode logic controlling interrupt behavior.
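
The user-first rule can be pictured as a small state machine that decides who holds the conversational floor. The sketch below is illustrative only: `TurnManager`, its method names, and the three floor states are assumptions made for this example, not Cora's actual source.

```typescript
// Hypothetical sketch: none of these names come from Cora's codebase.
type Floor = "idle" | "user" | "assistant";

class TurnManager {
  private floor: Floor = "idle";
  private cancelAssistant: (() => void) | null = null;

  // Who currently holds the conversational floor.
  state(): Floor {
    return this.floor;
  }

  // User speech always wins: abort any assistant output in flight
  // (LLM stream, TTS playback) and hand the floor to the user.
  userStartedSpeaking(): void {
    if (this.floor === "assistant" && this.cancelAssistant !== null) {
      this.cancelAssistant();
    }
    this.cancelAssistant = null;
    this.floor = "user";
  }

  userFinishedSpeaking(): void {
    if (this.floor === "user") {
      this.floor = "idle";
    }
  }

  // The assistant may only start when nobody holds the floor.
  // `cancel` is invoked if the user barges in before it finishes.
  tryStartAssistant(cancel: () => void): boolean {
    if (this.floor !== "idle") {
      return false;
    }
    this.floor = "assistant";
    this.cancelAssistant = cancel;
    return true;
  }

  assistantFinished(): void {
    if (this.floor === "assistant") {
      this.floor = "idle";
      this.cancelAssistant = null;
    }
  }
}
```

In a design like this, the conversation mode logic would decide whether a given speech event is forwarded to `userStartedSpeaking` at all, which is one way a mode could control interrupt behavior.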

Key Features

Voice Pipeline

End-to-end streaming pipeline — Deepgram Nova-2 for speech-to-text, Claude/GPT for inference, ElevenLabs for text-to-speech — targeting sub-1-second total latency.

Interruptible Conversations

Speak mid-response and Cora immediately stops, listens, and responds with full context of what it already said — mimicking natural human conversation.
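
One way to keep "full context of what it already said" is to record only the portion of a reply that was actually played before the interruption. The following is a minimal sketch under that assumption; `HistoryEntry`, `commitAssistantReply`, and the sentence-level chunking are hypothetical and chosen for illustration.

```typescript
// Hypothetical sketch: names and shapes are assumptions for illustration.
interface HistoryEntry {
  role: "user" | "assistant";
  content: string;
  interrupted: boolean;
}

// Record an assistant reply after playback ends or is cut off.
// `sentences` is the full generated reply split into TTS chunks;
// `sentencesPlayed` is how many chunks finished playing.
function commitAssistantReply(
  history: HistoryEntry[],
  sentences: string[],
  sentencesPlayed: number,
): HistoryEntry[] {
  const played = Math.max(0, Math.min(Math.floor(sentencesPlayed), sentences.length));
  if (played === 0) {
    return history; // the user heard nothing, so nothing is recorded
  }
  const entry: HistoryEntry = {
    role: "assistant",
    content: sentences.slice(0, played).join(" "),
    interrupted: played < sentences.length,
  };
  // Return a new array so the caller's history is not mutated.
  return [...history, entry];
}
```

With the history trimmed this way, the next LLM request sees what the user heard rather than what was generated, and the `interrupted` flag lets the prompt note that the reply was cut short.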

Model Agnostic

Swap the underlying LLM (Claude, GPT-4o, Gemini, or local models via Ollama) without rebuilding the pipeline. The architecture is designed to be provider-independent.
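
Provider independence of this kind is commonly achieved by having the pipeline depend on a small interface rather than on any vendor SDK. The sketch below shows the idea; `LLMProvider`, `ProviderRegistry`, and `EchoProvider` are hypothetical names, and no real Claude, GPT, Gemini, or Ollama call is made.

```typescript
// Hypothetical sketch: this interface is an assumption, not a real SDK API.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LLMProvider {
  readonly name: string;
  // Calls `onToken` for each piece of text as it arrives and
  // resolves when the reply is complete.
  streamReply(messages: ChatMessage[], onToken: (token: string) => void): Promise<void>;
}

// Stand-in provider for tests and offline use; a real adapter would
// wrap a vendor SDK or a local model server behind the same interface.
class EchoProvider implements LLMProvider {
  readonly name = "echo";

  async streamReply(messages: ChatMessage[], onToken: (token: string) => void): Promise<void> {
    const last = messages[messages.length - 1];
    const text = last ? last.content : "";
    // Emit the last message back word by word, skipping empty pieces.
    for (const word of text.split(" ").filter((w) => w.length > 0)) {
      onToken(word);
    }
  }
}

class ProviderRegistry {
  private readonly providers = new Map<string, LLMProvider>();

  register(provider: LLMProvider): void {
    this.providers.set(provider.name, provider);
  }

  // The rest of the pipeline depends only on LLMProvider, so swapping
  // models is a lookup by name rather than a code change.
  get(name: string): LLMProvider {
    const provider = this.providers.get(name);
    if (provider === undefined) {
      throw new Error(`Unknown LLM provider: ${name}`);
    }
    return provider;
  }
}
```

Under this arrangement, the streaming TTS stage would consume tokens from `onToken` regardless of which provider produced them.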

Wake Word Activation

Always-on local wake word detection via Picovoice Porcupine. No audio leaves the device until activation — privacy by design.
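
The privacy guarantee amounts to a gate in front of the network: frames are inspected locally and dropped until the wake word fires. The sketch below illustrates that gate under stated assumptions; `WakeGate` is a hypothetical name, and the `detectWakeWord` callback stands in for a local engine rather than reproducing the Porcupine API.

```typescript
// Hypothetical sketch: the detector callback stands in for a local
// wake word engine; it is not the Porcupine API.
type Frame = Int16Array;

class WakeGate {
  private awake = false;
  private readonly detectWakeWord: (frame: Frame) => boolean;
  private readonly sendUpstream: (frame: Frame) => void;

  constructor(
    detectWakeWord: (frame: Frame) => boolean,
    sendUpstream: (frame: Frame) => void,
  ) {
    this.detectWakeWord = detectWakeWord;
    this.sendUpstream = sendUpstream;
  }

  // Every microphone frame passes through here. Until the wake word
  // is detected locally, frames are dropped and nothing is sent.
  push(frame: Frame): void {
    if (!this.awake) {
      this.awake = this.detectWakeWord(frame);
      return; // the frame that triggered detection is not forwarded either
    }
    this.sendUpstream(frame);
  }

  // Called when the conversation ends to resume local-only listening.
  sleep(): void {
    this.awake = false;
  }
}
```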

Three Conversation Modes

Interruptible mode for fast-paced Q&A, Patient mode for long instructions, and Text mode for traditional chat — all sharing one unified conversation history.
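
A shared history across modes can be modeled by treating a mode as a behavior policy layered over one conversation object. The sketch below is an assumption-laden illustration: the policy fields, the silence thresholds, and the choice to disable barge-in in Patient mode are guesses, and `Conversation` and `MODE_POLICIES` are hypothetical names.

```typescript
// Hypothetical sketch: the policy fields and numeric values are guesses.
type Mode = "interruptible" | "patient" | "text";

interface ModePolicy {
  voiceInput: boolean;        // is the microphone pipeline active?
  bargeIn: boolean;           // may user speech cut off playback?
  endOfTurnSilenceMs: number; // pause length that ends the user's turn
}

const MODE_POLICIES: Record<Mode, ModePolicy> = {
  interruptible: { voiceInput: true, bargeIn: true, endOfTurnSilenceMs: 500 },
  patient: { voiceInput: true, bargeIn: false, endOfTurnSilenceMs: 2500 },
  text: { voiceInput: false, bargeIn: false, endOfTurnSilenceMs: 0 },
};

interface Utterance {
  role: "user" | "assistant";
  content: string;
  mode: Mode; // which mode produced this entry
}

class Conversation {
  private mode: Mode = "interruptible";
  private readonly history: Utterance[] = [];

  // Switching modes changes behavior only; the history is untouched.
  setMode(mode: Mode): void {
    this.mode = mode;
  }

  policy(): ModePolicy {
    return MODE_POLICIES[this.mode];
  }

  add(role: Utterance["role"], content: string): void {
    this.history.push({ role, content, mode: this.mode });
  }

  // Snapshot of the single history shared by every mode.
  transcript(): Utterance[] {
    return [...this.history];
  }
}
```

Because every mode appends to the same list, a user could start by voice and continue in Text mode without losing earlier turns.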

Cross-Platform

Desktop app (Electron), mobile app, and headless CLI mode. Minimal always-on-top floating window with waveform visualizer and status indicators.

Tech Stack

TypeScript, Node.js, Electron, Deepgram Nova-2, Claude API, ElevenLabs TTS, Picovoice Porcupine, WebSocket

Development Roadmap

Voice Pipeline Architecture Design
Product Requirements Document
Web Presence (heycora.org)
Wake Word Detection (Porcupine)
Streaming STT (Deepgram)
LLM Integration (Claude API)
Streaming TTS (ElevenLabs)
Interrupt Handler & Turn Manager
Desktop App (Electron)
Text Mode Chat Interface
Mobile App