Project A.L.I.C.E.
A.L.I.C.E. (Autonomous Linked Intelligent Compute Engine) is a simple, hands-free, 100% locally running voice assistant that enables natural spoken interaction with Ollama LLMs, designed with future secure remote access in mind.
Overview
A.L.I.C.E. is a lightweight voice interface for Ollama that enables completely offline, hands-free conversations using local speech-to-text, local LLM inference, and local text-to-speech, while serving as the foundation for a securely accessible compute engine.
Problem
Power users and privacy-conscious individuals want to interact naturally with their local Ollama models through voice, without typing or relying on cloud-based voice services. They also need a solution that can eventually evolve from a local-only assistant into a securely accessible intelligent compute engine running on a private server.
Constraints
- Must run 100% locally today to guarantee absolute data privacy
- Hands-free operation using reliable wakeword detection
- Low enough resource usage to run comfortably on everyday hardware
- Responsive latency for natural-feeling conversations
- Designed with future secure remote access in mind (modular architecture)
Approach
Built a clean Python application that chains openwakeword for wakeword detection, webrtcvad for intelligent silence-based recording, faster-whisper for accurate offline transcription, Ollama for generative responses, and Piper TTS for fast, high-quality speech output.
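The whole pipeline reduces to a short loop. The sketch below is illustrative only: the helper names are placeholders, not the project's actual functions.

```python
# High-level shape of the assistant loop. Each helper wraps one stage of the
# pipeline; the names here are placeholders for illustration only.
def main_loop():
    while True:
        wait_for_wakeword()             # openwakeword: block until "hey alice"
        audio = record_until_silence()  # webrtcvad: stop when the speaker pauses
        text = transcribe(audio)        # faster-whisper: offline speech-to-text
        reply = ask_llm(text)           # Ollama: local LLM inference
        speak(reply)                    # Piper TTS: synthesize and play the answer
```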
Key Decisions
faster-whisper for Speech-to-Text
Delivers strong accuracy and performance on CPU while remaining fully offline, with support for small, efficient models such as tiny.en and base.en.
- Original OpenAI Whisper (Slower without acceleration)
- Cloud-based STT (Violates current local-only requirement)
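As a rough illustration of the pattern, a minimal faster-whisper transcription looks like this; the WAV file name and beam size are placeholder assumptions:

```python
from faster_whisper import WhisperModel

# int8 quantization keeps CPU usage and memory low on everyday hardware.
model = WhisperModel("base.en", device="cpu", compute_type="int8")

# transcribe() returns a generator of segments plus metadata about the audio.
segments, info = model.transcribe("utterance.wav", beam_size=5)
text = " ".join(segment.text.strip() for segment in segments)
print(text)
```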
Piper TTS for Speech Synthesis
Extremely fast and lightweight with natural-sounding voices, providing low-latency responses even on modest hardware.
- eSpeak or Festival (Lower voice quality)
- Cloud TTS services (Privacy and latency concerns)
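A minimal sketch of driving Piper from Python via its documented CLI; the voice model path and the aplay playback step are assumptions, not the project's confirmed setup:

```python
import subprocess

def speak(text: str, voice: str = "en_US-lessac-medium.onnx") -> None:
    """Synthesize text with the Piper CLI, then play the resulting WAV.

    The voice model path and the use of aplay are illustrative assumptions.
    """
    subprocess.run(
        ["piper", "--model", voice, "--output_file", "reply.wav"],
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(["aplay", "reply.wav"], check=True)  # any WAV player works
```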
openwakeword + webrtcvad for Listening Logic
Combines reliable always-on wakeword detection with smart voice activity detection to record only when the user speaks, reducing false triggers and unnecessary processing.
- Continuous always-listening transcription (High CPU and battery impact)
- Push-to-talk (Breaks hands-free experience)
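A sketch of how the two listening stages can be wired together on top of sounddevice; the wakeword model path, frame sizes, and thresholds below are illustrative assumptions:

```python
import numpy as np
import sounddevice as sd
import webrtcvad
from openwakeword.model import Model

RATE = 16000     # both libraries expect 16 kHz, 16-bit mono audio
VAD_FRAME = 480  # 30 ms at 16 kHz, one of the frame sizes webrtcvad accepts

oww = Model(wakeword_models=["hey_alice.tflite"])  # placeholder model path
vad = webrtcvad.Vad(2)  # aggressiveness 0-3; 2 is a reasonable middle ground

def wait_for_wakeword(threshold: float = 0.5) -> None:
    """Block until any loaded wakeword model scores above the threshold."""
    with sd.InputStream(samplerate=RATE, channels=1, dtype="int16") as stream:
        while True:
            frame, _ = stream.read(1280)  # 80 ms chunks suit openwakeword
            scores = oww.predict(np.squeeze(frame))
            if max(scores.values()) >= threshold:
                return

def record_until_silence(max_silent_frames: int = 20) -> bytes:
    """Record until roughly 600 ms of continuous non-speech is observed."""
    chunks, silent = [], 0
    with sd.RawInputStream(samplerate=RATE, channels=1, dtype="int16") as stream:
        while silent < max_silent_frames:
            frame, _ = stream.read(VAD_FRAME)
            frame = bytes(frame)
            chunks.append(frame)
            silent = 0 if vad.is_speech(frame, RATE) else silent + 1
    return b"".join(chunks)
```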
Tech Stack
- Python
- Ollama
- faster-whisper
- Piper TTS
- openwakeword
- webrtcvad
- sounddevice
- Configurable via config.ini + CLI arguments
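A minimal sketch of the config.ini + CLI layering using only the standard library; the section and option names are hypothetical:

```python
import argparse
import configparser

# config.ini provides defaults; command-line flags override them.
cfg = configparser.ConfigParser()
cfg.read("config.ini")

parser = argparse.ArgumentParser(description="A.L.I.C.E. voice assistant")
parser.add_argument("--model",
                    default=cfg.get("ollama", "model", fallback="llama3"),
                    help="Ollama model to query")
parser.add_argument("--whisper-model",
                    default=cfg.get("stt", "model", fallback="base.en"),
                    help="faster-whisper model size")
args = parser.parse_args()
```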
Result & Impact
- Privacy: 100% offline & local (today)
- Wakeword: pre-packaged 'hey alice' model
- Future-proofing: modular design ready for secure remote access
Created a practical, fully private voice assistant that lets users say 'Hey Alice', speak naturally, and receive spoken responses from their local Ollama models. The architecture is intentionally designed as the first stage of a larger Autonomous Linked Intelligent Compute Engine that can later run on a private server with secure remote access.
Learnings
- Model size selection is critical: smaller models (tiny/base for whisper, 3B–8B for Ollama) provide the best balance between quality and real-time responsiveness on CPU.
- Careful tuning of VAD aggressiveness and audio device settings dramatically improves real-world reliability across different microphones and environments.
- Combining streaming responses from Ollama with fast Piper TTS creates a much more natural conversational flow (see the sketch after this list).
- Keeping the architecture modular makes future transition from local-only to secure remote server deployment much smoother.
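To illustrate the streaming point above, here is one way to chunk Ollama's streamed tokens into sentences and hand each one to TTS as soon as it completes; the model name and the speak() helper are assumptions, not the project's confirmed code:

```python
import ollama

def stream_and_speak(prompt: str, model: str = "llama3") -> None:
    """Speak each sentence as soon as Ollama finishes streaming it."""
    buffer = ""
    for chunk in ollama.chat(model=model,
                             messages=[{"role": "user", "content": prompt}],
                             stream=True):
        buffer += chunk["message"]["content"]
        # Flush on sentence boundaries so Piper can start speaking before
        # the full reply has finished generating.
        while any(p in buffer for p in ".!?"):
            cut = min(buffer.find(p) for p in ".!?" if p in buffer)
            sentence, buffer = buffer[:cut + 1], buffer[cut + 1:]
            speak(sentence)  # hypothetical Piper wrapper (see earlier sketch)
    if buffer.strip():
        speak(buffer)
```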
Additional Context
This project is the initial implementation of A.L.I.C.E. — the Autonomous Linked Intelligent Compute Engine. While the current version runs 100% locally for maximum privacy, the architecture was intentionally designed with future expansion in mind: eventually moving the core engine to a private secure server that can be accessed safely from anywhere.
The current voice interface serves as the primary human interaction layer. It follows a clean, efficient flow:
Microphone → openwakeword (wakeword detection) → webrtcvad (silence detection) → faster-whisper (transcription) → Ollama LLM (reasoning) → Piper TTS (speech output).
Because the system starts with a fully local, modular design built on standard tools and a clean separation of concerns, transitioning A.L.I.C.E. to a remotely accessible private server later will require minimal refactoring. The assistant already supports any Ollama model, custom prompts, and easy configuration — making it a solid foundation for a more powerful, linked intelligent system.
The result is a usable, private voice companion today, while laying the groundwork for a true Autonomous Linked Intelligent Compute Engine that maintains strong security and flexibility for the future.