Project A.L.I.C.E.
A.L.I.C.E. (Autonomous Linked Intelligent Compute Engine) is a simple, hands-free, 100% locally running voice assistant that enables natural spoken interaction with Ollama LLMs, designed with future secure remote access in mind.
Overview
A.L.I.C.E. is a lightweight voice interface for Ollama that enables completely offline, hands-free conversations using local speech-to-text, local LLM inference, and local text-to-speech, while serving as the foundation for a securely accessible compute engine.
Problem
Power users and privacy-conscious individuals want to interact naturally with their local Ollama models through voice, without typing or relying on cloud-based voice services. They also need a solution that can eventually evolve from a local-only assistant into a securely accessible intelligent compute engine running on a private server.
Constraints
- Must run 100% locally today to guarantee absolute data privacy
- Hands-free operation using reliable wakeword detection
- Low enough resource usage to run comfortably on everyday hardware
- Responsive latency for natural-feeling conversations
- Designed with future secure remote access in mind (modular architecture)
Approach
Built a clean Python application that chains openwakeword for wakeword detection, webrtcvad for intelligent silence-based recording, faster-whisper for accurate offline transcription, Ollama for generative responses, and Piper TTS for fast, high-quality speech output.
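The whole pipeline reduces to a short loop. The sketch below is illustrative only: the helper names are placeholders, not the project's actual functions.

```python
# High-level shape of the assistant loop. Each helper wraps one stage of the
# pipeline; the names here are placeholders for illustration only.
def main_loop():
    while True:
        wait_for_wakeword()             # openwakeword: block until "hey alice"
        audio = record_until_silence()  # webrtcvad: stop when the speaker pauses
        text = transcribe(audio)        # faster-whisper: offline speech-to-text
        reply = ask_llm(text)           # Ollama: local LLM inference
        speak(reply)                    # Piper TTS: synthesize and play the answer
```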
Key Decisions
faster-whisper for Speech-to-Text
Delivers strong accuracy and performance on CPU while remaining fully offline, with support for small, efficient models such as tiny.en and base.en.
- Original OpenAI Whisper (Slower without acceleration)
- Cloud-based STT (Violates current local-only requirement)
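As a rough illustration of the pattern, a minimal faster-whisper transcription looks like this; the WAV file name and beam size are placeholder assumptions:

```python
from faster_whisper import WhisperModel

# int8 quantization keeps CPU usage and memory low on everyday hardware.
model = WhisperModel("base.en", device="cpu", compute_type="int8")

# transcribe() returns a generator of segments plus metadata about the audio.
segments, info = model.transcribe("utterance.wav", beam_size=5)
text = " ".join(segment.text.strip() for segment in segments)
print(text)
```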
Piper TTS for Speech Synthesis
Extremely fast and lightweight with natural-sounding voices, providing low-latency responses even on modest hardware.
- eSpeak or Festival (Lower voice quality)
- Cloud TTS services (Privacy and latency concerns)
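A minimal sketch of driving Piper from Python via its documented CLI; the voice model path and the aplay playback step are assumptions, not the project's confirmed setup:

```python
import subprocess

def speak(text: str, voice: str = "en_US-lessac-medium.onnx") -> None:
    """Synthesize text with the Piper CLI, then play the resulting WAV.

    The voice model path and the use of aplay are illustrative assumptions.
    """
    subprocess.run(
        ["piper", "--model", voice, "--output_file", "reply.wav"],
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(["aplay", "reply.wav"], check=True)  # any WAV player works
```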
openwakeword + webrtcvad for Listening Logic
Combines reliable always-on wakeword detection with smart voice activity detection to record only when the user speaks, reducing false triggers and unnecessary processing.
- Continuous always-listening transcription (High CPU and battery impact)
- Push-to-talk (Breaks hands-free experience)
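A sketch of how the two listening stages can be wired together on top of sounddevice; the wakeword model path, frame sizes, and thresholds below are illustrative assumptions:

```python
import numpy as np
import sounddevice as sd
import webrtcvad
from openwakeword.model import Model

RATE = 16000     # both libraries expect 16 kHz, 16-bit mono audio
VAD_FRAME = 480  # 30 ms at 16 kHz, one of the frame sizes webrtcvad accepts

oww = Model(wakeword_models=["hey_alice.tflite"])  # placeholder model path
vad = webrtcvad.Vad(2)  # aggressiveness 0-3; 2 is a reasonable middle ground

def wait_for_wakeword(threshold: float = 0.5) -> None:
    """Block until any loaded wakeword model scores above the threshold."""
    with sd.InputStream(samplerate=RATE, channels=1, dtype="int16") as stream:
        while True:
            frame, _ = stream.read(1280)  # 80 ms chunks suit openwakeword
            scores = oww.predict(np.squeeze(frame))
            if max(scores.values()) >= threshold:
                return

def record_until_silence(max_silent_frames: int = 20) -> bytes:
    """Record until roughly 600 ms of continuous non-speech is observed."""
    chunks, silent = [], 0
    with sd.RawInputStream(samplerate=RATE, channels=1, dtype="int16") as stream:
        while silent < max_silent_frames:
            frame, _ = stream.read(VAD_FRAME)
            frame = bytes(frame)
            chunks.append(frame)
            silent = 0 if vad.is_speech(frame, RATE) else silent + 1
    return b"".join(chunks)
```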
Tech Stack
- Python
- Ollama
- faster-whisper
- Piper TTS
- openwakeword
- webrtcvad
- sounddevice
- Configurable via config.ini + CLI arguments
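A minimal sketch of the config.ini + CLI layering using only the standard library; the section and option names are hypothetical:

```python
import argparse
import configparser

# config.ini provides defaults; command-line flags override them.
cfg = configparser.ConfigParser()
cfg.read("config.ini")

parser = argparse.ArgumentParser(description="A.L.I.C.E. voice assistant")
parser.add_argument("--model",
                    default=cfg.get("ollama", "model", fallback="llama3"),
                    help="Ollama model to query")
parser.add_argument("--whisper-model",
                    default=cfg.get("stt", "model", fallback="base.en"),
                    help="faster-whisper model size")
args = parser.parse_args()
```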
Result & Impact
- Privacy: 100% offline & local (today)
- Wakeword: pre-packaged 'hey alice' model
- Future-proofing: modular design ready for secure remote access
Created a practical, fully private voice assistant that lets users say 'Hey Alice', speak naturally, and receive spoken responses from their local Ollama models. The architecture is intentionally designed as the first stage of a larger Autonomous Linked Intelligent Compute Engine that can later run on a private server with secure remote access.
Learnings
- Model size selection is critical: smaller models (tiny/base for whisper, 3B–8B for Ollama) provide the best balance between quality and real-time responsiveness on CPU.
- Careful tuning of VAD aggressiveness and audio device settings dramatically improves real-world reliability across different microphones and environments.
- Combining streaming responses from Ollama with fast Piper TTS creates a much more natural conversational flow (see the sketch after this list).
- Keeping the architecture modular makes future transition from local-only to secure remote server deployment much smoother.
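To illustrate the streaming point above, here is one way to chunk Ollama's streamed tokens into sentences and hand each one to TTS as soon as it completes; the model name and the speak() helper are assumptions, not the project's confirmed code:

```python
import ollama

def stream_and_speak(prompt: str, model: str = "llama3") -> None:
    """Speak each sentence as soon as Ollama finishes streaming it."""
    buffer = ""
    for chunk in ollama.chat(model=model,
                             messages=[{"role": "user", "content": prompt}],
                             stream=True):
        buffer += chunk["message"]["content"]
        # Flush on sentence boundaries so Piper can start speaking before
        # the full reply has finished generating.
        while any(p in buffer for p in ".!?"):
            cut = min(buffer.find(p) for p in ".!?" if p in buffer)
            sentence, buffer = buffer[:cut + 1], buffer[cut + 1:]
            speak(sentence)  # hypothetical Piper wrapper (see earlier sketch)
    if buffer.strip():
        speak(buffer)
```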
Additional Context
This project is the initial implementation of A.L.I.C.E. — the Autonomous Linked Intelligent Compute Engine. While the current version runs 100% locally for maximum privacy, the architecture was intentionally designed with future expansion in mind: eventually moving the core engine to a private secure server that can be accessed safely from anywhere.
The current voice interface serves as the primary human interaction layer. It follows a clean, efficient flow:
Microphone → openwakeword (wakeword detection) → webrtcvad (silence detection) → faster-whisper (transcription) → Ollama LLM (reasoning) → Piper TTS (speech output).
Because the system starts with a fully local, modular design built on standard tools and a clean separation of concerns, transitioning A.L.I.C.E. to a remotely accessible private server later will require minimal refactoring. The assistant already supports any Ollama model, custom prompts, and easy configuration — making it a solid foundation for a more powerful, linked intelligent system.
The result is a usable, private voice companion today, while laying the groundwork for a true Autonomous Linked Intelligent Compute Engine that maintains strong security and flexibility for the future.