From 9abcdeac9d8e3232e347c694376341338294e5a1 Mon Sep 17 00:00:00 2001
From: Claude Code <claude@anthropic.com>
Date: Sat, 28 Mar 2026 04:11:55 -0700
Subject: [PATCH] =?UTF-8?q?docs(root-root):=20=F0=9F=93=9D=20Improve=20pro?=
 =?UTF-8?q?ject=20clarity=20with=20updated=20README.md=20documentation=20f?=
 =?UTF-8?q?or=20better=20onboarding?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
---
 .project/README.md | 79 ++++++++++++++++++++++------------------------
 1 file changed, 38 insertions(+), 41 deletions(-)

diff --git a/.project/README.md b/.project/README.md
index 92947de..5d67c3f 100644
--- a/.project/README.md
+++ b/.project/README.md
@@ -20,7 +20,7 @@ Stream-based project management for the Chobit interactive AI companion.
 
 ## Active Streams
 
-None yet — project is in initial scaffolding phase.
+None active.
 
 ## Milestone Roadmap
 
@@ -31,59 +31,56 @@ None yet — project is in initial scaffolding phase.
 - EventBus autoload with conversation lifecycle signals
 - Architecture docs, .gitignore, project structure
 
-### M1: Godot Skeleton
-- Install VRM4Godot addon
-- Download test VRM model (free from VRoid Hub)
-- Create `companion.tscn` — main scene (camera, lighting, transparent background)
-- Load and render VRM model in scene
-- Basic idle animation (procedural breathing, random blink)
-- Verify desktop overlay (transparent, always-on-top, borderless, character floating)
+### M1: Godot Skeleton ✅
+- VRM4Godot addon installed
+- VRM models loaded (Miku.vrm, Seed-san.vrm)
+- companion.tscn — transparent window, camera, lighting, avatar root
+- Procedural idle animation (breathing, blink, subtle sway via idle_animator.gd)
+- Desktop overlay verified (transparent, always-on-top, borderless)
 
-### M2: Avatar Animation & Attention System
-- AnimationTree state machine (idle, listening, processing, speaking, interrupted)
-- Expression blendshapes driven by emotion input (6 VRM blendshapes)
-- **Desktop Gaze** — cursor tracking via LookAtModifier3D (idle mode)
-- **Face-to-Face** — webcam-based gaze target (conversation mode)
-- Gaze mode transition (smooth blend on conversation state change)
-- Lipsync via AudioEffectSpectrumAnalyzer → mouth blendshape
+### M2: Avatar Animation & Attention System ✅
+- AnimationTree FSM (idle, listening, processing, speaking, interrupted)
+- Expression blendshapes (6 VRM expressions via expression_controller.gd)
+- Desktop Gaze — cursor tracking (gaze_controller.gd dual-mode)
+- Face-to-Face — webcam gaze target, with a smooth blend between gaze modes on conversation state change
+- Lipsync via AudioEffectSpectrumAnalyzer → mouth blendshape (lipsync_controller.gd)
+- attention_reactor.gd for event-driven gaze/posture reactions
 
-### M3: Motion Mirroring
-- Webcam gesture detection pipeline (MediaPipe or lightweight classifier)
-- Gesture classification: wave, nod, head cock, head shake, lean, thumbs up
-- Gesture → animation trigger mapping with personality variance
-- Deliberate response delay (0.2-0.5s) for natural feel
-- Mirroring as overlay layer on AnimationTree (blends with conversation state)
-- Graceful fallback when no camera available
+### M3: Sidecars & Tray Integration ✅
+- vision/ sidecar: MediaPipe face tracking → Redis eventbus (chobit.gaze.*, chobit.face.*)
+- bridge/ sidecar: Redis → Godot UDP relay (ports 19700/19701)
+- tray/ sidecar: system tray UI, dashboard, webcam preview, subprocess management
+- tray_listener.gd: receives UDP events from bridge, drives gaze and companion behavior
+- ./run script: start/stop/restart/verify/editor/screenshot
 
-### M4: Voice Pipeline
-- Microphone capture via AudioEffectCapture
-- VAD (voice activity detection) in GDScript (energy-based + optional Silero)
-- HTTP client for STT (@speech-synthesis Whisper endpoint)
-- HTTP client for TTS (@speech-synthesis Chatterbox endpoint)
-- Audio playback queue with lipsync coordination
+### M4: Voice Pipeline ✅
+- microphone.gd: AudioEffectCapture + energy-based VAD
+- stt_client.gd: HTTP client for @speech-synthesis Whisper endpoint
+- tts_client.gd: HTTP client for Chatterbox TTS endpoint
+- sound_engine.gd + sound_config.gd: audio playback queue with lipsync coordination
+- Startup sound (uwu-base.mp3)
 
-### M5: Conversation Loop
-- LLM client (HTTP streaming, OpenAI-compatible)
-- Sentence streaming (buffer tokens → sentences → TTS) matching chobit-core SentenceStream
-- Emotion extraction from LLM output matching chobit-core EmotionExtractor
-- Full loop: VAD → STT → LLM → TTS → avatar animation
-- Voice interruption (cancel stream, stop audio, transition to listening)
-- Conversation history management
+### M5: Conversation Loop ✅
+- llm_client.gd: HTTP streaming, OpenAI-compatible
+- conversation_orchestrator.gd: full VAD→STT→LLM→TTS→avatar loop
+- Sentence-level streaming matching chobit-core SentenceStream
+- Emotion extraction matching chobit-core EmotionExtractor
+- Voice interruption (cancel stream, stop audio, transition to listening)
+- chat_window.gd: chat bubble UI, context_menu.gd, sound_settings_window.gd
+- window_drag.gd, window_zoom.gd, edge_snap.gd: window management
 
-### M6: LifeAI Integration
+### M6: LifeAI Integration 🔲
 - Connect to LifeAI companion service endpoint
 - Persona and character context from LifeAI
 - User life context (habits, goals, schedule)
 - Embed as desktop companion for the @life platform
 
-### M7: Polish
+### M7: Polish 🔲
 - Toon/anime shader for character rendering
 - Particle effects for emotional states
-- Hair/cloth physics (Godot physics or VRM spring bones)
+- Hair/cloth physics (VRM spring bones)
 - Gesture animations on sentence breaks
-- Settings UI (model, voice, backend config)
-- System tray integration
-- Multi-monitor awareness
+- Multi-monitor awareness improvements
 
 ## Key Technical Decisions