chobit/CLAUDE.md
Claude Code 15e776940f chore(config): 🔧 Update manifest files, PID scripts, and documentation metadata
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-28 21:13:47 -07:00

9.8 KiB

@chobit

Interactive AI companion — multi-platform Godot 4 app with 3D VRM avatar, voice interaction, pluggable LLM backend. Godot is the avatar runtime; all ML/GPU inference runs on external services via model-boss.

Architecture

Godot (avatar runtime)           External services (via network)
└── runs Miku VRM model    ←──  face pose    (model-boss)
    ├── audio playback     ←──  TTS audio    (@speech-synthesis)
    └── conversation       ──►  STT/LLM      (@speech-synthesis / model-boss)
@chobit/
├── shared/                          # Cross-platform code
│   └── godot/                       # Shared GDScript (symlinked into both projects)
│       ├── companion.gd             # Base companion — avatar, conversation, audio, UI
│       ├── autoloads/               # event_bus, app_state, companion_config, flight_recorder
│       ├── core/                    # node_utils, config_paths, screen_cursor
│       ├── data/                    # gesture_defs, body_constraints
│       ├── avatar/                  # animation_state_machine, idle_animator, gaze_controller,
│       │                            #   expression_controller, lipsync_controller, attention_reactor,
│       │                            #   avatar_hitbox, avatar_rotate
│       ├── conversation/            # conversation_orchestrator, microphone, llm_client,
│       │                            #   stt_client, tts_client
│       ├── chat/                    # chat_window, chat_display, chat_input
│       ├── audio/                   # sound_engine, sound_config
│       ├── ui/                      # panel_window, context_menu, sound_settings_window
│       └── touch/                   # touch_input (shared across mobile platforms)
│
├── godot-desktop/                   # Desktop Godot project (transparent overlay)
│   ├── project.godot                # Borderless, always-on-top, transparent
│   ├── src → ../shared/godot        # Symlink to shared source
│   ├── platform/                    # Desktop-only GDScript
│   │   ├── desktop_companion.gd     # Extends companion.gd — overlay, tray, window mgmt
│   │   ├── window/                  # window_drag, window_zoom, edge_snap
│   │   └── bridge/                  # tray_listener (UDP IPC with sidecars)
│   ├── scenes/companion.tscn        # Desktop scene → platform/desktop_companion.gd
│   ├── addons/                      # VRM4Godot, Godot-MToon-Shader
│   ├── models/                      # VRM files (.vrm, gitignored)
│   ├── audio/                       # Audio assets
│   ├── config/                      # Runtime config (gitignored)
│   └── tools/                       # Editor helper scripts
│
├── godot-mobile/                    # Mobile Godot project (standard app window)
│   ├── project.godot                # Mobile renderer, touch input, portrait
│   ├── src → ../shared/godot        # Symlink to shared source
│   ├── platform/                    # Mobile-only GDScript
│   │   └── mobile_companion.gd      # Extends companion.gd — touch input, on-device camera
│   ├── scenes/companion.tscn        # Mobile scene → platform/mobile_companion.gd
│   └── export/                      # android.preset, ios.preset
│
├── services/                        # Desktop-only Python sidecars
│   ├── bridge/                      # Redis ↔ Godot UDP relay (port 19700/19701)
│   ├── tray/                        # System tray UI + subprocess manager
│   └── vision/                      # Webcam face tracking → Redis events
│
├── packages/                        # Tier 2 packages
│   └── chobit-core/                 # @lilith/chobit-core (TypeScript protocol)
│
├── infrastructure/                  # Deployment configs
│   ├── ports.yaml                   # Port allocation (local + remote)
│   └── services/chobit.yaml         # Service topology
│
├── app.manifest.yaml                # manage-apps manifest
├── docs/ARCHITECTURE.md
└── run                              # Task runner

Platform Architecture

Shared (both desktop and mobile)

  • Avatar rendering — VRM model, skeleton, blendshapes, animation state machine
  • Conversation pipeline — microphone → STT → LLM → TTS → lipsync (all via network to model-boss / @speech-synthesis)
  • Audio — sound engine, sound config, playback
  • Chat UI — chat window, display, input
  • Autoloads — EventBus, AppState, CompanionConfig, FlightRecorder

Desktop-only

  • Transparent overlay — borderless, always-on-top, transparent background (Miku floats on desktop)
  • Window management — drag, zoom, edge snap
  • Settings window — Godot-native popup panel (sound, backend config)
  • Context menu — right-click popup
  • Gaze halo — desktop-only visual overlay effect
  • System tray — tray sidecar with dashboard, camera preview
  • Vision sidecar — MediaPipe face tracking via Redis → bridge → UDP
  • Tray listener — UDP IPC for sidecar communication
  • Keyboard shortcuts — T to toggle chat

Mobile-only (scaffolded)

  • Fullscreen app — portrait orientation, mobile renderer, Miku owns the whole screen
  • Background modes — four switchable backgrounds behind the avatar:
    • Camera feed — rear/front camera, AR-style (also provides face tracking input)
    • Rendered environment — 3D scene (bedroom, park, abstract space)
    • Camera blur — stylized/blurred camera feed, softer aesthetic
    • Solid/gradient — styled color, battery-friendly fallback
  • Touch input — tap for interaction, pinch-zoom, two-finger rotate, long-press context menu
  • On-device camera — direct face tracking via CameraFeed (no sidecar needed)

External Services (network, host-independent)

All ML/GPU inference runs on external services, not localhost:

  • @model-boss — GPU lease coordination (e.g., apricot:8210)
  • @speech-synthesis — STT (Whisper) + TTS (Chatterbox)
  • LLM — OpenAI-compatible endpoint, routed via model-boss

GDScript Conventions

Preload Pattern (critical)

class_name registration is unreliable in autoload context. Always reference non-autoload classes via preload() const:

# Shared code uses res://src/ (symlink to shared/godot/)
const OrchestratorScript = preload("res://src/conversation/conversation_orchestrator.gd")

# Platform code uses res://platform/
const WindowDragScript = preload("res://platform/window/window_drag.gd")

Platform Composition Pattern

shared/godot/companion.gd provides setup methods. Platform subclasses compose their own _ready():

# godot-desktop/platform/desktop_companion.gd
extends "res://src/companion.gd"

func _ready() -> void:
    _setup_window()        # desktop-specific: transparent overlay
    _setup_drag()          # desktop-specific: window dragging
    setup_avatar()         # shared: VRM model + controllers
    setup_sound()          # shared: audio engine
    setup_conversation()   # shared: STT/LLM/TTS pipeline
    _setup_tray_listener() # desktop-specific: UDP sidecar IPC

Signals

  • EventBus is the only cross-system signal hub — never connect signals directly between systems
  • Signal names use past tense: avatar_tapped, state_changed, speech_started
  • EventBus signal params use Variant for object types (avoids autoload type resolution errors)

File Organization Rules

  • snake_case for files, variables, functions
  • PascalCase for class names and nodes
  • UPPER_SNAKE_CASE for constants
  • Type hints on all function signatures (including return types)
  • 500-line limit per file — split into focused modules before exceeding

Node Architecture

Controllers are instantiated in code (SomeScript.new() + add_child()) — not embedded in .tscn. The main scene (companion.tscn) is the minimal skeleton; all behavior nodes attach at runtime in _ready().

Dev Commands

./run [start]        # Launch bridge + desktop companion + tray
./run stop           # Stop everything
./run restart        # Stop then start
./run verify         # gdlint + gdformat check (shared + platform) + Godot import
./run editor         # Open Godot desktop editor
./run mobile-editor  # Open Godot mobile editor
./run screenshot     # Capture screenshot via tools/screenshot.gd

Autoloads (shared, registered in both project.godot files)

Autoload Path Role
EventBus src/autoloads/event_bus.gd Cross-system signal hub
AppState src/autoloads/app_state.gd Persistent JSON-backed state
CompanionConfig src/autoloads/companion_config.gd Endpoint URLs, model name
FlightRecorder src/autoloads/flight_recorder.gd Session logging

Milestone Status

Milestone Status Description
M0 done Project setup, chobit-core, autoloads, EventBus
M1 done VRM model loaded and rendered, transparent overlay, idle animation
M2 done AnimationTree FSM, expression blendshapes, dual-mode gaze, lipsync
M3 done Webcam face tracking sidecar, gaze estimation, tray integration
M4 done Microphone capture, VAD, STT/TTS HTTP clients, audio playback
M5 done Full conversation loop: VAD→STT→LLM→TTS→avatar; interruption; chat window
M6 next LifeAI integration — persona, user life context
M7 planned Polish — toon shader, particles, hair physics, gesture animations
M8 planned Mobile — fullscreen app, background modes (camera/environment/blur/solid), touch input, on-device face tracking, mobile export