@ai — Architecture
System Diagram
┌──────────────────────────────────────────────────────────────────┐
│  @ai / ai-core                                                   │
│  NestJS :3790                                                    │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────┐  │
│  │ identity/    │  │ memory/      │  │ personality/           │  │
│  │              │  │              │  │                        │  │
│  │ PersonaEntity│  │ MemoryEntry  │  │ JSON template loader   │  │
│  │ UserIdentity │  │ Redis cache  │  │ Prompt composer        │  │
│  │              │  │ PG fallback  │  │ Context-aware assembly │  │
│  └──────────────┘  └──────────────┘  └────────────────────────┘  │
│                                                                  │
│  ┌──────────────┐  ┌──────────────────────────────────────────┐  │
│  │ tasks/       │  │ context/                                 │  │
│  │              │  │                                          │  │
│  │ TaskList     │  │ POST /context/compose                    │  │
│  │ Task         │  │ ┌─────────────────────────────────┐      │  │
│  │ Redis events │  │ │ identity → personality → memory │      │  │
│  │              │  │ │ → tasks → composed system prompt│      │  │
│  └──────────────┘  │ └─────────────────────────────────┘      │  │
│                    └──────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
          │                     │                     │
          ▼                     ▼                     ▼
    ┌────────────┐      ┌──────────────┐      ┌──────────────┐
    │ @chobit    │      │ @life        │      │ @kthulu      │
    │            │      │              │      │              │
    │ Godot 4    │      │ NestJS       │      │ CLI + NestJS │
    │ VRM avatar │      │ Life platform│      │ Coding agent │
    └────────────┘      └──────────────┘      └──────────────┘
Memory Architecture
Two-tier storage inherited from @life and @ml/knowledge-platform:
Request             Redis (short-term)            PostgreSQL (long-term)
│                   TTL: 1 hour                   Permanent
│
├─── GET /memory ──► cache hit? ──────────────────────────────────►
│                        │ miss                                   │
│                        └──────────────────────────────────────► │
│                                                                 │
├─── POST /memory ──────────────────────────────────────────────► write both
│
└─── DELETE /memory ──► invalidate Redis key → soft-delete in PG
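The three paths in the diagram can be sketched as a small service over two storage interfaces. This is a minimal sketch, not the ai-core implementation: `ShortTermCache`, `LongTermStore`, and `createMemoryService` are hypothetical names, and backfilling Redis after a cache miss is an assumption the diagram does not confirm.

```typescript
// Hypothetical storage interfaces; real ones would wrap Redis and PostgreSQL clients.
interface ShortTermCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

interface LongTermStore {
  find(key: string): Promise<string | null>; // expected to skip soft-deleted rows
  upsert(key: string, content: string): Promise<void>;
  softDelete(key: string): Promise<void>;
}

const CACHE_TTL_SECONDS = 60 * 60; // "TTL: 1 hour" from the diagram

function createMemoryService(cache: ShortTermCache, store: LongTermStore) {
  return {
    // GET /memory: try Redis, fall back to PostgreSQL on a miss.
    async get(key: string): Promise<string | null> {
      const cached = await cache.get(key);
      if (cached !== null) return cached;
      const stored = await store.find(key);
      // Assumption: backfill the cache so the next read is a hit.
      if (stored !== null) await cache.set(key, stored, CACHE_TTL_SECONDS);
      return stored;
    },
    // POST /memory: write both tiers, durable store first.
    async put(key: string, content: string): Promise<void> {
      await store.upsert(key, content);
      await cache.set(key, content, CACHE_TTL_SECONDS);
    },
    // DELETE /memory: invalidate the Redis key, then soft-delete in PostgreSQL.
    async remove(key: string): Promise<void> {
      await cache.del(key);
      await store.softDelete(key);
    },
  };
}
```

Writing to PostgreSQL before Redis means a failed write never leaves a cached value with no durable copy behind it; the opposite order would also satisfy the diagram.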
MemoryEntry schema (from @life/platform-ai):
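import {
  Column, CreateDateColumn, Entity, PrimaryGeneratedColumn, UpdateDateColumn,
} from 'typeorm';

@Entity('ai_memory_entries')
export class MemoryEntryEntity {
  @PrimaryGeneratedColumn('uuid') id: string;
  @Column({ unique: true }) key: string;
  @Column('text') content: string;
  // Explicit column type: a `string | null` union gives TypeORM no reflected type to infer from.
  @Column({ type: 'varchar', nullable: true }) category: string | null;
  @Column('simple-array') tags: string[]; // stored as one comma-separated string
  @Column('jsonb', { default: {} }) metadata: Record<string, unknown>;
  @Column({ default: false }) deleted: boolean; // soft-delete flag
  @CreateDateColumn() created_at: Date;
  @UpdateDateColumn() updated_at: Date;
}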
Personality System
Inherits the composable template format from @chobit/godot-desktop/config/personalities/miku.json.
Composition order (per-request, not static):
identity.base
+ identity.voice_constraint
+ active_traits[].positive
+ active_negatives[]
+ emotion_tags.instruction
+ depth_tier[inferred_tier].instruction
+ context_modifiers[time_of_day]
+ context_modifiers[conversation_depth]
+ context_modifiers[user_mood_signals[detected_mood]]
+ situation_overrides[detected_situations[]]
+ active_traits[].negative
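A minimal sketch of this per-request ordering, assuming every template section has already been resolved to plain strings and that empty sections are simply skipped. The `ResolvedTemplate` shape and `composePrompt` are hypothetical and do not reflect the actual miku.json schema.

```typescript
// Hypothetical, already-resolved view of a personality template for one request.
interface ResolvedTemplate {
  identityBase: string;         // identity.base
  voiceConstraint: string;      // identity.voice_constraint
  traitPositives: string[];     // active_traits[].positive
  activeNegatives: string[];    // active_negatives[]
  emotionInstruction: string;   // emotion_tags.instruction
  depthInstruction: string;     // depth_tier[inferred_tier].instruction
  contextModifiers: string[];   // time_of_day, conversation_depth, mood (in that order)
  situationOverrides: string[]; // situation_overrides[detected_situations[]]
  traitNegatives: string[];     // active_traits[].negative
}

// Concatenate fragments in the documented order, dropping blank ones.
function composePrompt(t: ResolvedTemplate): string {
  const ordered: string[] = [
    t.identityBase,
    t.voiceConstraint,
    ...t.traitPositives,
    ...t.activeNegatives,
    t.emotionInstruction,
    t.depthInstruction,
    ...t.contextModifiers,
    ...t.situationOverrides,
    ...t.traitNegatives,
  ];
  return ordered
    .map((fragment) => fragment.trim())
    .filter((fragment) => fragment.length > 0)
    .join("\n");
}
```

Because the order is fixed in one place, trait negatives always land last, after any situation overrides.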
Input context payload:
interface PersonalityContext {
time_of_day: 'morning' | 'afternoon' | 'evening' | 'late_night';
conversation_depth: 'shallow' | 'mid' | 'deep';
user_mood: 'frustrated' | 'casual' | 'task_focused' | 'vulnerable';
situations: string[]; // detected from recent message content
tts_active: boolean;
message_count: number; // for depth_tier inference
last_user_message: string;
}
Context Provider Pattern
@ai does not own task or appointment data. It defines the protocol.
Domain services implement ContextProvider interfaces. @ai aggregates their output into the context assembly pipeline. The data always lives in the service that owns it.
// @ai defines the interfaces
interface TaskContextProvider {
getActiveTasks(identity_id: string, options: TaskQueryOptions): Promise<TaskSummary[]>;
}
interface AppointmentContextProvider {
getUpcomingAppointments(identity_id: string, window_hours: number): Promise<AppointmentSummary[]>;
}
interface SessionStateProvider {
getSessionState(identity_id: string): Promise<SessionState>;
}
interface SessionState {
current_status: string; // "coffee brewing", "shoot in progress", etc.
completed_today: string[]; // items confirmed done this session
last_updated: Date;
}
// Domain services implement them:
// @life → AppointmentContextProvider (calendar, scheduling, events)
// .quinn/todos.md → FileTaskContextProvider (file-backed ordered task list)
// .quinn/context.md → FileSessionStateProvider (file-backed live session state)
// @kthulu → CodeTaskContextProvider (coding session tasks)
Context providers register with ai-core on startup. The context/ module queries all registered providers and assembles their output into the system prompt.
ai.context.providers.register → { provider_id, type, endpoint }
ai.context.providers.query → fanout to all registered providers
ai.context.assembled → composed result emitted on Redis
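Registration and fanout could look roughly like the sketch below. It is illustrative only: `createProviderRegistry` and the `ProviderCall` transport are hypothetical, the registration payload mirrors `{ provider_id, type, endpoint }`, and both replacing duplicate ids and tolerating failed providers are assumptions.

```typescript
// Registration payload from the event above: { provider_id, type, endpoint }.
interface ProviderRegistration {
  provider_id: string;
  type: string;
  endpoint: string;
}

interface ProviderResult {
  provider_id: string;
  type: string;
  items: string[];
  error?: string;
}

// Hypothetical transport: resolves an endpoint to context lines for one identity.
type ProviderCall = (endpoint: string, identity_id: string) => Promise<string[]>;

function createProviderRegistry(call: ProviderCall) {
  const providers: ProviderRegistration[] = [];
  return {
    register(p: ProviderRegistration): void {
      // Assumption: re-registering an id replaces the earlier entry.
      const index = providers.findIndex((x) => x.provider_id === p.provider_id);
      if (index >= 0) providers[index] = p;
      else providers.push(p);
    },
    // Fan out to every provider; one failing provider does not fail the whole query.
    query(identity_id: string): Promise<ProviderResult[]> {
      return Promise.all(
        providers.map(async (p): Promise<ProviderResult> => {
          try {
            const items = await call(p.endpoint, identity_id);
            return { provider_id: p.provider_id, type: p.type, items };
          } catch (err) {
            return { provider_id: p.provider_id, type: p.type, items: [], error: String(err) };
          }
        }),
      );
    },
  };
}
```

Catching errors per provider keeps context assembly working when one domain service is down; the failed provider just contributes no items.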
Two-File Nag Pattern
The nag loop consumes two distinct sources:
| Source | Interface | Contains | Example |
|---|---|---|---|
| todos.md | FileTaskContextProvider | Ordered pending tasks | "Headshots → Casual → Glamour → Platforms" |
| context.md | FileSessionStateProvider | Live session state | "Coffee brewing, shoot not started" |
Nag loop behavior:
- Find the earliest incomplete task in the ordered list
- Check session state — has this step actually started or completed?
- If state is ambiguous → ask a check-in question ("How's the coffee coming?") rather than commanding
- Never advance past a step that isn't confirmed done in context
- Consumer updates context.md when the user mentions status in chat; this is the write path
This pattern works file-backed today and upgrades transparently: FileSessionStateProvider becomes a RedisSessionStateProvider when ai-core is live. Consumers (Claude Code nag loop, @chobit) don't change.
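The nag loop rules can be sketched as a pure decision function. This is one possible reading rather than the actual engine: `nextNagAction` and `NagAction` are hypothetical, and "ambiguous" is interpreted here as either a task ticked off in the list but not confirmed in context, or session state older than a configurable threshold.

```typescript
interface OrderedTask {
  title: string;
  done: boolean; // as recorded in the ordered task list
}

interface SessionState {
  current_status: string;
  completed_today: string[];
  last_updated: Date;
}

type NagAction =
  | { kind: "all_done" }
  | { kind: "check_in"; task: string } // ask a question, e.g. "How's the coffee coming?"
  | { kind: "nudge"; task: string };   // prompt the user toward this step

function nextNagAction(
  tasks: OrderedTask[],
  state: SessionState,
  now: Date,
  staleAfterMs: number = 30 * 60 * 1000, // hypothetical staleness threshold
): NagAction {
  for (const task of tasks) {
    // Only a step confirmed in session context counts as finished.
    const confirmed = state.completed_today.indexOf(task.title) !== -1;
    if (confirmed) continue;
    const stale = now.getTime() - state.last_updated.getTime() > staleAfterMs;
    // List and context disagree, or context is old: ask rather than command.
    if (task.done || stale) return { kind: "check_in", task: task.title };
    return { kind: "nudge", task: task.title };
  }
  return { kind: "all_done" };
}
```

The loop returns at the first unconfirmed step, so it never advances past a step that is not confirmed done in context.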
Task System
@ai owns the protocol and the nag loop engine. It does not store tasks for domain services — those live in their respective backends.
What @ai does own directly:
- AI-internal task lists: things @ai itself tracks (e.g. the miku/nag queue, conversation follow-ups)
- Aggregated task summary: a snapshot assembled from all registered TaskContextProviders for LLM context injection
Nag Queue (ai-owned, Redis-backed):
├── identity_id: "quinn"
├── source: ".quinn/todos.md" (FileTaskContextProvider)
├── context_source: ".quinn/context.md" (FileSessionStateProvider)
├── next_item: "Photo shoot headshots look"
└── last_nagged_at: timestamp (prevents repeating same item)
Redis Events:
ai.task.completed → { identity_id, source, task_id } — from any provider
ai.nag.fired → { identity_id, message, personality_id }
ai.nag.snoozed → { identity_id, until }
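How `last_nagged_at` and the `ai.nag.snoozed` event might gate the loop is sketched below. The field `last_nagged_item`, the `repeatAfterMs` policy, and the use of epoch milliseconds for timestamps are all assumptions made for illustration.

```typescript
interface NagQueueEntry {
  identity_id: string;
  next_item: string;
  last_nagged_item: string | null; // hypothetical: which item the last nag was about
  last_nagged_at: number | null;   // epoch milliseconds (assumed representation)
}

interface NagPolicy {
  repeatAfterMs: number; // hypothetical: minimum gap before repeating the same item
}

function shouldFireNag(
  entry: NagQueueEntry,
  nowMs: number,
  snoozedUntilMs: number | null, // from the latest ai.nag.snoozed event, if any
  policy: NagPolicy,
): boolean {
  if (snoozedUntilMs !== null && nowMs < snoozedUntilMs) return false; // snooze still active
  if (entry.last_nagged_at === null) return true;                      // never nagged before
  if (entry.last_nagged_item !== entry.next_item) return true;         // new item, not a repeat
  return nowMs - entry.last_nagged_at >= policy.repeatAfterMs;
}

// Record that a nag fired so the same item is not repeated immediately.
function markNagFired(entry: NagQueueEntry, nowMs: number): NagQueueEntry {
  return { ...entry, last_nagged_item: entry.next_item, last_nagged_at: nowMs };
}
```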
Named context sources per identity (examples):
- quinn/daily → reads from .quinn/todos.md (FileTaskContextProvider)
- quinn/appointments → reads from @life (AppointmentContextProvider)
- miku/nag → ai-core owned, drives speech-synthesis loop
Context Assembly Pipeline
POST /context/compose is the primary integration endpoint:
1. Load identity (PersonaEntity + UserIdentityEntity)
│
2. Compose personality system prompt
POST /personality/:id/compose { context }
│
3. Retrieve relevant memories
Semantic search over MemoryEntryEntity
ranked by: recency + relevance to recent_messages[]
│
4. Get active tasks summary
GET /tasks?identity_id=&status=pending&limit=5
→ "Active: [1] Photo shoot tonight [2] Call name change office Monday"
│
5. Assemble response
{
system_prompt: "<<composed personality>>",
memory_injections: ["<<relevant memory snippets>>"],
task_summary: "<<pending tasks>>"
}
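Steps 4 and 5 can be sketched as two small functions. This is a sketch only: `summarizeTasks` and `assembleContext` are hypothetical names, the "Active: none" fallback is an assumption, and memory retrieval is left to the caller.

```typescript
interface ComposeResponse {
  system_prompt: string;
  memory_injections: string[];
  task_summary: string;
}

// Render pending task titles in the "Active: [1] ... [2] ..." form shown in step 4.
function summarizeTasks(titles: string[], limit: number = 5): string {
  const shown = titles.slice(0, limit);
  if (shown.length === 0) return "Active: none"; // assumption: the doc shows no empty case
  return "Active: " + shown.map((title, i) => `[${i + 1}] ${title}`).join(" ");
}

// Step 5: bundle the composed prompt, memory snippets, and task summary.
function assembleContext(
  personalityPrompt: string,
  memories: string[],
  pendingTasks: string[],
): ComposeResponse {
  return {
    system_prompt: personalityPrompt,
    memory_injections: memories,
    task_summary: summarizeTasks(pendingTasks),
  };
}
```

The default `limit` of 5 matches the `limit=5` query parameter in step 4.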
Integration Contracts
@chobit → @ai
Before (current):
# llm_client.gd sends directly to model-boss
var body = { "model": "qwen3-4b", "messages": _history, ... }
_http.request(llm_url + "/v1/chat/completions", ...)
After:
# llm_client.gd requests enriched context from @ai first
var context_body = {
    "identity_id": CompanionConfig.identity_id,
    "personality_id": CompanionConfig.personality_id,
    "recent_messages": _history.slice(-20),  # last 20 (increased from 10)
    "context": _composer.build_context_payload()
}
var enriched = await _ai_client.compose(context_body)
# then forwards to model-boss with enriched system_prompt
@chobit → @ai (memory sync)
# conversation_store.gd: async memory sync after each turn
func _after_save_message(role: String, content: String) -> void:
    if role == "user":
        _ai_memory_client.upsert({
            "key": "chobit:last_user_message",
            "content": content,
            "category": "conversation",
            "tags": ["recent", "chobit"]
        })
Infrastructure
# docker-compose.yaml
services:
  ai-postgres:
    image: postgres:16
    ports: ["26395:5432"]
  ai-redis:
    image: redis/redis-stack:latest
    ports: ["26394:6379"]  # Redis Stack (RediSearch for optional vector search)
  ai-core:
    build: services/ai-core
    ports: ["3790:3790"]
    depends_on: [ai-postgres, ai-redis]
    environment:
      DATABASE_URL: postgres://...@ai-postgres:5432/ai
      REDIS_URL: redis://ai-redis:6379
Dynamic Personality System
The Problem with Static Templates
miku.json is a good start but it has one fundamental flaw: it's a fixed document. Every conversation starts from the same base. Miku doesn't know you better after 100 conversations than after 1. She can't be warmer because you went through something hard together last week. She doesn't ease up on the teasing because you're clearly exhausted today.
Real personalities are layered, and each layer changes on its own timescale:
| Layer | Timescale | Storage | Example |
|---|---|---|---|
| Core traits | Permanent | Static JSON | "Miku is enthusiastic and playful" |
| Relationship arc | Months | PostgreSQL | "Quinn and Miku are 'close' — 87 conversations" |
| Shared history | Weeks | PostgreSQL (memory) | "Quinn got a $1400 client on Mar 29 — big win" |
| Session mood | Hours | Redis | "Quinn is tired, been up since 4am" |
| Situational state | Minutes | Redis | "Quinn is in task-focused mode right now" |
Relationship Arc
Companions move through relationship stages. Each stage gates different behaviors:
new ──────► familiar ──────► close ──────► intimate
(0–5)       (6–30)           (31–100)      (101+)
                   conversations
| Stage | Personality Expression |
|---|---|
| new | Warm but reserved. Helpful, not personal. Doesn't tease. Doesn't assume. |
| familiar | References shared events. Light teasing when mood is right. Remembers patterns. |
| close | Direct. Calls you out when you're procrastinating. Genuine care, not therapy. |
| intimate | Shorthand. In-jokes. Reads between the lines. Minimal preamble. |
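The stage thresholds map directly onto a lookup by conversation count. A minimal sketch, with `depthForCount` as a hypothetical helper and 100 conversations treated as the last count in "close" (an assumption about the boundary):

```typescript
type RelationshipDepth = "new" | "familiar" | "close" | "intimate";

// Map a conversation count onto the relationship stage that gates behavior.
function depthForCount(interactionCount: number): RelationshipDepth {
  if (interactionCount <= 5) return "new";
  if (interactionCount <= 30) return "familiar";
  if (interactionCount <= 100) return "close";
  return "intimate";
}
```

With the 87 conversations from the earlier layer table, `depthForCount(87)` yields "close".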
RelationshipEntity (new module — relationship/):
import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';

@Entity('ai_relationships')
export class RelationshipEntity {
  // TypeORM requires a primary column; a generated UUID is assumed here, as in MemoryEntryEntity.
  @PrimaryGeneratedColumn('uuid') id: string;
  @Column() identity_id: string;
  @Column() persona_id: string;
  @Column() depth: 'new' | 'familiar' | 'close' | 'intimate';
  @Column() interaction_count: number;
  @Column('simple-array') significant_event_keys: string[]; // → memory entries
  @Column('jsonb') tone_notes: string[]; // learned: "prefers directness", "sensitive about X"
  @Column() first_interaction_at: Date;
  @Column() last_interaction_at: Date;
}
Dynamic Trait Intensity
Traits aren't binary (on/off) — they have intensity that responds to context:
interface TraitModifiers {
base_intensity: number; // 0.0–1.0
modifiers: {
'user_mood.frustrated'?: number; // e.g. -0.3 (dial down enthusiasm)
'user_mood.vulnerable'?: number; // e.g. -0.4 (go gentler)
'relationship.new'?: number; // e.g. +0.1 (extra warmth for new)
'relationship.close'?: number; // e.g. -0.1 (less performance, more real)
'time_of_day.late_night'?: number; // e.g. -0.2 (calmer energy)
'task_focused'?: number; // e.g. -0.3 (less playful, more practical)
}
}
The personality composer resolves trait intensity at request time and injects the appropriate positive/negative language for that intensity level rather than always using the full trait text.
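One way to resolve intensity at request time is to add the modifiers for every active signal to the base value and clamp the result. This is a sketch under assumptions: additive combination, clamping to the 0.0–1.0 range, and the band thresholds in `intensityBand` are illustrative choices, not confirmed behavior.

```typescript
interface TraitModifiers {
  base_intensity: number; // 0.0–1.0
  modifiers: { [signal: string]: number | undefined };
}

// Sum the modifiers for the signals active on this request, then clamp to [0, 1].
function resolveIntensity(trait: TraitModifiers, activeSignals: string[]): number {
  let value = trait.base_intensity;
  for (const signal of activeSignals) {
    const delta = trait.modifiers[signal];
    if (delta !== undefined) value += delta;
  }
  return Math.min(1, Math.max(0, value));
}

type IntensityBand = "low" | "medium" | "high";

// Hypothetical thresholds for choosing which trait wording to inject.
function intensityBand(intensity: number): IntensityBand {
  if (intensity < 0.34) return "low";
  if (intensity < 0.67) return "medium";
  return "high";
}
```

For example, a trait with base 0.8 and modifiers of -0.3 for `user_mood.frustrated` and -0.2 for `time_of_day.late_night` resolves to roughly 0.3 when both signals are active, which falls in the "low" band.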
Shared History Injection
The personality module queries memory for significant shared events when composing:
// relationship context appended to system prompt
const sharedContext = await memory.search({
identity_id,
tags: ['significant_event'],
limit: 3,
ranked_by: 'recency + relevance'
});
// injected as:
// "Context you share with this user:
// - They had a big career win on Mar 29 ($1400 client — their first escort work payout)
// - They've been procrastinating on a photo shoot for 2 weeks
// - They're mid-transition (name change paperwork pending)"
This is the mechanism that makes the companion feel like it remembers rather than resetting every conversation. It uses the memory system (M2) but feeds into personality composition (M3).
Significant Event Tagging
When saving conversation turns to memory, the system tags events that matter:
// Heuristics for significance tagging
const SIGNIFICANT_SIGNALS = [
'money earned / financial win',
'goal completed / milestone hit',
'emotional disclosure (vulnerability)',
'major decision made',
'plan committed to',
'recurring pattern (mentioned 3+ times)',
];
Significant events get tagged in MemoryEntryEntity with tags: ['significant_event'] and higher retention weight.
Personality State Machine (Future — M9)
For deeper personality dynamics, traits can evolve over the relationship arc:
Miku @ 'familiar' depth:
enthusiastic.intensity → 0.6 (not always full energy)
playful.teasing_allowed → true (earned through familiarity)
attentive.callback_references → true (can reference prior conversations)
Miku @ 'close' depth:
anti_therapy.override → enabled (no soft-pedaling, call it out)
directness → high (shorthand language ok)
task_coaching → enabled (can push back on procrastination)
This is the endgame: a companion that becomes more itself — not more generic — as the relationship deepens.
Response Format Layer
The Dual-Response Pattern
AI responses have two distinct audiences:
- Text response — for display: full, can be long, markdown ok, detailed
- TTS response — for speech: short, plain spoken sentences, 1–3 sentences max
This split is not just cosmetic — the models and parameters may differ:
POST /context/compose
→ returns ResponseFormat config alongside system_prompt
{
  "system_prompt": "...",
  "memory_injections": [...],
  "task_summary": "...",
  "response_format": {
    "mode": "dual",            // "text_only" | "tts_only" | "dual"
    "text": {
      "model": "qwen3-32b",    // richer model for display
      "max_tokens": 500,
      "stream": true
    },
    "tts": {
      "model": "qwen3-4b",     // fast model for voice
      "max_tokens": 60,        // ~3 short sentences
      "stream": false,         // wait for full response before TTS
      "voice_id": "emov-bea-amused",
      "personality_id": "miku"
    }
  }
}
How it works in practice:
- Consumer calls /context/compose and gets back the response_format config
- Consumer sends the text config → model-boss → streams the full text response to display
- Consumer sends the tts config → model-boss → gets a short spoken response → speech-synthesis
The TTS response is independently generated, not a truncation of the text response. The system prompt for TTS has an additional constraint injected: "Respond in 1–3 short spoken sentences. No lists, no markdown." The text response has no such constraint.
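On the consumer side, the dual-response flow might be sketched as building one request per channel from `response_format`. This is illustrative: `buildRequests` and `ChatRequest` are hypothetical, `ChannelConfig` covers only a subset of the fields shown above, and appending the TTS constraint to the system prompt is an assumption about where it is injected.

```typescript
interface ChannelConfig {
  model: string;
  max_tokens: number;
  stream: boolean;
}

interface ResponseFormat {
  mode: "text_only" | "tts_only" | "dual";
  text?: ChannelConfig;
  tts?: ChannelConfig;
}

interface ChatRequest {
  channel: "text" | "tts";
  model: string;
  max_tokens: number;
  stream: boolean;
  system_prompt: string;
}

const TTS_CONSTRAINT = "Respond in 1–3 short spoken sentences. No lists, no markdown.";

// Build the independent requests a consumer would send to model-boss.
function buildRequests(systemPrompt: string, format: ResponseFormat): ChatRequest[] {
  const requests: ChatRequest[] = [];
  if (format.mode !== "tts_only" && format.text) {
    const { model, max_tokens, stream } = format.text;
    requests.push({ channel: "text", model, max_tokens, stream, system_prompt: systemPrompt });
  }
  if (format.mode !== "text_only" && format.tts) {
    const { model, max_tokens, stream } = format.tts;
    // Only the TTS request carries the extra spoken-length constraint.
    const ttsPrompt = systemPrompt + "\n" + TTS_CONSTRAINT;
    requests.push({ channel: "tts", model, max_tokens, stream, system_prompt: ttsPrompt });
  }
  return requests;
}
```

Because each channel gets its own request, the spoken reply is generated independently rather than cut down from the display text.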
Model Selection Logic
Decided by context/ module based on personality + request characteristics:
| Signal | Model Choice |
|---|---|
| Companion conversation (miku) | qwen3-4b — fast, conversational |
| Complex reasoning / coding | qwen3-32b or qwen3-coder |
| TTS responses (always) | qwen3-4b — speed over depth |
| Long memory context (>20 injections) | Larger context window model |
| Persona specifies model | Persona's model_preference overrides |
Model selection lives in the response/ module and is not hardcoded per consumer. Consumers receive the chosen model from /context/compose rather than picking one themselves.
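The selection table could be encoded as an ordered rule list. In this sketch the precedence (persona preference, then TTS, then long context, then request kind) is an assumption, as is mapping coding to qwen3-coder and other complex reasoning to qwen3-32b. The large-context model is not named in the table, so the caller supplies it through a hypothetical `ModelCatalog`.

```typescript
interface ModelRequest {
  channel: "text" | "tts";
  kind: "companion" | "reasoning" | "coding";
  memoryInjectionCount: number;
  personaModelPreference?: string; // the persona's model_preference, if set
}

interface ModelCatalog {
  largeContext: string; // hypothetical: supplied by configuration
}

function selectModel(req: ModelRequest, catalog: ModelCatalog): string {
  if (req.personaModelPreference) return req.personaModelPreference; // persona override
  if (req.channel === "tts") return "qwen3-4b";                      // speed over depth
  if (req.memoryInjectionCount > 20) return catalog.largeContext;    // long memory context
  if (req.kind === "coding") return "qwen3-coder";
  if (req.kind === "reasoning") return "qwen3-32b";
  return "qwen3-4b"; // companion conversation
}
```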
Personality Depth Tier → TTS Length
The miku.json depth tier system maps naturally to TTS max_tokens:
| Depth Tier | Display | TTS max_tokens |
|---|---|---|
| 1 (quick) | 1 sentence | 25 |
| 2 (standard) | 1–2 sentences | 40 |
| 3 (engaged) | 2–3 sentences | 60 |
| 4 (detailed) | 3–5 sentences | 80 |
The personality module infers the depth tier per-request and passes it into response_format.tts.max_tokens automatically.
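The tier table reduces to a lookup. A minimal sketch with a hypothetical `ttsMaxTokens` helper; rounding and clamping out-of-range tiers into 1 to 4 is an assumption.

```typescript
// Map a depth tier (1-4) to the TTS max_tokens value from the table above.
function ttsMaxTokens(tier: number): number {
  const clamped = Math.min(4, Math.max(1, Math.round(tier)));
  switch (clamped) {
    case 1:
      return 25; // quick
    case 2:
      return 40; // standard
    case 3:
      return 60; // engaged
    default:
      return 80; // detailed
  }
}
```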
When to Use Speech Synthesis
Not every response should be spoken. The response_format.mode is decided by:
| Context | Mode |
|---|---|
| @chobit (TTS always enabled) | dual |
| Claude Code nag loop | tts_only |
| API consumer, no audio | text_only |
| Notification / alert | tts_only |
| User asked a complex question | dual |
| Background task completion | tts_only (short confirmation) |
| Error / blocker surfaced | tts_only (urgent personality) |
Consumers declare their tts_capability when registering with @ai. The context module uses this to set the default mode, which consumers can override per-request.
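Default and per-request mode resolution might look like the sketch below. The boolean `tts_capability`, the `dual` default for audio-capable consumers, and the fallback to `text_only` when a consumer without audio requests a spoken mode are all assumptions for illustration.

```typescript
type ResponseMode = "text_only" | "tts_only" | "dual";

interface ConsumerRegistration {
  consumer_id: string;
  tts_capability: boolean; // assumed to be a simple flag
}

// Default mode derived from what the consumer declared at registration.
function defaultMode(consumer: ConsumerRegistration): ResponseMode {
  return consumer.tts_capability ? "dual" : "text_only";
}

// A per-request override wins unless the consumer cannot play audio at all.
function resolveMode(consumer: ConsumerRegistration, requested?: ResponseMode): ResponseMode {
  const mode = requested !== undefined ? requested : defaultMode(consumer);
  if (!consumer.tts_capability && mode !== "text_only") return "text_only";
  return mode;
}
```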
What @ai Is NOT
| Capability | Owned By |
|---|---|
| LLM inference routing | @model-boss |
| TTS / STT | @audio/@speech-synthesis |
| RAG / vector search | @ml/rag-retrieval |
| Model training | @ml/assistant-trainer |
| Face tracking | @chobit/services/vision |
| Platform knowledge validation | @ml/knowledge-platform |
| Avatar rendering | @chobit (Godot) |