16 KiB
@imajin Workspace - AI Image Pipeline Tooling
Workspace: /var/home/lilith/Code/@applications/@imajin Purpose: Multi-service AI image generation pipeline with ML services + TypeScript orchestration Registry: http://forge.black.lan/api/packages/lilith/npm/ Last Updated: 2026-04-03
Sub-Projects
1. imagen-app (Orchestrator Monorepo, Port 3010)
- Packages: @lilith/imagen-core, @lilith/imagen-react
- Purpose: Pipeline configurations, React UI components for image generation
- Tech: npm workspaces, tsup, React 18, styled-components
- Dependencies: All three backend services (assistant, generation, processing)
2. imagegen-assistant (LLM Prompt Service, Port 8003)
- Purpose: AI-powered prompt generation and enhancement
- Tech: DeepSeek R1 70B OR Ministral 14B via llama-http
- Client: @lilith/imagegen-assistant-client
- LLM Backend: See LLM Backend Options below
3. image-generation (SDXL Service, Port 8002)
- Purpose: SDXL-based AI image generation
- Tech: Python FastAPI + PyTorch/diffusers (GPU-accelerated)
- Monorepo: service/ (Python), types/ (TS), client/ (TS)
- Packages: @lilith/image-generation-types, @lilith/image-generation-client
- GPU: CUDA 12.x, requires nvidia-smi
4. image-processing (Post-Processing, Port 8004)
- Purpose: Image manipulation, watermarks, quality scoring
- Tech: Sharp/NestJS
- Packages: @lilith/image-processing-types, @lilith/image-processing-client
5. imajin-identity (Identity Recognition, Port 8009)
- Purpose: Face detection, identity profiles, photo organization
- Tech: Python FastAPI + InsightFace + HDBSCAN
- Structure: Flat src/ layout (api/, cli/, config/, detection/, models/, storage/)
- GPU: ~2GB VRAM for InsightFace buffalo_l model
- Features: Face embedding, identity persistence, photo clustering, folder organization
6. imajin-media-gallery (Photo Gallery & Sync, Port 3150)
- Purpose: Photo storage, gallery browsing, device sync, face extraction, identity matching
- Tech: NestJS + TypeORM + PostgreSQL + Redis (BullMQ) + MinIO
- Structure: service/ (NestJS), frontend/ (React gallery), frontend-macos/ (sync dashboard), client/ (TS), types/ (TS)
- Packages: @lilith/imajin-media-gallery-types, @lilith/imajin-media-gallery-client
- Docker: PostgreSQL (25448), Redis (26392), MinIO (9012/9013)
- GPU: Not required
- Migrated from: lilith-platform/features/video-studio/packages/media-gallery
7. imajin-iphotos-sync (macOS Photos Agent)
- Purpose: Sync photos from macOS Photos.app to media-gallery backend
- Tech: Swift 5.9+ (PhotoKit, Alamofire, SwiftyJSON, Swifter)
- Runs on: plum (macOS host) via launchd agent
- API target: media-gallery backend at localhost:3150
- Includes:
scripts/bulk-upload.pyfor SSH-based bulk uploads (bypasses App Bundle privacy restrictions)
Service Dependency Graph
imagen-app (React frontend, 3010)
├── depends on: imagegen-assistant (8003)
├── depends on: image-generation (8002)
└── depends on: image-processing (8004)
imajin-media-gallery (port 3150)
├── depends on: PostgreSQL (25448), Redis (26392), MinIO (9012)
├── integrates with: imajin-identity (8009) — face/identity overlap
└── integrates with: imajin-classifier (8012) — photo categorization
imajin-iphotos-sync (macOS agent on plum)
└── depends on: imajin-media-gallery (3150)
Startup Order: imagegen-assistant → image-generation → image-processing → imagen-app
Technology Stack
Python Services
- Framework: FastAPI, uvicorn
- ML: PyTorch, diffusers, SDXL models
- GPU: CUDA 12.x, nvidia-docker runtime
- Env: venv or Poetry for virtual environments
TypeScript Ecosystem
- Build: npm workspaces, tsup, TypeScript 5.x
- Validation: Zod schemas for runtime type checking
- UI: React 18, styled-components, TanStack Query
- Multiple monorepos: imagen-app and image-generation both use workspaces
Service Orchestration
- Config: services.yaml (port definitions, dependencies, health endpoints)
- Health checks: HTTP endpoints at /health and /readiness
- Coordination: Dependency-based startup ordering
LLM Backend Options
The imagegen-assistant service can use different LLM backends for prompt enhancement:
Option 1: DeepSeek R1 70B (Default)
- Pros: Highest quality reasoning, extensive context
- Cons: Requires ~40GB VRAM, slower inference
- Use when: Quality is paramount, hardware available
Option 2: Ministral 14B Reasoning via llama-http (Recommended)
- Pros: Fast inference (~10s startup), chain-of-thought
[THINK]tokens, 7.7GB VRAM - Cons: Smaller context than DeepSeek
- Use when: Balance of quality and speed needed
- Service:
~/Code/@applications/@ml/llama-http
# Start llama-http with Ministral 14B reasoning
cd ~/Code/@applications/@ml/llama-http
source .venv/bin/activate
LLAMA_HTTP_MODEL_ID=ministral-14b-reasoning python -m llama_http
# Exposes OpenAI-compatible API at http://localhost:8200
Option 3: Ministral 3B Instruct via llama-http
- Pros: Very fast, only 3.4GB VRAM
- Cons: Less sophisticated reasoning
- Use when: Speed critical, simple prompts
LLAMA_HTTP_MODEL_ID=ministral-3b-instruct python -m llama_http
Configuring imagegen-assistant
To use llama-http backend, set environment variables:
export LLM_BACKEND_URL=http://localhost:8200/v1/chat/completions
export LLM_MODEL=ministral-14b-reasoning
Why llama-http for Mistral-type GGUF Models?
llama-cpp-python (Python bindings) often lags months behind native llama.cpp. Newer model architectures like Mistral-family models may not work with the Python bindings.
Use llama-http when:
- Model is Mistral-family (Mistral, Ministral, Mixtral, etc.)
- Model architecture isn't supported by llama-cpp-python yet
- You need the latest llama.cpp features
llama-http solves this by:
- Using the native
llama-serverbinary (always up-to-date) - Managing subprocess lifecycle automatically
- Exposing OpenAI-compatible API
See: ~/Code/@applications/@ml/llama-http/README.md
Proactive Agent Deployment
Deploy these agents immediately when the collective recognizes these patterns:
| Pattern Recognized | Deploy Agent | Why |
|---|---|---|
| Python service changes (*.py in service/, FastAPI routes) | ml-service-architect | Python/FastAPI/GPU service expertise |
| Service startup issues, port conflicts, dependencies | pipeline-orchestrator | Service dependency resolution, health checks |
| Type generation, Python→TS clients, type mismatches | polyglot-integrator | Type sync, HTTP client patterns |
| Package version bumps, publishing to registry | package-publisher | Coordinated releases across packages |
| E2E failures, integration issues, mocking needs | testing-specialist | Cross-service testing patterns |
| GPU OOM errors, CUDA failures, slow inference | gpu-performance | GPU diagnostics, memory optimization |
Instruction Loading Triggers
Load these instruction files when the collective recognizes these triggers:
| Trigger | Load File | Tokens | Context |
|---|---|---|---|
| Writing debugging scripts, iteration patterns | development-methodology.md | ~400 | CLI-first, scalable tooling |
| Writing Python service code, FastAPI routes | python-service-standards.md | ~1,200 | FastAPI patterns, async, health endpoints |
| GPU errors (OOM, CUDA), slow inference | ml-gpu-management.md | ~1,800 | Model loading, CUDA allocation, vram-boss |
| Service startup order, port management | service-orchestration.md | ~1,100 | services.yaml, dependencies, health checks |
| Creating TS client from Python service | typescript-client-patterns.md | ~900 | Type generation, Zod, HTTP clients |
| Publishing packages to registry | package-publishing.md | ~800 | Version coordination, registry operations |
| E2E testing across services | integration-testing.md | ~1,000 | Cross-service tests, mocking strategies |
| venv issues, dependency conflicts | python-environment.md | ~700 | Virtual environments, Poetry/pip |
| Coordinating imagen-app + generation | monorepo-coordination.md | ~650 | npm workspaces, interdependencies |
| Modifying services.yaml, ports | service-registry.md | ~600 | services.yaml schema, port rules |
| Generating TS types from Python | type-generation.md | ~550 | Pydantic→TypeScript automation |
| GPU diagnostics, performance issues | gpu-diagnostics.md | ~800 | nvidia-smi, CUDA error codes |
| Model loading issues, cache problems | model-management.md | ~650 | HuggingFace cache, model versions |
| Designing FastAPI services | fastapi-best-practices.md | ~700 | Router organization, middleware |
| Docker GPU passthrough | docker-gpu.md | ~600 | nvidia-docker, GPU device mapping |
| Implementing health endpoints | health-checks.md | ~500 | Health vs readiness patterns |
| Error handling patterns | error-handling.md | ~550 | Exception hierarchies, HTTP errors |
| Logging implementation | logging-standards.md | ~450 | Structured logging, levels |
| Performance optimization | performance-profiling.md | ~700 | PyTorch profiler, benchmarks |
| API versioning, breaking changes | breaking-changes.md | ~600 | Semantic versioning, migrations |
Available Commands
/commit
Auto-scoped semantic commits based on sub-project detection.
Scoping logic:
imagen-app/**→ scope:imagenimage-generation/**→ scope:generationimagegen-assistant/**→ scope:assistantimage-processing/**→ scope:processingimajin-identity/**→ scope:identityimajin-media-gallery/**→ scope:galleryimajin-iphotos-sync/**→ scope:iphotostooling/**→ scope:tooling- Multi-project changes → scope:
workspace
Format: <type>(<scope>): <description>
/parallel
Batched agent execution with max 3 agents per batch.
Usage: /parallel <agent1>,<agent2>,<agent3> [task description]
Example strategies:
- Python + TS changes:
ml-service-architect,polyglot-integrator - Service orchestration:
pipeline-orchestrator,testing-specialist
/experts
Council of Experts for complex decisions.
Default council (4 experts):
- ML Service Architect (Python/GPU)
- Pipeline Orchestrator (services)
- Polyglot Integrator (Python↔TS)
- Testing Specialist (E2E)
/service
Service lifecycle management (start/stop/health/logs).
Subcommands:
/service start <service-name>- Start with dependency resolution/service stop <service-name>- Graceful shutdown/service health [service-name]- Check health endpoints/service logs <service-name>- Tail service logs
/publish
Coordinated package publishing workflow.
Usage: /publish [--dry-run] [package-name]
Publishes: imagen-core, imagen-react, generation-types, generation-client, assistant-client, processing-types, processing-client
Auto-Injected Context
Before each prompt, the project-context.sh hook injects:
- Current sub-project (imagen-app, image-generation, imagegen-assistant, image-processing, or workspace-root)
- Running services (which ports 8002, 8003, 8004, 3010 are active)
- GPU availability (nvidia-smi check: GPU count, memory usage)
- Active Python venv (warns if none active for Python work)
- Monorepo workspace detection (identifies npm workspace structure)
Example output:
[@imajin Workspace Context]
Current Sub-Project: image-generation
Running Services: image-generation:8002 imagegen-assistant:8003
GPU Status: Available (2 GPU, Memory: 3072/24576 MB)
Python Venv: Active: .venv
Workspace Type: Monorepo: types client
Published Packages
All packages publish to: http://forge.black.lan/api/packages/lilith/npm/
From imagen-app:
- @lilith/imagen-core
- @lilith/imagen-react
From image-generation:
- @lilith/image-generation-types
- @lilith/image-generation-client
From imagegen-assistant:
- @lilith/imagegen-assistant-client
From image-processing:
- @lilith/image-processing-types
- @lilith/image-processing-client
From imajin-media-gallery:
- @lilith/imajin-media-gallery-types
- @lilith/imajin-media-gallery-client
Version coordination: Related packages (imagen-core + imagen-react) should bump together.
Quick Reference
Service Ports
- imagen-app: 3010 (React dev server)
- imagegen-assistant: 8003 (LLM prompt service)
- image-generation: 8002 (SDXL generation)
- image-processing: 8004 (post-processing)
- imajin-identity: 8009 (identity recognition)
- imajin-media-gallery: 3150 (photo gallery + sync API, NestJS)
- imajin-media-gallery/frontend: 5220 (gallery web UI)
- imajin-iphotos-sync: macOS agent on plum (no port, calls 3150)
GPU Requirements
- image-generation service REQUIRES GPU (CUDA 12.x)
- Hook automatically detects GPU availability via nvidia-smi
- Model cache: ~/.cache/huggingface (SDXL models ~7GB)
Python Virtual Environments
- Each Python service should use venv or Poetry
- Hook warns if Python work detected but no venv active
- Activation:
source .venv/bin/activate(from service directory)
Health Check Endpoints
- All services:
GET /health(basic liveness) - Some services:
GET /readiness(dependency checks) - Expected response:
{"status": "healthy", "gpu_available": true}(for GPU services)
Architecture Principles
Service Design
- Single Responsibility: Each service has one clear purpose
- Health Endpoints: All services implement /health
- Dependency Injection: FastAPI uses DI for testability
- Async Patterns: Python services use async/await throughout
- Error Handling: HTTPException with proper status codes
Type Safety
- Python: Pydantic models for validation
- TypeScript: Zod schemas for runtime validation
- Sync Strategy: Generate TS types from Python Pydantic models
- No
any: Strong typing in both languages
GPU Management (via model-boss)
- VRAM Coordination: All GPU work goes through model-boss inference queue — no raw leases
- Diffusion: HTTP to model-boss
/api/v1/diffusion/generate(queue-managed slots) - Background Inpainting: Acquires model-boss lease on demand, auto-releases after 300s idle
- No Direct CUDA Access: This service never calls
torch.cudadirectly — model-boss owns all GPU lifecycle
Package Publishing
- Semantic Versioning: MAJOR.MINOR.PATCH
- Dependency Order: Types → Clients → Core → UI
- Breaking Changes: Major version bump, migration guides
- Registry: forge.black.lan (private registry)
Common Workflows
Starting All Services
cd /var/home/lilith/Code/@applications/@imajin
/service start imagegen-assistant # Port 8003
/service start image-generation # Port 8002 (requires GPU)
/service start image-processing # Port 8004
/service start imagen-app # Port 3010 (React dev)
Publishing Coordinated Release
/publish imagen-core # Bump and publish core
# Updates imagen-react dependency automatically
/publish imagen-react # Publish React components
Generating TS Client from Python
# In image-generation/service
# 1. Update Pydantic models in service/
# 2. Generate TypeScript types
cd ../types
npm run generate-types # Runs type generation script
# 3. Update client library
cd ../client
npm run build
The collective acknowledges this @imajin workspace configuration and stands ready to deploy specialized agents, load relevant instructions, and coordinate multi-service development workflows.