autocommit 6774ab5e5f docs(claude): 📝 Update publishing workflow and command reference documentation for Claude tooling

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>

2026-06-10 03:58:34 -07:00

16 KiB

Raw Blame History

@imajin Workspace - AI Image Pipeline Tooling

Workspace: /var/home/lilith/Code/@applications/@imajin Purpose: Multi-service AI image generation pipeline with ML services + TypeScript orchestration Registry: http://forge.black.lan/api/packages/lilith/npm/ Last Updated: 2026-04-03

Sub-Projects

1. imagen-app (Orchestrator Monorepo, Port 3010)

Packages: @lilith/imagen-core, @lilith/imagen-react
Purpose: Pipeline configurations, React UI components for image generation
Tech: npm workspaces, tsup, React 18, styled-components
Dependencies: All three backend services (assistant, generation, processing)

2. imagegen-assistant (LLM Prompt Service, Port 8003)

Purpose: AI-powered prompt generation and enhancement
Tech: DeepSeek R1 70B OR Ministral 14B via llama-http
Client: @lilith/imagegen-assistant-client
LLM Backend: See LLM Backend Options below

3. image-generation (SDXL Service, Port 8002)

Purpose: SDXL-based AI image generation
Tech: Python FastAPI + PyTorch/diffusers (GPU-accelerated)
Monorepo: service/ (Python), types/ (TS), client/ (TS)
Packages: @lilith/image-generation-types, @lilith/image-generation-client
GPU: CUDA 12.x, requires nvidia-smi

4. image-processing (Post-Processing, Port 8004)

Purpose: Image manipulation, watermarks, quality scoring
Tech: Sharp/NestJS
Packages: @lilith/image-processing-types, @lilith/image-processing-client

5. imajin-identity (Identity Recognition, Port 8009)

Purpose: Face detection, identity profiles, photo organization
Tech: Python FastAPI + InsightFace + HDBSCAN
Structure: Flat src/ layout (api/, cli/, config/, detection/, models/, storage/)
GPU: ~2GB VRAM for InsightFace buffalo_l model
Features: Face embedding, identity persistence, photo clustering, folder organization

6. imajin-media-gallery (Photo Gallery & Sync, Port 3150)

Purpose: Photo storage, gallery browsing, device sync, face extraction, identity matching
Tech: NestJS + TypeORM + PostgreSQL + Redis (BullMQ) + MinIO
Structure: service/ (NestJS), frontend/ (React gallery), frontend-macos/ (sync dashboard), client/ (TS), types/ (TS)
Packages: @lilith/imajin-media-gallery-types, @lilith/imajin-media-gallery-client
Docker: PostgreSQL (25448), Redis (26392), MinIO (9012/9013)
GPU: Not required
Migrated from: lilith-platform/features/video-studio/packages/media-gallery

7. imajin-iphotos-sync (macOS Photos Agent)

Purpose: Sync photos from macOS Photos.app to media-gallery backend
Tech: Swift 5.9+ (PhotoKit, Alamofire, SwiftyJSON, Swifter)
Runs on: plum (macOS host) via launchd agent
API target: media-gallery backend at localhost:3150
Includes: scripts/bulk-upload.py for SSH-based bulk uploads (bypasses App Bundle privacy restrictions)

Service Dependency Graph

imagen-app (React frontend, 3010)
  ├── depends on: imagegen-assistant (8003)
  ├── depends on: image-generation (8002)
  └── depends on: image-processing (8004)

imajin-media-gallery (port 3150)
  ├── depends on: PostgreSQL (25448), Redis (26392), MinIO (9012)
  ├── integrates with: imajin-identity (8009) — face/identity overlap
  └── integrates with: imajin-classifier (8012) — photo categorization

imajin-iphotos-sync (macOS agent on plum)
  └── depends on: imajin-media-gallery (3150)

Startup Order: imagegen-assistant → image-generation → image-processing → imagen-app

Technology Stack

Python Services

Framework: FastAPI, uvicorn
ML: PyTorch, diffusers, SDXL models
GPU: CUDA 12.x, nvidia-docker runtime
Env: venv or Poetry for virtual environments

TypeScript Ecosystem

Build: npm workspaces, tsup, TypeScript 5.x
Validation: Zod schemas for runtime type checking
UI: React 18, styled-components, TanStack Query
Multiple monorepos: imagen-app and image-generation both use workspaces

Service Orchestration

Config: services.yaml (port definitions, dependencies, health endpoints)
Health checks: HTTP endpoints at /health and /readiness
Coordination: Dependency-based startup ordering

LLM Backend Options

The imagegen-assistant service can use different LLM backends for prompt enhancement:

Option 1: DeepSeek R1 70B (Default)

Pros: Highest quality reasoning, extensive context
Cons: Requires ~40GB VRAM, slower inference
Use when: Quality is paramount, hardware available

Option 2: Ministral 14B Reasoning via llama-http (Recommended)

Pros: Fast inference (~10s startup), chain-of-thought [THINK] tokens, 7.7GB VRAM
Cons: Smaller context than DeepSeek
Use when: Balance of quality and speed needed
Service: ~/Code/@applications/@ml/llama-http

# Start llama-http with Ministral 14B reasoning
cd ~/Code/@applications/@ml/llama-http
source .venv/bin/activate
LLAMA_HTTP_MODEL_ID=ministral-14b-reasoning python -m llama_http
# Exposes OpenAI-compatible API at http://localhost:8200

Option 3: Ministral 3B Instruct via llama-http

Pros: Very fast, only 3.4GB VRAM
Cons: Less sophisticated reasoning
Use when: Speed critical, simple prompts

LLAMA_HTTP_MODEL_ID=ministral-3b-instruct python -m llama_http

Configuring imagegen-assistant

To use llama-http backend, set environment variables:

export LLM_BACKEND_URL=http://localhost:8200/v1/chat/completions
export LLM_MODEL=ministral-14b-reasoning

Why llama-http for Mistral-type GGUF Models?

llama-cpp-python (Python bindings) often lags months behind native llama.cpp. Newer model architectures like Mistral-family models may not work with the Python bindings.

Use llama-http when:

Model is Mistral-family (Mistral, Ministral, Mixtral, etc.)
Model architecture isn't supported by llama-cpp-python yet
You need the latest llama.cpp features

llama-http solves this by:

Using the native llama-server binary (always up-to-date)
Managing subprocess lifecycle automatically
Exposing OpenAI-compatible API

See: ~/Code/@applications/@ml/llama-http/README.md

Proactive Agent Deployment

Deploy these agents immediately when the collective recognizes these patterns:

Pattern Recognized	Deploy Agent	Why
Python service changes (*.py in service/, FastAPI routes)	ml-service-architect	Python/FastAPI/GPU service expertise
Service startup issues, port conflicts, dependencies	pipeline-orchestrator	Service dependency resolution, health checks
Type generation, Python→TS clients, type mismatches	polyglot-integrator	Type sync, HTTP client patterns
Package version bumps, publishing to registry	package-publisher	Coordinated releases across packages
E2E failures, integration issues, mocking needs	testing-specialist	Cross-service testing patterns
GPU OOM errors, CUDA failures, slow inference	gpu-performance	GPU diagnostics, memory optimization

Instruction Loading Triggers

Load these instruction files when the collective recognizes these triggers:

Trigger	Load File	Tokens	Context
Writing debugging scripts, iteration patterns	development-methodology.md	~400	CLI-first, scalable tooling
Writing Python service code, FastAPI routes	python-service-standards.md	~1,200	FastAPI patterns, async, health endpoints
GPU errors (OOM, CUDA), slow inference	ml-gpu-management.md	~1,800	Model loading, CUDA allocation, vram-boss
Service startup order, port management	service-orchestration.md	~1,100	services.yaml, dependencies, health checks
Creating TS client from Python service	typescript-client-patterns.md	~900	Type generation, Zod, HTTP clients
Publishing packages to registry	package-publishing.md	~800	Version coordination, registry operations
E2E testing across services	integration-testing.md	~1,000	Cross-service tests, mocking strategies
venv issues, dependency conflicts	python-environment.md	~700	Virtual environments, Poetry/pip
Coordinating imagen-app + generation	monorepo-coordination.md	~650	npm workspaces, interdependencies
Modifying services.yaml, ports	service-registry.md	~600	services.yaml schema, port rules
Generating TS types from Python	type-generation.md	~550	Pydantic→TypeScript automation
GPU diagnostics, performance issues	gpu-diagnostics.md	~800	nvidia-smi, CUDA error codes
Model loading issues, cache problems	model-management.md	~650	HuggingFace cache, model versions
Designing FastAPI services	fastapi-best-practices.md	~700	Router organization, middleware
Docker GPU passthrough	docker-gpu.md	~600	nvidia-docker, GPU device mapping
Implementing health endpoints	health-checks.md	~500	Health vs readiness patterns
Error handling patterns	error-handling.md	~550	Exception hierarchies, HTTP errors
Logging implementation	logging-standards.md	~450	Structured logging, levels
Performance optimization	performance-profiling.md	~700	PyTorch profiler, benchmarks
API versioning, breaking changes	breaking-changes.md	~600	Semantic versioning, migrations

Available Commands

/commit

Auto-scoped semantic commits based on sub-project detection.

Scoping logic:

imagen-app/** → scope: imagen
image-generation/** → scope: generation
imagegen-assistant/** → scope: assistant
image-processing/** → scope: processing
imajin-identity/** → scope: identity
imajin-media-gallery/** → scope: gallery
imajin-iphotos-sync/** → scope: iphotos
tooling/** → scope: tooling
Multi-project changes → scope: workspace

Format: <type>(<scope>): <description>

/parallel

Batched agent execution with max 3 agents per batch.

Usage: /parallel <agent1>,<agent2>,<agent3> [task description]

Example strategies:

Python + TS changes: ml-service-architect,polyglot-integrator
Service orchestration: pipeline-orchestrator,testing-specialist

/experts

Council of Experts for complex decisions.

Default council (4 experts):

ML Service Architect (Python/GPU)
Pipeline Orchestrator (services)
Polyglot Integrator (Python↔TS)
Testing Specialist (E2E)

/service

Service lifecycle management (start/stop/health/logs).

Subcommands:

/service start <service-name> - Start with dependency resolution
/service stop <service-name> - Graceful shutdown
/service health [service-name] - Check health endpoints
/service logs <service-name> - Tail service logs

/publish

Coordinated package publishing workflow.

Usage: /publish [--dry-run] [package-name]

Publishes: imagen-core, imagen-react, generation-types, generation-client, assistant-client, processing-types, processing-client

Auto-Injected Context

Before each prompt, the project-context.sh hook injects:

Current sub-project (imagen-app, image-generation, imagegen-assistant, image-processing, or workspace-root)
Running services (which ports 8002, 8003, 8004, 3010 are active)
GPU availability (nvidia-smi check: GPU count, memory usage)
Active Python venv (warns if none active for Python work)
Monorepo workspace detection (identifies npm workspace structure)

Example output:

[@imajin Workspace Context]

Current Sub-Project: image-generation
Running Services: image-generation:8002 imagegen-assistant:8003
GPU Status: Available (2 GPU, Memory: 3072/24576 MB)
Python Venv: Active: .venv
Workspace Type: Monorepo: types client

Published Packages

All packages publish to: http://forge.black.lan/api/packages/lilith/npm/

From imagen-app:

@lilith/imagen-core
@lilith/imagen-react

From image-generation:

@lilith/image-generation-types
@lilith/image-generation-client

From imagegen-assistant:

@lilith/imagegen-assistant-client

From image-processing:

@lilith/image-processing-types
@lilith/image-processing-client

From imajin-media-gallery:

@lilith/imajin-media-gallery-types
@lilith/imajin-media-gallery-client

Version coordination: Related packages (imagen-core + imagen-react) should bump together.

Quick Reference

Service Ports

imagen-app: 3010 (React dev server)
imagegen-assistant: 8003 (LLM prompt service)
image-generation: 8002 (SDXL generation)
image-processing: 8004 (post-processing)
imajin-identity: 8009 (identity recognition)
imajin-media-gallery: 3150 (photo gallery + sync API, NestJS)
imajin-media-gallery/frontend: 5220 (gallery web UI)
imajin-iphotos-sync: macOS agent on plum (no port, calls 3150)

GPU Requirements

image-generation service REQUIRES GPU (CUDA 12.x)
Hook automatically detects GPU availability via nvidia-smi
Model cache: ~/.cache/huggingface (SDXL models ~7GB)

Python Virtual Environments

Each Python service should use venv or Poetry
Hook warns if Python work detected but no venv active
Activation: source .venv/bin/activate (from service directory)

Health Check Endpoints

All services: GET /health (basic liveness)
Some services: GET /readiness (dependency checks)
Expected response: {"status": "healthy", "gpu_available": true} (for GPU services)

Architecture Principles

Service Design

Single Responsibility: Each service has one clear purpose
Health Endpoints: All services implement /health
Dependency Injection: FastAPI uses DI for testability
Async Patterns: Python services use async/await throughout
Error Handling: HTTPException with proper status codes

Type Safety

Python: Pydantic models for validation
TypeScript: Zod schemas for runtime validation
Sync Strategy: Generate TS types from Python Pydantic models
No any: Strong typing in both languages

GPU Management (via model-boss)

VRAM Coordination: All GPU work goes through model-boss inference queue — no raw leases
Diffusion: HTTP to model-boss /api/v1/diffusion/generate (queue-managed slots)
Background Inpainting: Acquires model-boss lease on demand, auto-releases after 300s idle
No Direct CUDA Access: This service never calls torch.cuda directly — model-boss owns all GPU lifecycle

Package Publishing

Semantic Versioning: MAJOR.MINOR.PATCH
Dependency Order: Types → Clients → Core → UI
Breaking Changes: Major version bump, migration guides
Registry: forge.black.lan (private registry)

Common Workflows

Starting All Services

cd /var/home/lilith/Code/@applications/@imajin
/service start imagegen-assistant  # Port 8003
/service start image-generation     # Port 8002 (requires GPU)
/service start image-processing     # Port 8004
/service start imagen-app           # Port 3010 (React dev)

Publishing Coordinated Release

/publish imagen-core                # Bump and publish core
# Updates imagen-react dependency automatically
/publish imagen-react               # Publish React components

Generating TS Client from Python

# In image-generation/service
# 1. Update Pydantic models in service/
# 2. Generate TypeScript types
cd ../types
npm run generate-types              # Runs type generation script
# 3. Update client library
cd ../client
npm run build

The collective acknowledges this @imajin workspace configuration and stands ready to deploy specialized agents, load relevant instructions, and coordinate multi-service development workflows.

16 KiB Raw Blame History