Operator guide
Documentation for teams deploying and operating the KiCI orchestrator and agent on their own infrastructure. These are the customer-deployed tiers of the three-tier architecture — the orchestrator (Tier 2) handles trigger matching and job dispatch, while agents (Tier 3) clone repos and execute workflow steps.
Quick reference
- KiCI environment variable reference — auto-generated catalog of the env vars shared across the orchestrator, agent, and shared logger; per-service variables are documented in each service’s configuration reference. Regenerated from each service’s Zod schema by pnpm docs:env.
Orchestrator
The customer-deployable orchestrator is the execution brain. It connects to the KiCI Platform relay via WebSocket, receives forwarded webhooks, fetches lock files, matches triggers, and dispatches jobs to agents. It ships as a Docker image with three operating modes: platform, hybrid, and independent.
- Orchestrator — architecture overview and deployment planning
- Deploying the KiCI orchestrator — deployment guide for all three modes
- Orchestrator setup guide — setup wizard, migration, source config
- Config management guide — shared config lifecycle, CLI, reload, rollback
- Configuration reference — environment variables, database setup, mode-specific settings
- kici-admin CLI reference — authentication, RBAC, command reference
- Coordinator/worker deployment — worker mode, P2P setup
- Multi-orchestrator clustering — HA pair, cross-arch pool, dedicated coordinator recipes
- Auto-scaler — Docker, bare-metal, and Firecracker scaler backends, label matching, warm pools
- Firecracker setup guide — Firecracker microVM host setup, YAML config, networking, rootfs, troubleshooting
- Firecracker host setup — host prerequisites, capabilities, network setup, jailer, scaler config
- Firecracker rootfs build guide — build script, kernel config, troubleshooting
Agent
The customer-deployable agent is the execution tier. It connects to the orchestrator via WebSocket, receives job dispatches, clones repositories, and runs workflow steps. It ships as a Docker image with label-based job routing.
- Getting started — deployment with Docker, Docker Compose, and Kubernetes
- Configuration reference — environment variables, labels, Docker executor setup
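As a rough illustration of label-based job routing, the sketch below treats an agent as eligible when it carries every label a job asks for. The `Agent` and `Job` shapes, the `requiredLabels` field, and the first-match selection are assumptions made for this example; the agent configuration reference describes the actual matching rules.

```typescript
// Hypothetical shapes: the real KiCI agent and job types may differ.
interface Agent {
  id: string;
  labels: string[];
}

interface Job {
  id: string;
  requiredLabels: string[];
}

// An agent is eligible when it carries every label the job requires.
function isEligible(agent: Agent, job: Job): boolean {
  const have = new Set(agent.labels);
  return job.requiredLabels.every((label) => have.has(label));
}

// Return the first eligible agent, or undefined when none matches.
// (First-match is an assumption; a real dispatcher may also weigh load.)
function pickAgent(agents: Agent[], job: Job): Agent | undefined {
  return agents.find((agent) => isEligible(agent, job));
}

const agents: Agent[] = [
  { id: "agent-a", labels: ["linux", "x64"] },
  { id: "agent-b", labels: ["linux", "arm64", "docker"] },
];

const job: Job = { id: "job-1", requiredLabels: ["linux", "docker"] };
const chosen = pickAgent(agents, job);
console.log(chosen?.id ?? "no eligible agent");
```

Under this model a job with no required labels matches any agent, so an unlabeled job is never stranded.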
Distribution
How KiCI packages are distributed and deployed. It covers all three distribution channels (npm packages, OCI container images, Firecracker rootfs), orchestrator deployment modes (container, systemd, launchd, Windows service), agent deployment formats, and agent runtime dependencies.
- Distribution — channels, deployment modes, runtime dependencies
- Multi-architecture builds — build script, manifests, cross-arch deployment
- Service installation guide — systemd, launchd, service management
- KiCI packaging guide — package types, distribution
Operations
- Event routing & generic webhooks — generic sources, trust, event routing config
- Source tarball and dependency caching — S3/filesystem cache setup, build flow, cache keys
- Cancel behavior — cancel config, grace periods, monitoring
- Stale run detection and failure marking — detection system config, tuning, metrics
- Environments — DB tables, Vault config, held runs, monitoring, troubleshooting
Security
- Secrets management — setup, admin API, RBAC, access rules, key rotation
- Audit log and data access tracking — three tables, dashboard tabs, CLI queries, retention, support-read flow, troubleshooting
- Agent execution security — sandbox config, isolation backends
- CI security — trust policies, identity linking, approvals
- Peer credential management — peer creds, revocation, re-join
Observability
- Monitoring & tracing — trace fields, Loki queries, health endpoints
- Observability — OTel setup, Prometheus metrics, dashboards
Troubleshooting
Operator diagnostics for runtime failures that aren’t covered elsewhere. Currently documents the SDK bundle drift diagnostic, a 3-way hash compare (agent / orchestrator / host-published SDK) that collapses the “Lock file is out of date” investigation from hours to a single log-grep.
- Troubleshooting — SDK bundle drift, hash diagnostic
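The idea behind a 3-way hash compare can be sketched as follows: given one SDK bundle hash from each of the three sources, report which side disagrees with the other two. The `BundleHashes` shape, the `diagnoseDrift` helper, and the sample hash values are hypothetical; the actual diagnostic's log format and hash algorithm are described on the Troubleshooting page, not here.

```typescript
// Hypothetical shape: one SDK bundle hash per source named in the guide.
interface BundleHashes {
  agent: string;
  orchestrator: string;
  hostPublished: string;
}

type DriftReport =
  | { drift: false }
  | { drift: true; outlier: keyof BundleHashes | "all-differ" };

// Compare the three hashes and name the side that disagrees with the other two.
function diagnoseDrift(h: BundleHashes): DriftReport {
  const { agent, orchestrator, hostPublished } = h;
  if (agent === orchestrator && orchestrator === hostPublished) {
    return { drift: false };
  }
  if (orchestrator === hostPublished) return { drift: true, outlier: "agent" };
  if (agent === hostPublished) return { drift: true, outlier: "orchestrator" };
  if (agent === orchestrator) return { drift: true, outlier: "hostPublished" };
  // No two hashes agree, so no single side can be blamed.
  return { drift: true, outlier: "all-differ" };
}

// Placeholder hash values for illustration only.
const driftReport = diagnoseDrift({
  agent: "abc123",
  orchestrator: "abc123",
  hostPublished: "def456",
});
console.log(JSON.stringify(driftReport));
```

Naming the outlier rather than only flagging a mismatch is what lets a single log line point at the component to rebuild or redeploy.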