Orchestrator Deployment Guide
The Orchestrator is a separate process that manages Docker containers for deployed applications.
Overview
The Orchestrator handles app lifecycle management:
- Listens for app.installed events -> pulls images, creates containers
- Listens for app.uninstalled events -> stops and removes containers
- Monitors container health and restarts containers as needed
┌─────────────────┐     Events      ┌──────────────────┐
│    AppServer    │ ───────────────▶│   Orchestrator   │
│   (Main API)    │                 │  (Docker Mgmt)   │
└─────────────────┘                 └────────┬─────────┘
         │                                   │
         │                                   │
         ▼                                   ▼
┌─────────────────┐                 ┌──────────────────┐
│    RabbitMQ     │                 │  Docker Daemon   │
│   (Event Bus)   │                 │    (/var/run/    │
└─────────────────┘                 │   docker.sock)   │
                                    └──────────────────┘
When to Use
The Orchestrator is REQUIRED for Docker deployments:
- Manages Docker containers for deployed applications
- Handles automatic container lifecycle management
- Provides isolated runtime environments for apps
- Responds to app installation/uninstallation events
Requirements
The Orchestrator has strict requirements:
| Requirement | Description |
|---|---|
| RabbitMQ | REQUIRED - Event bus must be enabled |
| Docker | REQUIRED - Docker daemon access via socket |
| AppServer | Main server must be running and publishing events |
The Orchestrator will fail to start if:
- APPSERVER_EVENTBUS_ENABLED is not true
- APPSERVER_DOCKER_ENABLED is not true
- Docker socket is not accessible
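The three start-up conditions above can be sketched as a single pre-flight check. This is a minimal illustration, not the Orchestrator's actual code: the function name validateStartup, the injected getenv and stat parameters, and the fallback socket path are assumptions made for the example. The error texts are borrowed from the Troubleshooting section of this guide.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// validateStartup mirrors the three documented start-up conditions.
// getenv and stat are injected so the check can run without touching the
// real environment (both parameters are illustrative, not a real API).
func validateStartup(getenv func(string) string, stat func(string) error) error {
	if getenv("APPSERVER_EVENTBUS_ENABLED") != "true" {
		return errors.New("EventBus must be enabled")
	}
	if getenv("APPSERVER_DOCKER_ENABLED") != "true" {
		return errors.New("Docker orchestration must be enabled")
	}
	socket := getenv("APPSERVER_DOCKER_SOCKET_PATH")
	if socket == "" {
		socket = "/var/run/docker.sock" // assumed fallback for this sketch
	}
	if err := stat(socket); err != nil {
		return fmt.Errorf("cannot access Docker socket %s: %w", socket, err)
	}
	return nil
}

func main() {
	// A fixed environment keeps the demonstration deterministic.
	env := map[string]string{
		"APPSERVER_EVENTBUS_ENABLED": "true",
		"APPSERVER_DOCKER_ENABLED":   "false",
	}
	getenv := func(key string) string { return env[key] }
	stat := func(path string) error {
		_, err := os.Stat(path)
		return err
	}
	// The Docker flag is not "true", so the second check fails.
	fmt.Println(validateStartup(getenv, stat))
}
```

Checking the flags before the socket means a misconfigured environment is reported even on hosts where the socket happens to exist.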
Entry Point
Binary: cmd/orchestrator/main.go
Build and run:
# Build
go build -o orchestrator ./cmd/orchestrator
# Run (requires Docker socket access)
./orchestrator
Environment Variables
Required
# Enable orchestration features
APPSERVER_DOCKER_ENABLED=true
APPSERVER_EVENTBUS_ENABLED=true
# RabbitMQ connection
APPSERVER_RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/
# Docker configuration
APPSERVER_DOCKER_SOCKET_PATH=/var/run/docker.sock
APPSERVER_DOCKER_NETWORK_NAME=appserver-network
Container Registry (for pulling images)
APPSERVER_DOCKER_REGISTRY_URL=registry.eacore6.de
APPSERVER_DOCKER_REGISTRY_USERNAME=your-registry-user
APPSERVER_DOCKER_REGISTRY_PASSWORD=your-registry-password
Resource Limits
APPSERVER_DOCKER_DEFAULT_CPU_SHARES=1024
APPSERVER_DOCKER_DEFAULT_MEMORY_MB=512
Timeouts
APPSERVER_DOCKER_PULL_TIMEOUT=10m
APPSERVER_DOCKER_START_TIMEOUT=2m
APPSERVER_DOCKER_STOP_TIMEOUT=10s
Health Monitoring
APPSERVER_DOCKER_HEALTH_CHECK_INTERVAL=30s
APPSERVER_DOCKER_HEALTH_CHECK_TIMEOUT=5s
APPSERVER_DOCKER_HEALTH_CHECK_RETRIES=3
APPSERVER_DOCKER_MAX_RESTARTS=3
Image Configuration
# Image stage/tag to pull (default: latest)
# Valid values: latest, pre-release, testing
APPSERVER_DOCKER_DEFAULT_IMAGE_STAGE=latest
Startup Reconciliation
# Timeout for startup reconciliation (default: 2m)
APPSERVER_DOCKER_RECONCILE_TIMEOUT=2m
Image Configuration
The orchestrator pulls Docker images from a configurable registry with configurable tags.
Registry URL
Set the container registry URL:
APPSERVER_DOCKER_REGISTRY_URL=registry.eacore6.de
Images are referenced as: {registry_url}/{app_name}:{stage}
Example: registry.eacore6.de/todos:latest
Image Stage/Tag
Control which image tag to pull globally:
# Default: latest
# Options: latest, pre-release, testing
APPSERVER_DOCKER_DEFAULT_IMAGE_STAGE=latest
Per-App Override: Individual apps can override the stage via the Settings API:
mutation {
updateSettings(
appID: "app-uuid"
settings: { "docker.image.stage": "testing" }
) {
success
}
}
This allows testing specific apps with pre-release builds while others remain on stable.
Startup Reconciliation
When the orchestrator starts, it automatically reconciles container state to ensure installed apps are running:
- Queries all apps with state INSTALLED
- Checks if their containers are running in Docker
- Starts any stopped containers
- Waits for health checks to pass
- Updates deployment state in database
Configuration:
# Max time for startup reconciliation (default: 2m)
APPSERVER_DOCKER_RECONCILE_TIMEOUT=2m
This ensures apps recover automatically after orchestrator restarts without waiting for health monitor timeouts.
RabbitMQ Queue Configuration
The Orchestrator uses a separate queue prefix from the AppServer to ensure reliable event delivery.
Why Separate Queues Matter
RabbitMQ uses round-robin delivery when multiple consumers share the same queue. Without separate queues:
- Both AppServer and Orchestrator would compete for the same messages
- The Orchestrator might never receive app.installed events
- Events could be processed by the wrong service
Queue Naming
| Service | Queue Prefix | Example Queue |
|---|---|---|
| AppServer | appserver.subscriber. | appserver.subscriber.app.installed |
| Orchestrator | orchestrator.subscriber. | orchestrator.subscriber.app.installed |
This configuration is set in pkg/v2/orchestrator/services.go:
rabbitmqConfig.SubscriberPrefix = "orchestrator.subscriber."
Verifying Queue Setup
Check that orchestrator queues exist with correct bindings:
# List orchestrator queues
curl -s -u guest:guest "http://localhost:15672/api/queues" | \
jq '.[] | select(.name | contains("orchestrator")) | .name'
# Check bindings for app.installed
curl -s -u guest:guest \
"http://localhost:15672/api/exchanges/%2f/appserver.events/bindings/source" | \
jq '.[] | select(.destination | contains("orchestrator"))'
Expected bindings:
- orchestrator.subscriber.app.installed bound with routing key app.installed
- orchestrator.subscriber.app.uninstalled bound with routing key app.uninstalled
Event Flow
Understanding the complete event flow helps with debugging:
┌──────────────┐    1. installApp     ┌──────────────┐
│    Client    │ ────────────────────▶│  AppServer   │
│  (GraphQL)   │                      │              │
└──────────────┘                      └──────┬───────┘
                                             │
                                    2. Update DB state
                                    3. Publish event
                                             │
                                             ▼
                                      ┌──────────────┐
                                      │   RabbitMQ   │
                                      │              │
                                      └──────┬───────┘
                                             │
                                    4. Route to queue
                                             │
                                             ▼
┌──────────────┐       6. Start       ┌──────────────┐
│    Docker    │ ◀────────────────────│ Orchestrator │
│  Container   │                      │              │
└──────────────┘                      └──────────────┘
                                             │
                                    5. Extract AppName
                                       from payload
Step-by-Step Flow
- Client Request: User calls the installApp(name: "de.easy-m.statistics") mutation
- AppServer Processing:
  - Updates app state to installed in database
  - Publishes app.installed event to RabbitMQ
- Event Routing: RabbitMQ routes the event to the orchestrator.subscriber.app.installed queue
- Orchestrator Receives: Event handler extracts AppName from the JSON payload
- Container Deployment:
  - Pulls Docker image from registry
  - Creates container with environment variables
  - Starts container and waits for health check
- App Connects: Container starts, app connects back to AppServer
Event Subscriptions
The Orchestrator subscribes to these events from RabbitMQ:
app.installed
Triggered when an app is installed via the Marketplace:
- Resolve app dependencies
- Pull Docker image from registry
- Create and configure container
- Start container
- Wait for health check to pass
app.uninstalled
Triggered when an app is uninstalled:
- Stop running container
- Remove container
- Clean up resources
Environment Variables for Containers
When the Orchestrator deploys a container, it passes environment variables from its own environment plus auto-generated app-specific variables.
Environment Variable Mapping
The SDK expects specific variable names. The Orchestrator maps APPSERVER_* variables to the expected names:
| Orchestrator Env | Container Env | Description |
|---|---|---|
| APPSERVER_DB_HOST | DB_HOST | Database host |
| APPSERVER_DB_PORT | DB_PORT | Database port |
| APPSERVER_DB_NAME | DB_NAME | Database name |
| APPSERVER_DB_USER | DB_USER | Database username |
| APPSERVER_DB_PASSWORD | DB_PASSWORD | Database password |
| APPSERVER_DB_SSLMODE | DB_SSL_MODE | Database SSL mode |
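The mapping in the table is not a plain prefix strip: APPSERVER_DB_SSLMODE becomes DB_SSL_MODE, so an explicit lookup table is the safer way to express it. The sketch below assumes a hypothetical helper named mapDBEnv and is not the contents of env_builder.go.

```go
package main

import (
	"fmt"
	"sort"
)

// dbEnvMapping reproduces the table above: orchestrator name -> container name.
var dbEnvMapping = map[string]string{
	"APPSERVER_DB_HOST":     "DB_HOST",
	"APPSERVER_DB_PORT":     "DB_PORT",
	"APPSERVER_DB_NAME":     "DB_NAME",
	"APPSERVER_DB_USER":     "DB_USER",
	"APPSERVER_DB_PASSWORD": "DB_PASSWORD",
	"APPSERVER_DB_SSLMODE":  "DB_SSL_MODE", // irregular: SSLMODE -> SSL_MODE
}

// mapDBEnv returns the container-side variables for every mapped variable
// present in the orchestrator environment; unmapped names are ignored.
func mapDBEnv(orchestratorEnv map[string]string) map[string]string {
	out := make(map[string]string)
	for src, dst := range dbEnvMapping {
		if value, ok := orchestratorEnv[src]; ok {
			out[dst] = value
		}
	}
	return out
}

func main() {
	env := map[string]string{
		"APPSERVER_DB_HOST":    "postgres",
		"APPSERVER_DB_SSLMODE": "disable",
		"POSTGRES_PASSWORD":    "not-forwarded",
	}
	mapped := mapDBEnv(env)

	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(mapped))
	for k := range mapped {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, mapped[k])
	}
}
```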
Auto-Generated Variables
The Orchestrator automatically generates these variables for each container:
# App identity
APP_ID=<uuid> # Unique app identifier
APP_NAME=de.easy-m.statistics # App name
# AppServer connection
APPSERVER_HTTP_URL=http://appserver:8080
APPSERVER_GRPC_ADDRESS=appserver:9091
APPSERVER_GRAPHQL_URL=http://appserver:8080/graphql
APPSERVER_GRAPHQL_HTTP=http://appserver:8080/graphql # SDK expects this
# App upstream URL (where the app should listen)
UPSTREAM_DE_EASY_M_STATISTICS=http://app-de.easy-m.statistics:3000
# Default port
PORT=3000
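The upstream variable name above appears to be derived from the app name by upper-casing it and replacing dots and hyphens with underscores. That rule is inferred from the single example in this guide, so treat the sketch below as an assumption. The helper names upstreamVarName and upstreamURL are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// upstreamVarName derives UPSTREAM_<NAME> from an app name, assuming that
// "." and "-" both become "_" (inferred from the example in this guide).
func upstreamVarName(appName string) string {
	replacer := strings.NewReplacer(".", "_", "-", "_")
	return "UPSTREAM_" + strings.ToUpper(replacer.Replace(appName))
}

// upstreamURL builds the container URL using the app-<name> host pattern
// shown in this guide.
func upstreamURL(appName string, port int) string {
	return fmt.Sprintf("http://app-%s:%d", appName, port)
}

func main() {
	name := "de.easy-m.statistics"
	fmt.Printf("%s=%s\n", upstreamVarName(name), upstreamURL(name, 3000))
}
```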
Production vs Development Mode
The NODE_ENV variable controls production/development behavior:
# In orchestrator environment
NODE_ENV=production # Apps skip asset building, use pre-built assets
NODE_ENV=development # Apps attempt to build assets (may fail in containers)
Important: Set NODE_ENV=production in the orchestrator environment for deployed containers. Development mode attempts to run build tools that aren't available in production Docker images.
Configuring NODE_ENV
Add to your docker-compose.yml orchestrator service:
orchestrator:
environment:
NODE_ENV: production # Critical for production deployments
# ... other variables
Security: Filtered Variables
For security, only specific prefixes are passed to containers:
Allowed Prefixes:
- APPSERVER_DB_* - Scoped database access
- APPSERVER_REDIS_* - Scoped Redis access
- APPSERVER_EVENTBUS_* - Event bus access
- APPSERVER_HTTP_URL / APPSERVER_GRPC_ADDRESS - AppServer endpoints
Blocked (Never Forwarded):
- POSTGRES_* - Direct database credentials
- REDIS_* - Direct Redis credentials
- KRATOS_* / HYDRA_* - Direct auth service access
This ensures third-party apps cannot access infrastructure directly—they must go through the AppServer's scoped APIs.
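An allow-list is enough to express this rule: anything not explicitly allowed, including the blocked prefixes, is dropped. The sketch below covers only the prefixes and names listed in this section. The function names isForwarded and filterEnv are hypothetical, and the real Orchestrator may forward additional variables such as NODE_ENV.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPrefixes and allowedExact mirror the "Allowed Prefixes" list.
var allowedPrefixes = []string{"APPSERVER_DB_", "APPSERVER_REDIS_", "APPSERVER_EVENTBUS_"}

var allowedExact = map[string]bool{
	"APPSERVER_HTTP_URL":     true,
	"APPSERVER_GRPC_ADDRESS": true,
}

// isForwarded reports whether a variable name passes the allow-list.
func isForwarded(name string) bool {
	for _, prefix := range allowedPrefixes {
		if strings.HasPrefix(name, prefix) {
			return true
		}
	}
	return allowedExact[name]
}

// filterEnv keeps only allowed KEY=VALUE entries, preserving their order.
func filterEnv(environ []string) []string {
	var out []string
	for _, entry := range environ {
		name, _, ok := strings.Cut(entry, "=")
		if ok && isForwarded(name) {
			out = append(out, entry)
		}
	}
	return out
}

func main() {
	environ := []string{
		"APPSERVER_DB_HOST=postgres",
		"POSTGRES_PASSWORD=secret",
		"REDIS_URL=redis://redis:6379",
		"APPSERVER_HTTP_URL=http://appserver:8080",
		"KRATOS_ADMIN_URL=http://kratos:4434",
	}
	for _, entry := range filterEnv(environ) {
		fmt.Println(entry)
	}
}
```

Because the filter matches on the full APPSERVER_ prefix, REDIS_* does not slip through even though APPSERVER_REDIS_* is allowed.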
Container Lifecycle
┌────────────┐     install     ┌─────────────┐
│  Pending   │ ───────────────▶│   Pulling   │
└────────────┘                 └──────┬──────┘
                                      │
                                      ▼
                               ┌─────────────┐
                               │  Creating   │
                               └──────┬──────┘
                                      │
                                      ▼
┌────────────┐    unhealthy    ┌─────────────┐
│ Restarting │ ◀───────────────│   Running   │
└──────┬─────┘                 └──────┬──────┘
       │                              │
       │           healthy            │ uninstall
       └──────────────────────────────┼──────────▶ Stopped
                                      │
                                      ▼
                               ┌─────────────┐
                               │   Stopped   │
                               └─────────────┘
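One reading of the lifecycle diagram can be written as a transition table. The diagram only labels the install, unhealthy, healthy, and uninstall edges, so the event names pulled and created below are placeholders. Sending a healthy Restarting container back to Running is an interpretation of the diagram, not a documented rule.

```go
package main

import "fmt"

// State and Event are illustrative types for this sketch.
type State string
type Event string

const (
	Pending    State = "Pending"
	Pulling    State = "Pulling"
	Creating   State = "Creating"
	Running    State = "Running"
	Restarting State = "Restarting"
	Stopped    State = "Stopped"
)

// transitions encodes the edges of the diagram; "pulled" and "created"
// are placeholder names for the two unlabeled edges.
var transitions = map[State]map[Event]State{
	Pending:    {"install": Pulling},
	Pulling:    {"pulled": Creating},
	Creating:   {"created": Running},
	Running:    {"unhealthy": Restarting, "uninstall": Stopped},
	Restarting: {"healthy": Running},
}

// next returns the state reached from s on event e, or an error when the
// diagram has no such edge.
func next(s State, e Event) (State, error) {
	if to, ok := transitions[s][e]; ok {
		return to, nil
	}
	return s, fmt.Errorf("no transition from %s on %q", s, e)
}

func main() {
	state := Pending
	for _, event := range []Event{"install", "pulled", "created", "unhealthy", "healthy", "uninstall"} {
		to, err := next(state, event)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s --%s--> %s\n", state, event, to)
		state = to
	}
}
```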
Docker Deployment
orchestrator:
image: registry.eacore6.de/orchestrator:latest
environment:
- APPSERVER_DOCKER_ENABLED=true
- APPSERVER_EVENTBUS_ENABLED=true
- APPSERVER_RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/
- APPSERVER_DOCKER_SOCKET_PATH=/var/run/docker.sock
- APPSERVER_DOCKER_NETWORK_NAME=appserver-network
- APPSERVER_DOCKER_REGISTRY_URL=registry.eacore6.de
- APPSERVER_DOCKER_REGISTRY_USERNAME=${REGISTRY_USERNAME}
- APPSERVER_DOCKER_REGISTRY_PASSWORD=${REGISTRY_PASSWORD}
volumes:
# Mount Docker socket for container management
- /var/run/docker.sock:/var/run/docker.sock:rw
depends_on:
rabbitmq:
condition: service_healthy
appserver:
condition: service_healthy
networks:
- appserver-network
Security Considerations
Docker Socket Access
The Orchestrator requires write access to the Docker socket, which grants significant privileges:
- Can create/stop/remove any container
- Can pull images from registries
- Can access container logs and exec into containers
Best Practices:
- Run the Orchestrator on a dedicated host
- Use Docker socket proxy for restricted access
- Monitor Orchestrator logs for anomalies
- Use read-only volumes where possible
Container Isolation
Deployed app containers are isolated:
- Each app runs in its own container
- Containers connect to appserver-network
- Resource limits prevent runaway usage
- Health checks ensure containers are responsive
Monitoring
Logs
# View orchestrator logs
docker logs orchestrator -f
# Filter for specific app
docker logs orchestrator | grep "app_name=todos"
Metrics
The Orchestrator exposes metrics for:
- Container start/stop counts
- Pull durations
- Health check results
- Restart counts
Events
Monitor RabbitMQ queues:
- app.installed - Installation requests
- app.uninstalled - Uninstallation requests
- Dead-letter queue for failed operations
Troubleshooting
Orchestrator Won't Start
Error: EventBus must be enabled
# Ensure RabbitMQ is enabled
APPSERVER_EVENTBUS_ENABLED=true
APPSERVER_RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/
Error: Docker orchestration must be enabled
# Enable Docker orchestration
APPSERVER_DOCKER_ENABLED=true
Error: Cannot connect to Docker daemon
# Check Docker socket exists and is accessible
ls -la /var/run/docker.sock
# Ensure volume mount is correct
volumes:
- /var/run/docker.sock:/var/run/docker.sock:rw
Container Pull Failures
Error: repository not found
- Verify image name and tag
- Check registry credentials
- Ensure registry is accessible from the host
Error: timeout pulling image
- Increase APPSERVER_DOCKER_PULL_TIMEOUT
- Check network connectivity to registry
- Verify registry is not rate-limiting
Container Won't Start
Error: port already in use
- Check for conflicting containers
- Verify port mappings are unique
Error: out of memory
- Increase APPSERVER_DOCKER_DEFAULT_MEMORY_MB
- Free up host memory
- Check for memory leaks in apps
Health Check Failures
Symptom: Container keeps restarting
- Check app health endpoint is responding
- Verify health check configuration
- Increase APPSERVER_DOCKER_HEALTH_CHECK_RETRIES
- Check container logs for errors:
docker logs <container_name>
Poisoned Messages / Infinite Retry Loop
Symptom: Orchestrator logs show the same event being processed repeatedly with errors
This can happen when a malformed event is published to RabbitMQ and the orchestrator cannot process it, causing infinite redelivery.
Detection:
# Check queue message stats via RabbitMQ Management API
curl -s -u guest:guest "http://localhost:15672/api/queues/%2f/orchestrator.subscriber.app.installed" | \
jq '{messages: .messages, redeliver: .message_stats.redeliver}'
# High redeliver count indicates poisoned messages
# Example output showing problem: {"messages": 1, "redeliver": 6000000}
Resolution - Purge the queue:
# CAUTION: This deletes ALL messages in the queue
curl -X DELETE -u guest:guest \
"http://localhost:15672/api/queues/%2f/orchestrator.subscriber.app.installed/contents"
Prevention:
- Ensure events are published with correct payload format
- The app.installed event must include AppName in the JSON payload: {"AppName": "de.easy-m.statistics", "AppID": "uuid-here"}
- Both AppName (PascalCase) and app_name (snake_case) are supported
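A minimal sketch of payload decoding that accepts both key spellings and reports a clear error for malformed events. The struct and function names are hypothetical, and the Orchestrator's real handler may differ. A handler built this way can reject an undecodable message without requeueing it, which is one common way to avoid the redelivery loop described above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// installedPayload is an illustrative shape for the app.installed event.
type installedPayload struct {
	AppName      string `json:"AppName"`
	AppNameSnake string `json:"app_name"`
	AppID        string `json:"AppID"`
}

// appNameFromPayload extracts the app name, preferring AppName and
// falling back to app_name.
func appNameFromPayload(body []byte) (string, error) {
	var payload installedPayload
	if err := json.Unmarshal(body, &payload); err != nil {
		return "", fmt.Errorf("malformed payload: %w", err)
	}
	if payload.AppName != "" {
		return payload.AppName, nil
	}
	if payload.AppNameSnake != "" {
		return payload.AppNameSnake, nil
	}
	return "", errors.New("payload has no AppName or app_name")
}

func main() {
	bodies := []string{
		`{"AppName": "de.easy-m.statistics", "AppID": "uuid-here"}`,
		`{"app_name": "todos"}`,
		`{"AppID": "uuid-here"}`,
	}
	for _, body := range bodies {
		name, err := appNameFromPayload([]byte(body))
		if err != nil {
			fmt.Println("rejecting event:", err)
			continue
		}
		fmt.Println("deploying", name)
	}
}
```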
Stale Deployments
Symptom: App shows as "installed" in database but container doesn't exist
This can happen if:
- Container was manually deleted
- Docker daemon restarted and container was not set to restart
- Host system rebooted
Detection:
# Check deployment state in database
docker exec -it postgres psql -U partner -d partner -c \
"SELECT a.name, d.state, d.container_id, d.health_status
FROM appserver.deployments d
JOIN appserver.apps a ON d.app_id = a.id
WHERE d.state = 'healthy';"
# Verify containers actually exist
docker ps --filter "name=app-" --format "{{.Names}}"
Auto-Recovery: The orchestrator automatically handles stale deployments:
- On startup, it reconciles all installed apps
- When processing install events, it verifies container existence
- If a deployment shows "healthy" but container doesn't exist, it:
- Marks the deployment as failed
- Triggers a fresh deployment
Manual Recovery: If auto-recovery doesn't work:
# 1. Mark deployment as failed in database
docker exec -it postgres psql -U partner -d partner -c \
"UPDATE appserver.deployments SET state = 'failed'
WHERE container_id = 'missing-container-id';"
# 2. Trigger reinstall via GraphQL
curl -X POST http://localhost:8080/graphql \
-H "Content-Type: application/json" \
-H "Cookie: ory_kratos_session=<session>" \
-d '{"query": "mutation { installApp(name: \"de.easy-m.statistics\") { id } }"}'
Event Not Received by Orchestrator
Symptom: installApp mutation succeeds but orchestrator doesn't deploy container
Check 1: Verify AppServer is publishing events
# Check appserver logs for event publishing
docker logs appserver | grep "app.installed"
Check 2: Verify RabbitMQ bindings
# List orchestrator queue bindings
curl -s -u guest:guest \
"http://localhost:15672/api/exchanges/%2f/appserver.events/bindings/source" | \
jq '.[] | select(.destination | contains("orchestrator"))'
# Expected: binding with routing_key "app.installed"
Check 3: Verify queue separation
The orchestrator MUST use a different queue prefix than the appserver. Without this, both services share the same queue and RabbitMQ round-robins messages between them.
# List all subscriber queues
curl -s -u guest:guest "http://localhost:15672/api/queues" | \
jq '.[].name | select(contains("subscriber"))'
# Expected output should include BOTH:
# - "appserver.subscriber.app.installed"
# - "orchestrator.subscriber.app.installed"
If only appserver.subscriber.* queues exist, check pkg/v2/orchestrator/services.go:
rabbitmqConfig.SubscriberPrefix = "orchestrator.subscriber." // REQUIRED
Check 4: Verify orchestrator is connected
# Check RabbitMQ consumers
curl -s -u guest:guest \
"http://localhost:15672/api/queues/%2f/orchestrator.subscriber.app.installed" | \
jq '.consumer_details'
# Should show at least one consumer
App Container Configuration Errors
Error: database configuration is required
- The SDK expects DB_HOST, not APPSERVER_DB_HOST
- The orchestrator should automatically map these variables
- Verify env_builder.go includes the DB mappings
Error: AppServerURL is required
- The SDK expects the APPSERVER_GRAPHQL_HTTP environment variable
- Verify this is set in container environment:
docker inspect app-de.easy-m.statistics | jq '.[0].Config.Env'
Error: NODE_ENV=development causing build failures
- Production containers should run with NODE_ENV=production
- Add to orchestrator environment in docker-compose:
NODE_ENV: production
Scaling Considerations
Multiple Orchestrators
For high availability, you can run multiple Orchestrator instances:
- Use competing consumers on RabbitMQ
- Each instance handles different events
- Ensure container operations are idempotent
Distributed Docker Hosts
For scaling across multiple machines:
- Use Docker Swarm or Kubernetes instead
- Or run one Orchestrator per Docker host
- Coordinate via shared RabbitMQ
Related Topics
- AppServer Guide - Main backend service
- Docker Infrastructure - Full stack setup
- Environment Reference - All environment variables