# IncidentOps

A full-stack on-call and incident management platform.

## Environment Configuration
| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | Postgres connection string | — |
| `REDIS_URL` | Legacy Redis endpoint, also used if no broker override is supplied | `redis://localhost:6379/0` |
| `TASK_QUEUE_DRIVER` | Task queue implementation (`celery` or `inmemory`) | `celery` |
| `TASK_QUEUE_BROKER_URL` | Celery broker URL (falls back to `REDIS_URL` when unset) | None |
| `TASK_QUEUE_BACKEND` | Celery transport semantics (`redis` or `sqs`) | `redis` |
| `TASK_QUEUE_DEFAULT_QUEUE` | Queue used for fan-out and notification deliveries | `default` |
| `TASK_QUEUE_CRITICAL_QUEUE` | Queue used for escalation and delayed work | `critical` |
| `TASK_QUEUE_VISIBILITY_TIMEOUT` | Visibility timeout passed to the SQS transport | 600 |
| `TASK_QUEUE_POLLING_INTERVAL` | Polling interval for the SQS transport (seconds) | 1.0 |
| `NOTIFICATION_ESCALATION_DELAY_SECONDS` | Delay before re-checking unacknowledged incidents | 900 |
| `AWS_REGION` | Region used when `TASK_QUEUE_BACKEND=sqs` | None |
| `JWT_SECRET_KEY` | Symmetric JWT signing key | — |
| `JWT_ALGORITHM` | JWT algorithm | HS256 |
| `JWT_ISSUER` | JWT issuer claim | `incidentops` |
| `JWT_AUDIENCE` | JWT audience claim | `incidentops-api` |
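The JWT settings above work together: tokens are HMAC-signed with `JWT_SECRET_KEY` and carry the configured issuer and audience claims. As a minimal sketch of what an HS256 token with those defaults looks like, using only the standard library (the helper names here are illustrative, not part of the codebase):

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, per RFC 7515."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(secret: str, subject: str, ttl: int = 3600) -> str:
    """Build a compact HS256 JWT carrying the default issuer/audience claims."""
    header = {"alg": "HS256", "typ": "JWT"}  # JWT_ALGORITHM default
    now = int(time.time())
    payload = {
        "iss": "incidentops",      # JWT_ISSUER default
        "aud": "incidentops-api",  # JWT_AUDIENCE default
        "sub": subject,
        "iat": now,
        "exp": now + ttl,
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


token = sign_jwt("dev-secret", subject="user-123")
print(token.count("."))  # a compact JWT has three dot-separated segments → prints 2
```

In practice a JWT library handles this; the sketch only shows which claims the configuration above maps onto.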
## Task Queue Modes

- **Development / Tests** – Set `TASK_QUEUE_DRIVER=inmemory` to bypass Celery entirely (the default for local pytest). The API will enqueue events into an in-memory recorder while the worker code remains importable.
- **Celery + Redis** – Set `TASK_QUEUE_DRIVER=celery` and either leave `TASK_QUEUE_BROKER_URL` unset (relying on `REDIS_URL`) or point it to another Redis endpoint. This is the default production-style configuration.
- **Celery + Amazon SQS** – Provide `TASK_QUEUE_BROKER_URL=sqs://` (Celery automatically discovers credentials), set `TASK_QUEUE_BACKEND=sqs`, and configure `AWS_REGION`. Optional tuning is available via the visibility timeout and polling interval variables above.
## Running the Worker

The worker automatically discovers tasks under `worker/tasks`. Use the same environment variables as the API:

```bash
uv run celery -A worker.celery_app worker --loglevel=info
```
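Among the worker's delayed jobs is the escalation re-check driven by `NOTIFICATION_ESCALATION_DELAY_SECONDS`. A minimal sketch of that scheduling decision, with illustrative types and names (the real logic lives in the worker tasks):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# NOTIFICATION_ESCALATION_DELAY_SECONDS default from the table above
ESCALATION_DELAY = timedelta(seconds=900)


@dataclass
class Incident:
    id: str
    created_at: datetime
    acknowledged: bool = False


def needs_escalation(incident: Incident, now: datetime) -> bool:
    """An unacknowledged incident escalates once the delay has elapsed."""
    return not incident.acknowledged and now - incident.created_at >= ESCALATION_DELAY


now = datetime.now(timezone.utc)
stale = Incident("inc-1", created_at=now - timedelta(seconds=901))
fresh = Incident("inc-2", created_at=now)
print(needs_escalation(stale, now), needs_escalation(fresh, now))  # → True False
```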
## Setup

### Docker Compose

```bash
docker compose up --build -d
```

### K8s with Skaffold and Helm

```bash
# Create a cluster
kind create cluster --name incidentops

# Iterative development (rebuilds and redeploys on change)
skaffold dev

# One-time deployment
skaffold run

# Production deployment
skaffold run -p production
```

Helm can also be used directly:

```bash
# Install infrastructure only (for testing)
helm install incidentops helm/incidentops -n incidentops --create-namespace \
  --set migration.enabled=false \
  --set api.replicaCount=0 \
  --set worker.replicaCount=0 \
  --set web.replicaCount=0

# Full install (requires building app images first)
helm install incidentops helm/incidentops -n incidentops --create-namespace
```
## Accessing Dashboards

When running with `skaffold dev`, the following dashboards are port-forwarded automatically:
| Dashboard | URL | Description |
|---|---|---|
| OpenAPI (Swagger) | http://localhost:8000/docs | Interactive API documentation |
| OpenAPI (ReDoc) | http://localhost:8000/redoc | Alternative API docs |
| Grafana | http://localhost:3001 | Metrics, logs, and traces |
| Prometheus | http://localhost:9090 | Raw metrics queries |
| Tempo | http://localhost:3200 | Distributed tracing backend |
| Loki | http://localhost:3100 | Log aggregation backend |
Grafana comes pre-configured with datasources for Prometheus, Loki, and Tempo.