
IncidentOps

A full-stack on-call & incident management platform

Environment Configuration

| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | Postgres connection string | |
| REDIS_URL | Legacy Redis endpoint, also used if no broker override is supplied | redis://localhost:6379/0 |
| TASK_QUEUE_DRIVER | Task queue implementation (celery or inmemory) | celery |
| TASK_QUEUE_BROKER_URL | Celery broker URL (falls back to REDIS_URL when unset) | None |
| TASK_QUEUE_BACKEND | Celery transport semantics (redis or sqs) | redis |
| TASK_QUEUE_DEFAULT_QUEUE | Queue used for fan-out + notification deliveries | default |
| TASK_QUEUE_CRITICAL_QUEUE | Queue used for escalation + delayed work | critical |
| TASK_QUEUE_VISIBILITY_TIMEOUT | Visibility timeout passed to the SQS transport (seconds) | 600 |
| TASK_QUEUE_POLLING_INTERVAL | Polling interval for the SQS transport (seconds) | 1.0 |
| NOTIFICATION_ESCALATION_DELAY_SECONDS | Delay before re-checking unacknowledged incidents | 900 |
| AWS_REGION | Region used when TASK_QUEUE_BACKEND=sqs | None |
| JWT_SECRET_KEY | Symmetric JWT signing key | |
| JWT_ALGORITHM | JWT algorithm | HS256 |
| JWT_ISSUER | JWT issuer claim | incidentops |
| JWT_AUDIENCE | JWT audience claim | incidentops-api |
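
Only DATABASE_URL and JWT_SECRET_KEY ship without defaults. A minimal local sketch of those two (the connection-string shape and credentials are placeholders for whatever your local Postgres uses, and the secret must be replaced with a real random value):

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/incidentops
JWT_SECRET_KEY=replace-with-a-long-random-string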

Task Queue Modes

  • Development / Tests: Set TASK_QUEUE_DRIVER=inmemory to bypass Celery entirely (the default for local pytest). The API enqueues events into an in-memory recorder while the worker code remains importable.
  • Celery + Redis: Set TASK_QUEUE_DRIVER=celery and either leave TASK_QUEUE_BROKER_URL unset (relying on REDIS_URL) or point it at another Redis endpoint. This is the default production-style configuration.
  • Celery + Amazon SQS: Provide TASK_QUEUE_BROKER_URL=sqs:// (Celery discovers AWS credentials automatically), set TASK_QUEUE_BACKEND=sqs, and configure AWS_REGION. The visibility-timeout and polling-interval variables above offer optional tuning. Example settings for each mode are sketched after this list.
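
The same variables, grouped per mode. Values are illustrative; AWS_REGION in particular is a placeholder, not a project default:

# Development / tests: no broker needed
TASK_QUEUE_DRIVER=inmemory

# Celery + Redis: broker falls back to REDIS_URL
TASK_QUEUE_DRIVER=celery
REDIS_URL=redis://localhost:6379/0

# Celery + Amazon SQS
TASK_QUEUE_DRIVER=celery
TASK_QUEUE_BROKER_URL=sqs://
TASK_QUEUE_BACKEND=sqs
AWS_REGION=us-east-1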

Running the Worker

The worker automatically discovers tasks under worker/tasks. Use the same environment variables as the API:

uv run celery -A worker.celery_app worker --loglevel=info
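
To dedicate a worker to the escalation/delayed-work queue, Celery's standard -Q/--queues flag applies here as well; the queue name below assumes the TASK_QUEUE_CRITICAL_QUEUE default from the table above:

uv run celery -A worker.celery_app worker -Q critical --loglevel=info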

Setup

Docker Compose

docker compose up --build -d
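
To confirm the stack is healthy and tail a service's logs (the api service name is an assumption about the compose file; substitute your own):

docker compose ps
docker compose logs -f api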

K8S with Skaffold and Helm

# Create a local cluster
kind create cluster --name incidentops

# Iterative development: build, deploy, and redeploy on file changes
skaffold dev

# One-time deployment
skaffold run

# Production deployment
skaffold run -p production

# Helm directly: infrastructure only (for testing)
helm install incidentops helm/incidentops -n incidentops --create-namespace \
  --set migration.enabled=false \
  --set api.replicaCount=0 \
  --set worker.replicaCount=0 \
  --set web.replicaCount=0

# Helm directly: full install (requires building app images first)
helm install incidentops helm/incidentops -n incidentops --create-namespace
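
Whichever route you take, the rollout can be checked with plain kubectl (the namespace matches the Helm commands above):

kubectl get pods -n incidentops
kubectl get svc -n incidentops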

Accessing Dashboards

When running with skaffold dev, the following dashboards are port-forwarded automatically:

| Dashboard | URL | Description |
|---|---|---|
| OpenAPI (Swagger) | http://localhost:8000/docs | Interactive API documentation |
| OpenAPI (ReDoc) | http://localhost:8000/redoc | Alternative API docs |
| Grafana | http://localhost:3001 | Metrics, logs, and traces |
| Prometheus | http://localhost:9090 | Raw metrics queries |
| Tempo | http://localhost:3200 | Distributed tracing backend |
| Loki | http://localhost:3100 | Log aggregation backend |

Grafana comes pre-configured with datasources for Prometheus, Loki, and Tempo.
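
Outside of skaffold dev, the dashboards can be reached with a manual port-forward; the Grafana service name and container port below are assumptions about the chart, so adjust them to match your kubectl get svc output:

kubectl port-forward -n incidentops svc/grafana 3001:3000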
