# IncidentOps

A full-stack on-call and incident management platform.

## Environment Configuration
| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | Postgres connection string | — |
| `REDIS_URL` | Legacy Redis endpoint, also used if no broker override is supplied | `redis://localhost:6379/0` |
| `TASK_QUEUE_DRIVER` | Task queue implementation (`celery` or `inmemory`) | `celery` |
| `TASK_QUEUE_BROKER_URL` | Celery broker URL (falls back to `REDIS_URL` when unset) | None |
| `TASK_QUEUE_BACKEND` | Celery transport semantics (`redis` or `sqs`) | `redis` |
| `TASK_QUEUE_DEFAULT_QUEUE` | Queue used for fan-out and notification deliveries | `default` |
| `TASK_QUEUE_CRITICAL_QUEUE` | Queue used for escalation and delayed work | `critical` |
| `TASK_QUEUE_VISIBILITY_TIMEOUT` | Visibility timeout passed to the SQS transport | 600 |
| `TASK_QUEUE_POLLING_INTERVAL` | Polling interval for the SQS transport (seconds) | 1.0 |
| `NOTIFICATION_ESCALATION_DELAY_SECONDS` | Delay before re-checking unacknowledged incidents | 900 |
| `AWS_REGION` | Region used when `TASK_QUEUE_BACKEND=sqs` | None |
| `JWT_SECRET_KEY` | Symmetric JWT signing key | — |
| `JWT_ALGORITHM` | JWT algorithm | HS256 |
| `JWT_ISSUER` | JWT issuer claim | `incidentops` |
| `JWT_AUDIENCE` | JWT audience claim | `incidentops-api` |
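The JWT settings above work together: tokens are HMAC-signed with `JWT_SECRET_KEY` and carry the configured issuer and audience claims. As a minimal sketch of what an HS256 token with those defaults looks like, using only the standard library (the helper names here are illustrative, not part of the codebase):

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, per RFC 7515."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(secret: str, subject: str, ttl: int = 3600) -> str:
    """Build a compact HS256 JWT carrying the default issuer/audience claims."""
    header = {"alg": "HS256", "typ": "JWT"}  # JWT_ALGORITHM default
    now = int(time.time())
    payload = {
        "iss": "incidentops",      # JWT_ISSUER default
        "aud": "incidentops-api",  # JWT_AUDIENCE default
        "sub": subject,
        "iat": now,
        "exp": now + ttl,
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


token = sign_jwt("dev-secret", subject="user-123")
print(token.count("."))  # a compact JWT has three dot-separated segments → prints 2
```

In practice a JWT library handles this; the sketch only shows which claims the configuration above maps onto.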
## Task Queue Modes

- **Development / Tests** – Set `TASK_QUEUE_DRIVER=inmemory` to bypass Celery entirely (the default for local pytest). The API will enqueue events into an in-memory recorder while the worker code remains importable.
- **Celery + Redis** – Set `TASK_QUEUE_DRIVER=celery` and either leave `TASK_QUEUE_BROKER_URL` unset (relying on `REDIS_URL`) or point it to another Redis endpoint. This is the default production-style configuration.
- **Celery + Amazon SQS** – Provide `TASK_QUEUE_BROKER_URL=sqs://` (Celery automatically discovers credentials), set `TASK_QUEUE_BACKEND=sqs`, and configure `AWS_REGION`. Optional tuning is available via the visibility timeout and polling interval variables above.
## Running the Worker

The worker automatically discovers tasks under `worker/tasks`. Use the same environment variables as the API:

```bash
uv run celery -A worker.celery_app worker --loglevel=info
```
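Among the worker's delayed jobs is the escalation re-check driven by `NOTIFICATION_ESCALATION_DELAY_SECONDS`. A minimal sketch of that scheduling decision, with illustrative types and names (the real logic lives in the worker tasks):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# NOTIFICATION_ESCALATION_DELAY_SECONDS default from the table above
ESCALATION_DELAY = timedelta(seconds=900)


@dataclass
class Incident:
    id: str
    created_at: datetime
    acknowledged: bool = False


def needs_escalation(incident: Incident, now: datetime) -> bool:
    """An unacknowledged incident escalates once the delay has elapsed."""
    return not incident.acknowledged and now - incident.created_at >= ESCALATION_DELAY


now = datetime.now(timezone.utc)
stale = Incident("inc-1", created_at=now - timedelta(seconds=901))
fresh = Incident("inc-2", created_at=now)
print(needs_escalation(stale, now), needs_escalation(fresh, now))  # → True False
```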
## Setup

### Docker Compose

```bash
docker compose up --build -d
```

### K8s with Skaffold and Helm

```bash
# Create a cluster
kind create cluster --name incidentops

# Iterative development (rebuilds and redeploys on change)
skaffold dev

# One-time deployment
skaffold run

# Production deployment
skaffold run -p production
```

Helm can also be used directly:

```bash
# Install infrastructure only (for testing)
helm install incidentops helm/incidentops -n incidentops --create-namespace \
  --set migration.enabled=false \
  --set api.replicaCount=0 \
  --set worker.replicaCount=0 \
  --set web.replicaCount=0

# Full install (requires building app images first)
helm install incidentops helm/incidentops -n incidentops --create-namespace
```
## Accessing Dashboards

When running with `skaffold dev`, the following dashboards are port-forwarded automatically:
| Dashboard | URL | Description |
|---|---|---|
| OpenAPI (Swagger) | http://localhost:8000/docs | Interactive API documentation |
| OpenAPI (ReDoc) | http://localhost:8000/redoc | Alternative API docs |
| Grafana | http://localhost:3001 | Metrics, logs, and traces |
| Prometheus | http://localhost:9090 | Raw metrics queries |
| Tempo | http://localhost:3200 | Distributed tracing backend |
| Loki | http://localhost:3100 | Log aggregation backend |
Grafana comes pre-configured with datasources for Prometheus, Loki, and Tempo.