# IncidentOps
A full-stack on-call and incident management platform.
## Environment Configuration
| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | Postgres connection string | — |
| `REDIS_URL` | Legacy Redis endpoint; also used as the broker when no override is supplied | `redis://localhost:6379/0` |
| `TASK_QUEUE_DRIVER` | Task queue implementation (`celery` or `inmemory`) | `celery` |
| `TASK_QUEUE_BROKER_URL` | Celery broker URL (falls back to `REDIS_URL` when unset) | `None` |
| `TASK_QUEUE_BACKEND` | Celery transport semantics (`redis` or `sqs`) | `redis` |
| `TASK_QUEUE_DEFAULT_QUEUE` | Queue used for fan-out + notification deliveries | `default` |
| `TASK_QUEUE_CRITICAL_QUEUE` | Queue used for escalation + delayed work | `critical` |
| `TASK_QUEUE_VISIBILITY_TIMEOUT` | Visibility timeout passed to `sqs` transport | `600` |
| `TASK_QUEUE_POLLING_INTERVAL` | Polling interval for `sqs` transport (seconds) | `1.0` |
| `NOTIFICATION_ESCALATION_DELAY_SECONDS` | Delay before re-checking unacknowledged incidents | `900` |
| `AWS_REGION` | Region used when `TASK_QUEUE_BACKEND=sqs` | `None` |
| `JWT_SECRET_KEY` | Symmetric JWT signing key | — |
| `JWT_ALGORITHM` | JWT algorithm | `HS256` |
| `JWT_ISSUER` | JWT issuer claim | `incidentops` |
| `JWT_AUDIENCE` | JWT audience claim | `incidentops-api` |
### Task Queue Modes
- **Development / Tests:** Set `TASK_QUEUE_DRIVER=inmemory` to bypass Celery entirely (this is the default for local pytest runs). The API enqueues events into an in-memory recorder, and the worker code remains importable.
- **Celery + Redis:** Set `TASK_QUEUE_DRIVER=celery` and either leave `TASK_QUEUE_BROKER_URL` unset (so that `REDIS_URL` is used) or point it at another Redis endpoint. This is the default production-style configuration.
- **Celery + Amazon SQS:** Set `TASK_QUEUE_BROKER_URL=sqs://` (Celery automatically discovers credentials), set `TASK_QUEUE_BACKEND=sqs`, and configure `AWS_REGION`. Optional tuning is available through `TASK_QUEUE_VISIBILITY_TIMEOUT` and `TASK_QUEUE_POLLING_INTERVAL`.
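
The way these modes combine with the defaults and fallbacks in the table can be sketched in Python. This is a minimal illustration, not the project's actual settings code: the function name `resolve_task_queue` and the shape of the returned dictionary are hypothetical, and the transport option keys are assumptions chosen for readability.

```python
import os
from typing import Mapping, Optional


def resolve_task_queue(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: resolve task queue settings from environment variables.

    Applies the defaults and fallbacks listed in the configuration table.
    """
    if env is None:
        env = os.environ

    driver = env.get("TASK_QUEUE_DRIVER", "celery")
    if driver == "inmemory":
        # In-memory mode bypasses Celery, so no broker is needed.
        return {"driver": driver, "broker_url": None, "transport_options": {}}
    if driver != "celery":
        raise ValueError(f"unsupported TASK_QUEUE_DRIVER: {driver!r}")

    backend = env.get("TASK_QUEUE_BACKEND", "redis")
    redis_url = env.get("REDIS_URL", "redis://localhost:6379/0")
    # TASK_QUEUE_BROKER_URL falls back to REDIS_URL when unset or empty.
    broker_url = env.get("TASK_QUEUE_BROKER_URL") or redis_url

    transport_options: dict = {}
    if backend == "sqs":
        # Tuning values only apply to the sqs transport (key names are illustrative).
        transport_options = {
            "region": env.get("AWS_REGION"),
            "visibility_timeout": int(env.get("TASK_QUEUE_VISIBILITY_TIMEOUT", "600")),
            "polling_interval": float(env.get("TASK_QUEUE_POLLING_INTERVAL", "1.0")),
        }

    return {
        "driver": driver,
        "backend": backend,
        "broker_url": broker_url,
        "transport_options": transport_options,
    }
```

With an empty environment the sketch resolves to the Celery + Redis mode with `redis://localhost:6379/0` as the broker; passing `{"TASK_QUEUE_DRIVER": "inmemory"}` returns a configuration with no broker at all.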
### Running the Worker
The worker automatically discovers tasks under `worker/tasks`. Use the same environment variables as the API:
```
uv run celery -A worker.celery_app worker --loglevel=info
```
## Setup
### Docker Compose
```
docker compose up --build -d
```
### K8S with Skaffold and Helm
```
# Create a local cluster
kind create cluster --name incidentops
# Install infrastructure only (for testing)
helm install incidentops helm/incidentops -n incidentops --create-namespace \
  --set migration.enabled=false \
  --set api.replicaCount=0 \
  --set worker.replicaCount=0 \
  --set web.replicaCount=0
# Full install (requires building app images first)
helm install incidentops helm/incidentops -n incidentops --create-namespace
# Development deployment (rebuilds and redeploys on changes)
skaffold dev
# One-time deployment
skaffold run
# Production deployment
skaffold run -p production
```