Add OpenTelemetry instrumentation with distributed tracing and metrics:
- Structured JSON logging with trace context correlation
- Auto-instrumentation for FastAPI, asyncpg, httpx, redis
- OTLP exporter for traces and Prometheus metrics endpoint

Implement Celery worker and notification task system:
- Celery app with Redis/SQS broker support and configurable queues
- Notification tasks for incident fan-out, webhooks, and escalations
- Pluggable TaskQueue abstraction with in-memory driver for testing

Add Grafana observability stack (Loki, Tempo, Prometheus, Grafana):
- OpenTelemetry Collector for receiving OTLP traces and logs
- Tempo for distributed tracing backend
- Loki for log aggregation with Promtail DaemonSet
- Prometheus for metrics scraping with RBAC configuration
- Grafana with pre-provisioned datasources and API overview dashboard
- Helm templates for all observability components

Enhance application infrastructure:
- Global exception handlers with structured ErrorResponse schema
- Request logging middleware with timing metrics
- Health check updated to verify task queue connectivity
- Non-root user in Dockerfile for security
- Init containers in Helm deployments for dependency ordering
- Production Helm values with autoscaling and retention policies
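
The commit message mentions a pluggable TaskQueue abstraction with an in-memory driver for testing, and a health check that verifies task queue connectivity. The code for that abstraction is not shown here, so the following is only a minimal sketch of what such an interface could look like. The names `EnqueuedTask`, `InMemoryTaskQueue`, `enqueue`, and `ping`, as well as the `incident_id` argument, are assumptions made for illustration and may not match the project's actual code.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class EnqueuedTask:
    """Record of one enqueued task (hypothetical shape, for illustration)."""

    name: str
    queue: str
    kwargs: dict[str, Any]


class TaskQueue(ABC):
    """Hypothetical driver interface; the project's real abstraction may differ."""

    @abstractmethod
    def enqueue(self, name: str, *, queue: str = "default", **kwargs: Any) -> None:
        """Schedule the task called `name` on `queue` with keyword arguments."""

    @abstractmethod
    def ping(self) -> bool:
        """Return True if the backing broker is reachable (for health checks)."""


class InMemoryTaskQueue(TaskQueue):
    """Test driver: records tasks in a list instead of contacting a broker."""

    def __init__(self) -> None:
        self.tasks: list[EnqueuedTask] = []

    def enqueue(self, name: str, *, queue: str = "default", **kwargs: Any) -> None:
        self.tasks.append(EnqueuedTask(name=name, queue=queue, kwargs=kwargs))

    def ping(self) -> bool:
        # Nothing external to reach, so the in-memory driver is always healthy.
        return True


# Usage: a test can assert on what was enqueued without running a worker.
task_queue = InMemoryTaskQueue()
task_queue.enqueue(
    "worker.tasks.notifications.escalate_if_unacked",
    queue="critical",
    incident_id="inc-1",  # hypothetical argument
)
```

A Celery-backed driver would implement the same two methods by delegating to the Celery app, which keeps application code independent of the broker in use.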
44 lines
1.1 KiB
Python
"""Celery application configured for IncidentOps."""

from __future__ import annotations

from celery import Celery
from kombu import Queue

from app.config import settings


celery_app = Celery("incidentops")


celery_app.conf.update(
    broker_url=settings.resolved_task_queue_broker_url,
    task_default_queue=settings.task_queue_default_queue,
    task_queues=(
        Queue(settings.task_queue_default_queue),
        Queue(settings.task_queue_critical_queue),
    ),
    task_routes={
        "worker.tasks.notifications.escalate_if_unacked": {
            "queue": settings.task_queue_critical_queue
        },
    },
    task_serializer="json",
    accept_content=["json"],
    timezone="UTC",
    enable_utc=True,
)

if settings.task_queue_backend == "sqs":
    celery_app.conf.broker_transport_options = {
        "region": settings.aws_region or "us-east-1",
        "visibility_timeout": settings.task_queue_visibility_timeout,
        "polling_interval": settings.task_queue_polling_interval,
    }


celery_app.autodiscover_tasks(["worker.tasks"])


__all__ = ["celery_app"]
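
The `task_routes` setting above sends `escalate_if_unacked` to the critical queue, while every other task falls back to `task_default_queue`. The sketch below illustrates that lookup with plain dictionaries. It is a simplified stand-in and not Celery's router, which also accepts glob and regex patterns. The queue names `"default"` and `"critical"` and the task name `send_webhook` are placeholders assumed for the example; the real values come from `settings`.

```python
from __future__ import annotations

# Placeholder values; the real ones come from app.config.settings.
DEFAULT_QUEUE = "default"    # stands in for settings.task_queue_default_queue
CRITICAL_QUEUE = "critical"  # stands in for settings.task_queue_critical_queue

# Same shape as the task_routes mapping in the Celery configuration.
TASK_ROUTES: dict[str, dict[str, str]] = {
    "worker.tasks.notifications.escalate_if_unacked": {"queue": CRITICAL_QUEUE},
}


def resolve_queue(task_name: str) -> str:
    """Return the queue for a task: exact-name match, else the default queue."""
    route = TASK_ROUTES.get(task_name, {})
    return route.get("queue", DEFAULT_QUEUE)


escalation_queue = resolve_queue("worker.tasks.notifications.escalate_if_unacked")
# "send_webhook" is a hypothetical task name with no explicit route.
webhook_queue = resolve_queue("worker.tasks.notifications.send_webhook")
```

Keeping only the time-sensitive escalation task on a dedicated queue lets a separate worker pool consume it, so a backlog of ordinary notifications does not delay escalations.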