Story 6.1: Enhanced Observability

Metadata

  • Story ID: 6.1
  • Title: Enhanced Observability
  • Epic: 6 - Observability & Production Readiness
  • Status: Pending
  • Priority: High
  • Estimated Time: 6-8 hours
  • Dependencies: 1.6, 5.2, 5.1

Goal

Complete the OpenTelemetry integration, expand Prometheus metrics coverage, and improve logging with request correlation.

Description

This story completes the OpenTelemetry integration across all infrastructure components (database, Kafka, Redis), expands the set of Prometheus metrics and their labels, and adds correlation and structured fields to log output.

Deliverables

1. Complete OpenTelemetry Integration

  • Export traces to Jaeger/OTLP collector
  • Add database instrumentation (Ent interceptor)
  • Add Kafka instrumentation
  • Add Redis instrumentation
  • Create custom spans:
    • Module initialization spans
    • Background job spans
    • Event publishing spans
  • Trace context propagation:
    • Include trace ID in logs
    • Propagate across HTTP calls
    • Include in error reports

2. Prometheus Metrics Expansion

  • Add metrics for:
    • Database connection pool stats
    • Cache hit/miss ratio
    • Event bus publish/consume rates
    • Background job execution times
    • Module-specific metrics (via module interface)
  • Create metric labels:
    • module label for module metrics
    • tenant_id label (if multi-tenant)
    • status label for error rates
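
As an illustration of the cache hit/miss metric and the `module` and `tenant_id` labels, the sketch below keeps hits and misses as two separate labeled counters and derives the ratio from them, which mirrors the usual Prometheus practice of exporting raw counters and computing ratios at query time. It uses only the standard library as a stand-in for the Prometheus client; `Labels`, `LabeledCounter`, and `HitRatio` are hypothetical names, and `billing` and `t1` are made-up label values.

```go
package main

import (
	"fmt"
	"sync"
)

// Labels is a hypothetical label set using two of the labels named in this story.
type Labels struct {
	Module   string
	TenantID string
}

// LabeledCounter is a minimal stand-in for a Prometheus counter vector:
// one monotonically increasing value per distinct label combination.
type LabeledCounter struct {
	mu     sync.Mutex
	values map[Labels]uint64
}

// NewLabeledCounter returns an empty counter.
func NewLabeledCounter() *LabeledCounter {
	return &LabeledCounter{values: make(map[Labels]uint64)}
}

// Inc adds one to the series identified by l.
func (c *LabeledCounter) Inc(l Labels) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[l]++
}

// Get returns the current value of the series identified by l (0 if unseen).
func (c *LabeledCounter) Get(l Labels) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.values[l]
}

// HitRatio returns hits / (hits + misses) for l, or 0 when there were no lookups.
func HitRatio(hits, misses *LabeledCounter, l Labels) float64 {
	h, m := hits.Get(l), misses.Get(l)
	if h+m == 0 {
		return 0
	}
	return float64(h) / float64(h+m)
}

func main() {
	hits, misses := NewLabeledCounter(), NewLabeledCounter()
	l := Labels{Module: "billing", TenantID: "t1"}

	// Three cache hits and one miss for the same module and tenant.
	for i := 0; i < 3; i++ {
		hits.Inc(l)
	}
	misses.Inc(l)

	fmt.Printf("hit ratio for %s/%s: %.2f\n", l.Module, l.TenantID, HitRatio(hits, misses, l))
}
```

Every distinct label combination becomes its own time series, so a high-cardinality label such as `tenant_id` multiplies the number of series; that is one reason the story marks it as conditional on multi-tenancy.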

3. Enhanced Logging

  • Add structured fields:
    • user_id from context
    • tenant_id from context
    • module name for module logs
    • trace_id from OpenTelemetry
  • Create log aggregation config:
    • JSON format for production
    • Human-readable for development
    • Support for Loki/CloudWatch/ELK

Acceptance Criteria

  • Traces are exported and visible in Jaeger
  • All infrastructure components are instrumented
  • Trace IDs are included in logs
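  • New metrics are exposed with the module, tenant_id (if multi-tenant), and status labels
  • Logs include all correlation fields (user_id, tenant_id, module, trace_id)
  • Logs are emitted as JSON in production and in human-readable form in development, and can be ingested by Loki/CloudWatch/ELK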

Files to Create/Modify

  • internal/observability/tracer.go - Enhanced tracing
  • internal/infra/database/client.go - Add tracing
  • internal/infra/cache/redis_cache.go - Add tracing
  • internal/infra/bus/kafka_bus.go - Add tracing
  • internal/metrics/metrics.go - Expanded metrics
  • internal/logger/zap_logger.go - Enhanced logging