# Story 1.3: Health Monitoring and Metrics System
## Metadata
- **Story ID**: 1.3
- **Title**: Health Monitoring and Metrics System
- **Epic**: 1 - Core Kernel & Infrastructure
- **Status**: Pending
- **Priority**: High
- **Estimated Time**: 5-6 hours
- **Dependencies**: 1.1, 1.2
## Goal
Implement comprehensive health checks and Prometheus metrics for monitoring platform health and performance.
## Description
This story creates a complete health monitoring system with liveness and readiness probes, and a comprehensive Prometheus metrics system for tracking HTTP requests, database queries, and errors.
## Deliverables
### 1. Health Check System
- **HealthChecker Interface** (`pkg/health/health.go`):
  - `HealthChecker` interface with `Check(ctx context.Context) error` method
  - Health status types
- **Health Registry** (`internal/health/registry.go`):
  - Thread-safe registry of health checkers
  - Register multiple health checkers
  - Aggregate health status
  - `GET /healthz` endpoint (liveness probe)
  - `GET /ready` endpoint (readiness probe with database check)
  - Individual component health checks
### 2. Prometheus Metrics System
- **Metrics Registry** (`internal/metrics/metrics.go`):
  - Prometheus registry setup
  - HTTP request duration histogram
  - HTTP request counter (by method, path, status code)
  - Database query duration histogram (via Ent interceptor)
  - Error counter (by type)
  - Custom metrics support
- **Metrics Endpoint**:
  - `GET /metrics` endpoint (Prometheus format)
  - Proper content type headers
### 3. Database Health Check
- Database connectivity check
- Connection pool status
- Query execution test
### 4. Integration
- Integration with HTTP server
- Integration with DI container
- Middleware for automatic metrics collection
## Implementation Steps
1. **Install Dependencies**
   ```bash
   go get github.com/prometheus/client_golang/prometheus
   ```
2. **Create Health Check Interface**
   - Create `pkg/health/health.go`
   - Define HealthChecker interface
3. **Implement Health Registry**
   - Create `internal/health/registry.go`
   - Implement registry and endpoints
4. **Create Metrics System**
   - Create `internal/metrics/metrics.go`
   - Define all metrics
   - Create registry
5. **Add Database Health Check**
   - Implement database health checker
   - Register with health registry
6. **Integrate with HTTP Server**
   - Add health endpoints
   - Add metrics endpoint
   - Add metrics middleware
7. **Integrate with DI**
   - Create provider functions
   - Register in container
## Acceptance Criteria
- [ ] `/healthz` returns 200 when service is alive
- [ ] `/ready` checks database connectivity and returns appropriate status
- [ ] `/metrics` exposes Prometheus metrics in correct format
- [ ] All HTTP requests are measured
- [ ] Database queries are instrumented
- [ ] Metrics are registered in DI container
- [ ] Health checks can be extended by modules
- [ ] Metrics follow Prometheus naming conventions
## Related ADRs
- [ADR-0014: Health Check Implementation](../../adr/0014-health-check-implementation.md)
## Implementation Notes
- Use Prometheus client library
- Follow Prometheus naming conventions
- Health checks should be fast (< 1 second)
- Metrics should have appropriate labels
- Consider adding custom business metrics in future
## Testing
```bash
# Test health endpoints
curl http://localhost:8080/healthz
curl http://localhost:8080/ready
# Test metrics endpoint
curl http://localhost:8080/metrics
# Test metrics collection
go test ./internal/metrics/...
```
## Files to Create/Modify
- `pkg/health/health.go` - Health checker interface
- `internal/health/registry.go` - Health registry
- `internal/metrics/metrics.go` - Metrics system
- `internal/server/server.go` - Add endpoints
- `internal/di/providers.go` - Add providers