Story 1.3: Health Monitoring and Metrics System
Metadata
- Story ID: 1.3
- Title: Health Monitoring and Metrics System
- Epic: 1 - Core Kernel & Infrastructure
- Status: Completed
- Priority: High
- Estimated Time: 5-6 hours
- Dependencies: 1.1, 1.2
Goal
Implement comprehensive health checks and Prometheus metrics for monitoring platform health and performance.
Description
This story adds a health monitoring system with liveness and readiness probes, plus Prometheus metrics that track HTTP requests, database queries, and errors.
Deliverables
1. Health Check System
- HealthChecker Interface (pkg/health/health.go):
  - HealthChecker interface with Check(ctx context.Context) error method
  - Health status types
- Health Registry (internal/health/registry.go):
  - Thread-safe registry of health checkers
  - Register multiple health checkers
  - Aggregate health status
  - GET /healthz endpoint (liveness probe)
  - GET /ready endpoint (readiness probe with database check)
  - Individual component health checks
2. Prometheus Metrics System
- Metrics Registry (internal/metrics/metrics.go):
  - Prometheus registry setup
  - HTTP request duration histogram
  - HTTP request counter (by method, path, status code)
  - Database query duration histogram (via Ent interceptor)
  - Error counter (by type)
  - Custom metrics support
- Metrics Endpoint:
  - GET /metrics endpoint (Prometheus format)
  - Proper content type headers
3. Database Health Check
- Database connectivity check
- Connection pool status
- Query execution test
4. Integration
- Integration with HTTP server
- Integration with DI container
- Middleware for automatic metrics collection
Implementation Steps
1. Install Dependencies
   - go get github.com/prometheus/client_golang/prometheus
2. Create Health Check Interface
   - Create pkg/health/health.go
   - Define HealthChecker interface
3. Implement Health Registry
   - Create internal/health/registry.go
   - Implement registry and endpoints
4. Create Metrics System
   - Create internal/metrics/metrics.go
   - Define all metrics
   - Create registry
5. Add Database Health Check
   - Implement database health checker
   - Register with health registry
6. Integrate with HTTP Server
   - Add health endpoints
   - Add metrics endpoint
   - Add metrics middleware
7. Integrate with DI
   - Create provider functions
   - Register in container
Acceptance Criteria
- /healthz returns 200 when service is alive
- /ready checks database connectivity and returns appropriate status
- /metrics exposes Prometheus metrics in correct format
- All HTTP requests are measured
- Database queries are instrumented
- Metrics are registered in DI container
- Health checks can be extended by modules
- Metrics follow Prometheus naming conventions
Related ADRs
Implementation Notes
- Use Prometheus client library
- Follow Prometheus naming conventions
- Health checks should be fast (< 1 second)
- Metrics should have appropriate labels
- Consider adding custom business metrics in future
Testing
# Test health endpoints
curl http://localhost:8080/healthz
curl http://localhost:8080/ready
# Test metrics endpoint
curl http://localhost:8080/metrics
# Test metrics collection
go test ./internal/metrics/...
Files to Create/Modify
- pkg/health/health.go - Health checker interface
- internal/health/registry.go - Health registry
- internal/metrics/metrics.go - Metrics system
- internal/server/server.go - Add endpoints
- internal/di/providers.go - Add providers