# ADR-0014: Health Check Implementation

## Status

Accepted

## Context

The platform needs health check endpoints for:

- Kubernetes liveness probes (`/healthz`)
- Kubernetes readiness probes (`/ready`)
- Monitoring and alerting
- Load balancer health checks

Health checks should:

- Be fast and lightweight
- Check critical dependencies (database, cache, etc.)
- Provide clear status indicators

## Decision

Implement a **custom health check registry** with composable checkers:

1. **Liveness endpoint** (`/healthz`): Always returns 200 if the process is running
2. **Readiness endpoint** (`/ready`): Runs all registered health checkers
3. **Health check interface**: `type HealthChecker interface { Check(ctx context.Context) error }`
4. **Registry pattern**: Modules can register additional health checkers

**Rationale:**

- A custom implementation gives full control
- The composable design allows modules to add checks
- The simple interface is easy to test
- No external dependency is needed for basic functionality
- Can be extended with Prometheus metrics later

## Consequences

### Positive

- Lightweight and fast
- Extensible by modules
- Easy to test
- Clear separation of liveness vs readiness

### Negative

- We need to implement it ourselves (though the implementation is simple)
- We must maintain the registry

### Implementation Notes

- Define the interface in `pkg/health/health.go`
- Implement the registry in `internal/health/registry.go`, backed by a checker map
- Register core checkers: database, cache (if enabled)
- Add endpoints to the HTTP router
- Return a JSON response: `{"status": "ok", "checks": {...}}`
- Consider a timeout (e.g., 5 seconds) for readiness checks
