docs: add mkdocs, update links, add architecture documentation
docs/content/adr/0016-opentelemetry-observability.md
@@ -0,0 +1,56 @@
# ADR-0016: OpenTelemetry Observability Strategy

## Status
Accepted

## Context
The platform needs distributed tracing and observability for:
- Request tracing across services/modules
- Performance monitoring
- Debugging production issues
- Integration with observability tools (Jaeger, Grafana, etc.)

Options considered:
1. **OpenTelemetry** - Industry standard, vendor-neutral
2. **Zipkin** - Older standard, less ecosystem support
3. **Custom tracing** - Build our own
4. **No tracing** - Only logs and metrics

## Decision
Use **OpenTelemetry (OTEL)** for all observability:

1. **Tracing**: Distributed tracing with spans
2. **Metrics**: Prometheus-compatible metrics
3. **Logs**: Structured logs with trace correlation
4. **Export**: OTLP collector for production, stdout for development

**Rationale:**
- Industry standard, vendor-neutral
- Excellent Go SDK support
- Integrates with major observability tools
- Supports metrics, traces, and logs
- Recommended in playbook-golang.md
- Future-proof (not locked to specific vendor)
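
The "structured logs with trace correlation" item in the decision can be illustrated with a minimal sketch. It uses only the Go standard library so that it runs without dependencies; in the real service the trace ID would more likely come from the OpenTelemetry span context (for example `trace.SpanContextFromContext(ctx).TraceID()`), not from hand-parsing the header. The helper names (`parseTraceParent`, `withTraceID`, `traceIDFrom`, `logWithTrace`) and the `trace_id` field name are assumptions made for this example, not part of the codebase.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"os"
	"strings"
)

// traceIDKey is a hypothetical context key used only in this sketch.
type traceIDKey struct{}

// parseTraceParent extracts the trace ID from a W3C traceparent header,
// which has the form "version-traceid-spanid-flags". Only field count and
// field lengths are checked here; a real parser would also validate hex
// digits and reject an all-zero trace ID.
func parseTraceParent(header string) (string, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 {
		return "", errors.New("malformed traceparent header")
	}
	return parts[1], nil
}

// withTraceID stores the trace ID in the context.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// traceIDFrom returns the stored trace ID, or "" when none is present.
func traceIDFrom(ctx context.Context) string {
	id, _ := ctx.Value(traceIDKey{}).(string)
	return id
}

// logWithTrace emits a structured log record and attaches trace_id
// whenever the context carries one.
func logWithTrace(ctx context.Context, logger *slog.Logger, msg string, args ...any) {
	if id := traceIDFrom(ctx); id != "" {
		args = append(args, slog.String("trace_id", id))
	}
	logger.InfoContext(ctx, msg, args...)
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	id, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		logger.Error("invalid traceparent", "error", err)
		return
	}

	ctx := withTraceID(context.Background(), id)
	logWithTrace(ctx, logger, "order created", "order_id", 42)
}
```

With a shared `trace_id` field on every log record, a log line can be matched to the corresponding trace in a backend such as Jaeger or Grafana.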

## Consequences

### Positive
- Vendor-neutral (can switch backends)
- Rich ecosystem and tooling
- Excellent Go SDK
- Supports all observability signals

### Negative
- Learning curve for OpenTelemetry concepts
- Slight overhead (minimal with sampling)
- Requires OTLP collector or compatible backend

### Implementation Notes
- Install: `go.opentelemetry.io/otel` and contrib packages
- Initialize TracerProvider in `internal/observability/tracer.go`
- Use HTTP instrumentation middleware: `otelhttp.NewHandler()`
- Add database instrumentation via Ent interceptor
- Export to stdout for development, OTLP for production
- Include trace ID in structured logs
- Configure sampling for production (e.g., 10% or adaptive)
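
To make the sampling note more concrete, the sketch below shows the idea behind ratio-based sampling: the keep/drop decision is derived from the trace ID itself, so every service that sees the same trace reaches the same verdict. It is a standard-library illustration of the concept, not the SDK's implementation; in practice the Go SDK's `sdktrace.TraceIDRatioBased` sampler (usually wrapped in `sdktrace.ParentBased`) would be configured on the TracerProvider instead. The function name `shouldSample` is hypothetical.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// shouldSample reports whether a trace should be kept at the given ratio
// (0.10 means roughly 10% of traces). The decision depends only on the
// trace ID, so it is deterministic across services.
func shouldSample(traceID [16]byte, ratio float64) bool {
	if ratio >= 1 {
		return true
	}
	if ratio <= 0 {
		return false
	}
	// Treat the low 8 bytes as a uniformly distributed number and drop one
	// bit so the value fits in 63 bits, then compare against the threshold.
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	bound := uint64(ratio * (1 << 63))
	return x < bound
}

func main() {
	for _, s := range []string{
		"4bf92f3577b34da6a3ce929d0e0e4736",
		"4bf92f3577b34da60000000000000001",
	} {
		var id [16]byte
		if _, err := hex.Decode(id[:], []byte(s)); err != nil {
			fmt.Println("bad trace ID:", err)
			continue
		}
		fmt.Printf("%s sampled at 10%%: %v\n", s, shouldSample(id, 0.10))
	}
}
```

A fixed ratio like this is the simplest option; the "adaptive" alternative mentioned above would adjust the ratio at runtime and is not covered by this sketch.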