docs: add mkdocs, update links, add architecture documentation

2025-11-05 07:44:21 +01:00
parent 6a17236474
commit 54a047f5dc
351 changed files with 3482 additions and 10 deletions

# ADR-0001: Go Module Path
## Status
Accepted
## Context
The project needs a Go module path that uniquely identifies the platform. This path will be used:
- In `go.mod` file
- For importing packages within the project
- For module dependencies
- For future module publishing
## Decision
Use `git.dcentral.systems/toolz/goplt` as the Go module path.
**Rationale:**
- Matches the organization's Git hosting structure
- Follows Go module naming conventions
- Clearly identifies the project as a Go platform tool
- Prevents naming conflicts with other modules
## Consequences
### Positive
- Clear, descriptive module path
- Aligns with organization's infrastructure
- Easy to identify in dependency graphs
### Negative
- Requires access to `git.dcentral.systems` for module resolution
- May need to configure GOPRIVATE/GONOPROXY if using private registry
### Implementation Notes
- Initialize module: `go mod init git.dcentral.systems/toolz/goplt`
- Update all import paths in code to use this module path
- Configure `~/.gitconfig` (e.g. an `insteadOf` rewrite) or Go environment variables (`GOPRIVATE=git.dcentral.systems`) if needed for private module access

# ADR-0002: Go Version
## Status
Accepted
## Context
Go releases new versions regularly with new features, performance improvements, and security fixes. We need to choose a Go version that:
- Provides necessary features for the platform
- Has good ecosystem support
- Is stable and production-ready
- Supports required tooling (plugins, etc.)
## Decision
Use **Go 1.24.3** as the minimum required version for the platform.
**Rationale:**
- Latest stable version available
- Provides all required features for the platform
- Ensures compatibility with modern Go tooling
- Supports all planned features (modules, plugins, generics)
## Consequences
### Positive
- Access to latest Go features and performance improvements
- Better security with latest patches
- Modern tooling support
### Negative
- Requires developers to have Go 1.24.3+ installed
- CI/CD must use compatible Go version
- May limit compatibility with some older dependencies (if any)
### Implementation Notes
- Specify in `go.mod`: `go 1.24.3` (the `go` directive accepts patch versions since Go 1.21, matching the stated minimum)
- Document in `README.md` and CI configuration
- Update `.github/workflows/ci.yml` to use `actions/setup-go@v5` with version `1.24.3`
- Add version check script if needed

# ADR-0003: Dependency Injection Framework
## Status
Accepted
## Context
The platform requires dependency injection to:
- Manage service lifecycle
- Wire dependencies between components
- Support module system initialization
- Handle graceful shutdown
- Provide testability through dependency substitution
Options considered:
1. **uber-go/fx** - Runtime dependency injection with lifecycle management
2. **uber-go/dig** - Reflection-based dependency injection container (fx builds on it)
3. **Manual constructor injection** - No framework, explicit wiring
## Decision
Use **uber-go/fx** (v1.23.0+) as the dependency injection framework.
**Rationale:**
- Provides lifecycle management (OnStart/OnStop hooks) crucial for services
- Supports module-based architecture through fx.Option composition
- Runtime dependency resolution; the dependency graph is validated at startup, so wiring errors fail fast
- Excellent for modular monolith architecture
- Well-documented and actively maintained
- Used by major Go projects (Uber, etc.)
## Consequences
### Positive
- Clean lifecycle management for services
- Easy module composition via fx.Option
- Graceful shutdown handling built-in
- Test-friendly with fx.Options for test overrides
### Negative
- Runtime reflection overhead (minimal)
- Learning curve for developers unfamiliar with fx
- Slightly more complex error messages on dependency resolution failures
### Implementation Notes
- Install: `go get go.uber.org/fx@v1.23.0`
- Create `internal/di/container.go` with fx.New()
- Use fx.Provide() for service registration
- Use fx.Invoke() for initialization tasks
- Leverage fx.Lifecycle for service startup/shutdown

# ADR-0004: Configuration Management Library
## Status
Accepted
## Context
The platform needs a configuration system that:
- Supports hierarchical configuration (defaults → files → env → secrets)
- Handles multiple formats (YAML, JSON, env vars)
- Provides type-safe access to configuration values
- Supports environment-specific overrides
- Can integrate with secret managers (future)
Options considered:
1. **spf13/viper** - Comprehensive configuration management
2. **envconfig** - Environment variable only
3. **koanf** - Lightweight configuration library
4. **Standard library + manual parsing** - No external dependency
## Decision
Use **spf13/viper** (v1.18.0+) with **spf13/cobra** (v1.8.0+) for configuration management.
**Rationale:**
- Industry standard for Go configuration management
- Supports multiple sources (files, env vars, flags)
- Hierarchical configuration with precedence rules
- Easy integration with Cobra for CLI commands
- Well-documented and widely used
- Supports future secret manager integration
## Consequences
### Positive
- Flexible configuration loading from multiple sources
- Easy to add new configuration sources
- Type-safe access methods
- Environment variable support via automatic env binding
### Negative
- Additional dependency
- Viper can be verbose for simple use cases
- Some learning curve for advanced features
### Implementation Notes
- Install: `go get github.com/spf13/viper@v1.18.0` and `github.com/spf13/cobra@v1.8.0`
- Create `pkg/config/config.go` interface to abstract Viper
- Implement `internal/config/viper_config.go` as concrete implementation
- Load order: `default.yaml` → `development.yaml`/`production.yaml` → env vars → secrets (future)
- Use typed getters (GetString, GetInt, GetBool) for type safety

# ADR-0005: Logging Framework
## Status
Accepted
## Context
The platform requires structured logging that:
- Supports multiple log levels
- Provides structured output (JSON for production)
- Allows adding contextual fields
- Performs well under load
- Integrates with observability tools
Options considered:
1. **go.uber.org/zap** - High-performance structured logging
2. **rs/zerolog** - Zero-allocation logger
3. **sirupsen/logrus** - Structured logger (maintenance mode)
4. **Standard library log** - Basic logging (insufficient)
## Decision
Use **go.uber.org/zap** (v1.26.0+) as the logging framework.
**Rationale:**
- Industry standard for high-performance Go applications
- Excellent structured logging with field support
- Very low overhead (designed for high-throughput systems)
- JSON output for production, human-readable for development
- Strong ecosystem integration
- Actively maintained by Uber
## Consequences
### Positive
- High performance (low latency, high throughput)
- Rich structured logging with fields
- Easy integration with observability tools
- Configurable output formats (JSON/console)
### Negative
- Slightly more verbose API than standard library
- Requires wrapping for common use cases (we'll abstract via interface)
### Implementation Notes
- Install: `go get go.uber.org/zap@v1.26.0`
- Create `pkg/logger/logger.go` interface to abstract zap
- Implement `internal/logger/zap_logger.go` as concrete implementation
- Use JSON encoder for production, console encoder for development
- Support request-scoped fields via context
- Export global logger via `pkg/logger` package

# ADR-0006: HTTP Framework
## Status
Accepted
## Context
The platform needs an HTTP framework for:
- REST API endpoints
- Middleware support (auth, logging, metrics)
- Request/response handling
- Route registration from modules
- Integration with observability tools
Options considered:
1. **gin-gonic/gin** - Fast, feature-rich HTTP web framework
2. **gorilla/mux** - Lightweight router
3. **go-chi/chi** - Lightweight, idiomatic router
4. **net/http** (standard library) - No external dependency
## Decision
Use **gin-gonic/gin** (v1.9.1+) as the HTTP framework.
**Rationale:**
- Fast performance (comparable to net/http)
- Rich middleware ecosystem
- Excellent for REST APIs
- Easy route grouping (useful for modules)
- Good OpenTelemetry integration support
- Widely used and well-documented
- Recommended in playbook-golang.md
## Consequences
### Positive
- High performance
- Easy middleware chaining
- Route grouping supports module architecture
- Good ecosystem support
### Negative
- Additional dependency (though lightweight)
- Slight learning curve for developers unfamiliar with Gin
### Implementation Notes
- Install: `go get github.com/gin-gonic/gin@v1.9.1`
- Create router in `internal/server/server.go`
- Use route groups for module isolation: `r.Group("/api/v1/blog")`
- Add middleware stack: logging, recovery, metrics, auth (later)
- Support graceful shutdown via fx lifecycle

# ADR-0007: Project Directory Structure
## Status
Accepted
## Context
The project needs a clear, scalable directory structure that:
- Follows Go best practices
- Separates public interfaces from implementations
- Supports modular architecture
- Is maintainable and discoverable
- Aligns with Go community standards
## Decision
Adopt a **standard Go project layout** with **internal/** and **pkg/** separation:
```
goplt/
├── cmd/
│ └── platform/ # Application entry point
├── internal/ # Private implementation code
│ ├── di/ # Dependency injection
│ ├── registry/ # Module registry
│ ├── pluginloader/ # Plugin loader (optional)
│ ├── config/ # Config implementation
│ ├── logger/ # Logger implementation
│ └── infra/ # Infrastructure adapters
├── pkg/ # Public interfaces (exported)
│ ├── config/ # ConfigProvider interface
│ ├── logger/ # Logger interface
│ ├── module/ # IModule interface
│ ├── auth/ # Auth interfaces (Phase 2)
│ ├── perm/ # Permission DSL (Phase 2)
│ └── infra/ # Infrastructure interfaces
├── modules/ # Feature modules
│ └── blog/ # Sample module (Phase 4)
├── config/ # Configuration files
│ ├── default.yaml
│ ├── development.yaml
│ └── production.yaml
├── api/ # OpenAPI specs
├── scripts/ # Build/test scripts
├── docs/ # Documentation
│ └── adr/ # Architecture Decision Records
├── ops/ # Operations (Grafana dashboards, etc.)
├── .github/
│ └── workflows/
│ └── ci.yml
├── Dockerfile
├── docker-compose.yml
├── docker-compose.test.yml
└── go.mod
```
**Rationale:**
- `internal/` prevents external packages from importing implementation details
- `pkg/` exposes only interfaces that modules need
- `cmd/` follows Go standard for application entry points
- `modules/` clearly separates feature modules
- `config/` centralizes configuration files
- Separates concerns and supports clean architecture
## Consequences
### Positive
- Clear separation of concerns
- Prevents circular dependencies
- Easy to navigate and understand
- Aligns with Go community standards
- Supports modular architecture
### Negative
- Slightly more directories than minimal structure
- Requires discipline to maintain boundaries
### Implementation Notes
- Initialize with `go mod init git.dcentral.systems/toolz/goplt`
- Create all directories upfront in Phase 0
- Document structure in `README.md`
- Enforce boundaries via `internal/` package visibility
- Use `go build ./...` to verify structure

# ADR-0008: Error Handling Strategy
## Status
Accepted
## Context
Go's error handling philosophy requires explicit error checking. We need a consistent approach for:
- Error creation and wrapping
- Error propagation
- Error classification (domain vs infrastructure)
- Error reporting (logging, monitoring)
- HTTP error responses
## Decision
Adopt a **wrapped error pattern** with **structured error types**:
1. **Error Wrapping**: Use `fmt.Errorf("context: %w", err)` for error wrapping
2. **Error Types**: Define custom error types for domain errors
3. **Error Classification**: Distinguish between:
- Domain errors (business logic failures)
- Infrastructure errors (external system failures)
- Validation errors (input validation failures)
4. **Error Context**: Always wrap errors with context about where they occurred
**Rationale:**
- Follows Go 1.13+ error wrapping best practices
- Enables error inspection with `errors.Is()` and `errors.As()`
- Maintains error chains for debugging
- Allows structured error handling
## Consequences
### Positive
- Full error traceability through call stack
- Can inspect and handle specific error types
- Better debugging with error context
- Aligns with Go best practices
### Negative
- Requires discipline to wrap errors consistently
- Can be verbose in some cases
### Implementation Notes
- Always wrap errors: `return nil, fmt.Errorf("failed to load config: %w", err)`
- Create error types for domain errors:
```go
type ConfigError struct {
    Key   string
    Cause error
}

func (e *ConfigError) Error() string {
    return fmt.Sprintf("config key %q: %v", e.Key, e.Cause)
}

func (e *ConfigError) Unwrap() error { return e.Cause }
```
- Use `errors.Is()` and `errors.As()` for error checking
- Log errors with context before returning
- Map domain errors to HTTP status codes in handlers

# ADR-0009: Context Key Types
## Status
Accepted
## Context
The platform will use `context.Context` to propagate request-scoped values such as:
- User ID (from authentication)
- Request ID (for tracing)
- Tenant ID (for multi-tenancy)
- Logger instance (with request-scoped fields)
Go best practices recommend using typed keys instead of string keys to avoid collisions.
## Decision
Use **typed context keys** for all context values:
```go
type contextKey string

const (
    userIDKey    contextKey = "user_id"
    requestIDKey contextKey = "request_id"
    tenantIDKey  contextKey = "tenant_id"
    loggerKey    contextKey = "logger"
)
```
**Rationale:**
- Prevents key collisions between packages
- Type-safe access to context values
- Aligns with Go best practices (see `context.WithValue` documentation)
- Makes context usage explicit and discoverable
## Consequences
### Positive
- Type-safe context access
- Prevents accidental key collisions
- Clear intent in code
- Better IDE support
### Negative
- Slightly more verbose than string keys
- Requires defining keys upfront
### Implementation Notes
- Create `pkg/context/keys.go` with all context key definitions
- Provide helper functions for setting/getting values:
```go
func WithUserID(ctx context.Context, userID string) context.Context
func UserIDFromContext(ctx context.Context) (string, bool)
```
- Use in middleware and services
- Document all context keys and their usage

# ADR-0010: CI/CD Platform
## Status
Accepted
## Context
The platform needs a CI/CD system for:
- Automated testing on pull requests
- Code quality checks (linting, formatting)
- Building binaries and Docker images
- Publishing artifacts
- Running integration tests
Options considered:
1. **GitHub Actions** - Native GitHub integration
2. **GitLab CI** - If using GitLab
3. **Jenkins** - Self-hosted option
4. **CircleCI** - Cloud-based CI/CD
## Decision
Use **GitHub Actions** for CI/CD pipeline.
**Rationale:**
- Native integration with GitHub repositories
- Free for public repos, reasonable for private
- Rich ecosystem of actions
- Easy to configure with YAML
- Good documentation and community support
- Recommended in playbook-golang.md
## Consequences
### Positive
- Easy setup and configuration
- Good GitHub integration
- Large action marketplace
- Free for public repositories
### Negative
- Tied to GitHub (if migrating Git hosts, need to migrate CI)
- Limited customization compared to self-hosted solutions
### Implementation Notes
- Create `.github/workflows/ci.yml`
- Use `actions/setup-go@v5` for Go setup
- Configure caching for Go modules
- Run: linting, unit tests, integration tests, build
- Use `actions/cache@v4` for module caching
- Add build matrix if needed for multiple Go versions (future)

# ADR-0011: Code Generation Tools
## Status
Accepted
## Context
The platform will use code generation for:
- Permission constants from module manifests
- Ent ORM code generation
- Mock generation for testing
- OpenAPI client/server code (future)
We need to decide on tooling and workflow.
## Decision
Use **standard Go generation tools** with `go generate`:
1. **Ent ORM**: `entgo.io/ent/cmd/ent` for schema code generation
2. **Mocks**: `github.com/vektra/mockery/v2` or `github.com/golang/mock/mockgen`
3. **Permissions**: Custom `scripts/generate-permissions.go`
4. **OpenAPI**: `github.com/deepmap/oapi-codegen` (future)
**Workflow:**
- Use `//go:generate` directives in source files
- Run `go generate ./...` before commits
- Document in `Makefile` with `make generate` target
- CI should verify generated code is up-to-date
**Rationale:**
- Standard Go tooling, well-supported
- `go generate` is the idiomatic way to run code generation
- Easy to integrate into CI/CD
- Reduces manual code maintenance
## Consequences
### Positive
- Automated code generation reduces errors
- Consistent code style
- Easy to maintain
- Standard Go workflow
### Negative
- Requires developers to run generation before commits
- Generated code must be committed (or verified in CI)
- Slight learning curve for new developers
### Implementation Notes
- Add `//go:generate` directives where needed
- Create `Makefile` target: `make generate`
- Add CI step to verify generated code: `go generate ./... && git diff --exit-code`
- Document in `CONTRIBUTING.md`

# ADR-0012: Logger Interface Design
## Status
Accepted
## Context
We're using zap for logging, but want to abstract it behind an interface for:
- Testability (mock logger in tests)
- Flexibility (could swap implementations)
- Module compatibility (modules use interface, not concrete type)
We need to decide on the interface design.
## Decision
Create a **simple logger interface** that matches zap's API pattern but uses generic types:
```go
type Field interface {
    // Field represents a key-value pair for structured logging
}

type Logger interface {
    Debug(msg string, fields ...Field)
    Info(msg string, fields ...Field)
    Warn(msg string, fields ...Field)
    Error(msg string, fields ...Field)
    With(fields ...Field) Logger
}
```
**Implementation:**
- Use `zap.Field` as the Field type (no abstraction needed for now)
- Provide helper functions in `pkg/logger` for creating fields:
```go
func String(key, value string) Field
func Int(key string, value int) Field
func Error(err error) Field
```
**Rationale:**
- Simple interface that modules can depend on
- Matches zap's usage patterns
- Easy to test with mock implementations
- Allows future swap if needed (though unlikely)
## Consequences
### Positive
- Clean abstraction for modules
- Testable with mocks
- Simple API for modules to use
### Negative
- Slight indirection overhead
- Need to maintain interface compatibility
### Implementation Notes
- Define interface in `pkg/logger/logger.go`
- Implement in `internal/logger/zap_logger.go`
- Export helper functions in `pkg/logger/fields.go`
- Modules import `pkg/logger`, not `internal/logger`

# ADR-0013: Database ORM Selection
## Status
Accepted
## Context
The platform needs a database ORM/library that:
- Supports PostgreSQL (primary database)
- Provides type-safe query building
- Supports code generation (reduces boilerplate)
- Handles migrations
- Supports relationships (many-to-many, etc.)
- Integrates with Ent (code generation)
Options considered:
1. **entgo.io/ent** - Code-generated, type-safe ORM
2. **gorm.io/gorm** - Feature-rich ORM with reflection
3. **sqlx** - Lightweight wrapper around database/sql
4. **Standard library database/sql** - No ORM, raw SQL
## Decision
Use **entgo.io/ent** as the primary ORM for the platform.
**Rationale:**
- Code generation provides compile-time type safety
- Excellent schema definition and migration support
- Strong relationship modeling
- Good performance (no reflection at runtime)
- Active development and good documentation
- Recommended in playbook-golang.md
- Easy to integrate with OpenTelemetry
## Consequences
### Positive
- Type-safe queries eliminate runtime errors
- Schema changes are explicit and versioned
- Code generation reduces boilerplate
- Good migration support
- Strong relationship support
### Negative
- Requires code generation step (`go generate`)
- Learning curve for developers unfamiliar with Ent
- Less flexible than raw SQL for complex queries
- Generated code must be committed or verified in CI
### Implementation Notes
- Install: `go get entgo.io/ent/cmd/ent`
- Initialize schemas: `go run entgo.io/ent/cmd/ent new User Role Permission` (`new` replaces the older `init` subcommand)
- Use `//go:generate` directives for code generation
- Run migrations on startup via `client.Schema.Create()`
- Create wrapper in `internal/infra/database/client.go` for DI injection

# ADR-0014: Health Check Implementation
## Status
Accepted
## Context
The platform needs health check endpoints for:
- Kubernetes liveness probes (`/healthz`)
- Kubernetes readiness probes (`/ready`)
- Monitoring and alerting
- Load balancer health checks
Health checks should be:
- Fast and lightweight
- Check critical dependencies (database, cache, etc.)
- Provide clear status indicators
## Decision
Implement **custom health check registry** with composable checkers:
1. **Liveness endpoint** (`/healthz`): Always returns 200 if process is running
2. **Readiness endpoint** (`/ready`): Checks all registered health checkers
3. **Health check interface**: `type HealthChecker interface { Check(ctx context.Context) error }`
4. **Registry pattern**: Modules can register additional health checkers
**Rationale:**
- Custom implementation gives full control
- Composable design allows modules to add checks
- Simple interface is easy to test
- No external dependency for basic functionality
- Can extend with Prometheus metrics later
## Consequences
### Positive
- Lightweight and fast
- Extensible by modules
- Easy to test
- Clear separation of liveness vs readiness
### Negative
- Need to implement ourselves (though simple)
- Must maintain the registry
### Implementation Notes
- Create `pkg/health/health.go` interface
- Implement `internal/health/registry.go` with checker map
- Register core checkers: database, cache (if enabled)
- Add endpoints to HTTP router
- Return JSON response: `{"status": "ok", "checks": {...}}`
- Consider timeout (e.g., 5 seconds) for readiness checks

# ADR-0015: Error Bus Implementation
## Status
Accepted
## Context
The platform needs a centralized error handling mechanism for:
- Capturing panics and errors
- Logging errors consistently
- Sending errors to external services (Sentry, etc.)
- Avoiding error handling duplication
Options considered:
1. **Channel-based in-process bus** - Simple, Go-idiomatic
2. **Event bus integration** - Use existing event bus
3. **Direct logging** - No bus, direct integration
4. **External service integration** - Direct to Sentry
## Decision
Implement a **channel-based error bus** with pluggable sinks:
1. **Error bus interface**: `type ErrorPublisher interface { Publish(err error) }`
2. **Channel-based implementation**: Background goroutine consumes errors from channel
3. **Pluggable sinks**: Logger (always), Sentry (optional, Phase 6)
4. **Panic recovery middleware**: Automatically publishes panics to error bus
**Rationale:**
- Simple, idiomatic Go pattern
- Non-blocking error publishing (buffered channel)
- Decouples error capture from error handling
- Easy to add new sinks (Sentry, logging, metrics)
- Can be extended to use event bus later if needed
## Consequences
### Positive
- Centralized error handling
- Non-blocking (doesn't slow down request path)
- Easy to extend with new sinks
- Consistent error handling across the platform
### Negative
- Additional goroutine overhead (minimal)
- Must ensure error bus doesn't become bottleneck
### Implementation Notes
- Create `pkg/errorbus/errorbus.go` interface
- Implement `internal/errorbus/channel_bus.go`:
- Buffered channel (e.g., size 100)
- Background goroutine consumes errors
- Multiple sinks (logger, optional Sentry)
- Add panic recovery middleware that publishes to bus
- Register in DI container as singleton
- Monitor channel size to detect error storms

# ADR-0016: OpenTelemetry Observability Strategy
## Status
Accepted
## Context
The platform needs distributed tracing and observability for:
- Request tracing across services/modules
- Performance monitoring
- Debugging production issues
- Integration with observability tools (Jaeger, Grafana, etc.)
Options considered:
1. **OpenTelemetry** - Industry standard, vendor-neutral
2. **Zipkin** - Older standard, less ecosystem support
3. **Custom tracing** - Build our own
4. **No tracing** - Only logs and metrics
## Decision
Use **OpenTelemetry (OTEL)** for all observability:
1. **Tracing**: Distributed tracing with spans
2. **Metrics**: Prometheus-compatible metrics
3. **Logs**: Structured logs with trace correlation
4. **Export**: OTLP collector for production, stdout for development
**Rationale:**
- Industry standard, vendor-neutral
- Excellent Go SDK support
- Integrates with major observability tools
- Supports metrics, traces, and logs
- Recommended in playbook-golang.md
- Future-proof (not locked to specific vendor)
## Consequences
### Positive
- Vendor-neutral (can switch backends)
- Rich ecosystem and tooling
- Excellent Go SDK
- Supports all observability signals
### Negative
- Learning curve for OpenTelemetry concepts
- Slight overhead (minimal with sampling)
- Requires OTLP collector or compatible backend
### Implementation Notes
- Install: `go.opentelemetry.io/otel` and contrib packages
- Initialize TracerProvider in `internal/observability/tracer.go`
- Use HTTP instrumentation middleware: `otelhttp.NewHandler()`
- Add database instrumentation via Ent interceptor
- Export to stdout for development, OTLP for production
- Include trace ID in structured logs
- Configure sampling for production (e.g., 10% or adaptive)

# ADR-0017: JWT Token Strategy
## Status
Accepted
## Context
The platform needs authentication tokens that:
- Are stateless (no server-side session storage)
- Support role and permission claims
- Can be revoked (a known challenge for stateless tokens)
- Have appropriate lifetimes
- Support multi-tenancy (tenant ID in claims)
Token strategies considered:
1. **Short-lived access tokens + long-lived refresh tokens** - Industry standard
2. **Single long-lived tokens** - Simple but insecure
3. **Short-lived tokens only** - Secure but poor UX
4. **Session-based** - Stateful, requires storage
## Decision
Use **short-lived access tokens + long-lived refresh tokens**:
1. **Access tokens**: 15 minutes lifetime, contain user ID, roles, tenant ID
2. **Refresh tokens**: 7 days lifetime, stored in database (for revocation)
3. **Token format**: JWT with claims: `sub` (user ID), `roles`, `tenant_id`, `exp`
4. **Revocation**: Refresh tokens stored in DB, can be revoked/deleted
**Rationale:**
- Industry best practice (OAuth2/OIDC pattern)
- Good balance of security and UX
- Access tokens can't be revoked (short lifetime mitigates risk)
- Refresh tokens can be revoked (stored in DB)
- Supports stateless authentication for most requests
## Consequences
### Positive
- Secure (short access token lifetime)
- Good UX (refresh tokens prevent frequent re-login)
- Stateless for most requests (access tokens)
- Supports revocation (refresh tokens)
### Negative
- Requires refresh token storage (DB table)
- More complex than single token
- Need to handle token refresh flow
### Implementation Notes
- Use `github.com/golang-jwt/jwt/v5` for JWT handling
- Store refresh tokens in `refresh_tokens` table (user_id, token_hash, expires_at)
- Generate access tokens with HS256 or RS256 signing
- Include roles in token claims (not just role IDs)
- Validate token signature and expiration on each request
- Refresh endpoint validates refresh token and issues new access token

# ADR-0018: Password Hashing Algorithm
## Status
Accepted
## Context
The platform needs to securely store user passwords. Requirements:
- Resist brute-force attacks
- Resist rainbow table attacks
- Future-proof against advances in computing
- Reasonable performance (not too slow)
Options considered:
1. **bcrypt** - Battle-tested, widely used
2. **argon2id** - Modern, memory-hard, recommended by OWASP
3. **scrypt** - Memory-hard, good alternative
4. **PBKDF2** - Older standard, less secure
## Decision
Use **argon2id** for password hashing with recommended parameters:
- **Algorithm**: argon2id (variant)
- **Memory**: 64 MiB (65536 KiB)
- **Iterations**: 3 (time cost)
- **Parallelism**: 4 (number of threads)
- **Salt length**: 16 bytes (random, unique per password)
**Rationale:**
- Recommended by OWASP for new applications
- Memory-hard algorithm (resistant to GPU/ASIC attacks)
- Good balance of security and performance
- Future-proof design
- Maintained Go implementation in `golang.org/x/crypto/argon2` (an official module, though not the standard library proper)
## Consequences
### Positive
- Strong security guarantees
- Memory-hard (resistant to hardware attacks)
- OWASP recommended
- Official Go implementation available (`golang.org/x/crypto`)
### Negative
- Slightly slower than bcrypt (acceptable trade-off)
- Requires tuning parameters for production
### Implementation Notes
- Use `golang.org/x/crypto/argon2` package
- Store hash in format: `$argon2id$v=19$m=65536,t=3,p=4$salt$hash`
- Use `crypto/rand` for salt generation
- Verify passwords by re-deriving the hash with `argon2.IDKey` using the stored salt and parameters, then comparing with `crypto/subtle.ConstantTimeCompare` (the package has no bcrypt-style compare function)
- Consider increasing parameters for high-security environments

# ADR-0019: Permission DSL Format
## Status
Accepted
## Context
The platform needs a permission system that:
- Is extensible by modules
- Prevents typos and errors (compile-time safety)
- Supports hierarchical permissions
- Is easy to understand and use
Permission formats considered:
1. **String format**: `"module.resource.action"` - Simple, flexible
2. **Enum/Constants**: Type-safe but less flexible
3. **Hierarchical tree**: Complex but powerful
4. **Bitmask**: Efficient but hard to read
## Decision
Use **string-based permission format** with **code-generated constants**:
1. **Format**: `"{module}.{resource}.{action}"`
- Examples: `blog.post.create`, `user.read`, `system.health.check`
2. **Code generation**: Generate constants from `module.yaml` files
3. **Type safety**: `type Permission string` with generated constants
4. **Validation**: Compile-time constants prevent typos
**Rationale:**
- Simple and readable
- Easy to extend (modules define in manifest)
- Code generation provides compile-time safety
- Flexible (modules can define any format)
- Hierarchical structure is intuitive
- Easy to parse and match
## Consequences
### Positive
- Simple and intuitive format
- Compile-time safety via code generation
- Easy to extend by modules
- Human-readable
- Flexible for various permission models
### Negative
- String comparisons (minimal performance impact)
- Requires code generation step
- Potential for permission string conflicts (mitigated by module prefix)
### Implementation Notes
- Define `type Permission string` in `pkg/perm/perm.go`
- Create code generator: `scripts/generate-permissions.go`
- Scan `modules/*/module.yaml` for permissions
- Generate constants in `pkg/perm/generated.go`
- Use `//go:generate` directive
- Validate format: `^[a-z0-9]+(\.[a-z0-9]+)*$` (lowercase, dots)

# ADR-0020: Audit Logging Storage
## Status
Accepted
## Context
The platform needs to audit all security-relevant actions:
- User logins and authentication attempts
- Permission changes
- Data modifications
- Administrative actions
Audit logs must be:
- Immutable (append-only)
- Queryable
- Performant (don't slow down operations)
- Compliant with audit requirements
Storage options considered:
1. **PostgreSQL table** - Simple, queryable, transactional
2. **Elasticsearch** - Excellent for searching, but additional dependency
3. **File-based logs** - Simple but hard to query
4. **External audit service** - Overkill for initial version
## Decision
Store audit logs in **PostgreSQL append-only table** with JSON metadata:
1. **Table structure**: `audit_logs` with columns:
- `id`, `actor_id`, `action`, `target_id`, `metadata` (JSONB), `timestamp`
2. **Append-only**: No UPDATE or DELETE operations
3. **JSON metadata**: Flexible storage for additional context
4. **Indexing**: Index on `actor_id`, `action`, `timestamp` for queries
**Rationale:**
- Simple (no additional infrastructure)
- Queryable via SQL
- Transactional (consistent with other data)
- JSONB provides flexibility for metadata
- Can migrate to Elasticsearch later if needed
- Good performance for typical audit volumes
## Consequences
### Positive
- Simple implementation
- Queryable via SQL
- No additional infrastructure
- Transactional consistency
- Can archive old logs if needed
### Negative
- Adds load to primary database
- May need archiving strategy for large volumes
- Less powerful search than Elasticsearch
### Implementation Notes
- Create `audit_logs` table via Ent schema
- Use JSONB for metadata column (PostgreSQL-specific)
- Add indexes: `(actor_id, timestamp)`, `(action, timestamp)`
- Implement async logging (optional, via channel) for high throughput
- Consider partitioning by date for large volumes
- Add retention policy (e.g., archive after 1 year)
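
The append-only write path can be sketched as follows (the query text and helper name are illustrative, not Ent's generated SQL; in practice the Ent schema owns the DDL and the insert):

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// AuditEntry mirrors the audit_logs columns named in the ADR.
type AuditEntry struct {
	ActorID   string
	Action    string
	TargetID  string
	Metadata  map[string]any
	Timestamp time.Time
}

// insertArgs marshals the metadata map into the JSONB column value and
// returns the statement plus positional args. There is deliberately no
// corresponding UPDATE or DELETE helper: the table is append-only.
func insertArgs(e AuditEntry) (string, []any, error) {
	meta, err := json.Marshal(e.Metadata)
	if err != nil {
		return "", nil, err
	}
	const q = `INSERT INTO audit_logs (actor_id, action, target_id, metadata, timestamp)
VALUES ($1, $2, $3, $4, $5)`
	return q, []any{e.ActorID, e.Action, e.TargetID, string(meta), e.Timestamp}, nil
}

func main() {
	_, args, err := insertArgs(AuditEntry{
		ActorID:   "u-42",
		Action:    "user.role.grant",
		TargetID:  "u-7",
		Metadata:  map[string]any{"role": "editor"},
		Timestamp: time.Now().UTC(),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(args[1], args[3])
}
```

The async variant would push `AuditEntry` values onto a channel and batch the inserts from a single writer goroutine.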

# ADR-0021: Module Loading Strategy
## Status
Accepted
## Context
The platform needs to support pluggable modules. Two approaches:
1. **Static registration** - Modules compiled into binary
2. **Dynamic plugin loading** - Load `.so` files at runtime
Each has trade-offs for development, CI, and production.
## Decision
Support **both approaches** with **static registration as primary**:
1. **Static registration (primary)**:
- Modules register via `init()` function
- Imported via `import _ "module/pkg"` in main
- Works everywhere (Windows, Linux, macOS)
- Compile-time type safety
2. **Dynamic plugin loading (optional)**:
- Support via Go `plugin` package
- Load `.so` files from `./plugins/` directory
- Only for production scenarios requiring hot-swap
- Linux/macOS only (Go plugin limitation)
**Rationale:**
- Static registration is simpler and more reliable
- Works in CI/CD (no plugin compilation needed)
- Compile-time safety catches errors early
- Dynamic loading provides flexibility for specific use cases
- Modules can choose their approach
## Consequences
### Positive
- Flexible: static for most cases, dynamic when needed
- Static registration works everywhere
- Compile-time safety with static
- Hot-swap capability with dynamic (Linux/macOS)
### Negative
- Two code paths to maintain
- Dynamic plugins have version compatibility constraints
- Plugin debugging is harder
### Implementation Notes
- Implement static registry in `internal/registry/registry.go`
- Modules register via: `registry.Register(Module)` in `init()`
- Implement plugin loader in `internal/pluginloader/plugin_loader.go` (optional)
- Document when to use each approach
- Validate plugin version compatibility if using dynamic loading
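
The static path can be sketched as follows (the `Module` interface and function names are illustrative, not the actual `internal/registry` API):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Module is a minimal stand-in for the platform's module contract.
type Module interface {
	Name() string
}

var (
	mu      sync.Mutex
	modules = map[string]Module{}
)

// Register is called from each module's init(); the main package pulls
// the module in via a blank import: import _ "module/pkg".
func Register(m Module) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := modules[m.Name()]; dup {
		panic("registry: duplicate module " + m.Name())
	}
	modules[m.Name()] = m
}

// Names returns the registered module names in stable order.
func Names() []string {
	mu.Lock()
	defer mu.Unlock()
	out := make([]string, 0, len(modules))
	for n := range modules {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

// blogModule simulates what a module package would do in its own file.
type blogModule struct{}

func (blogModule) Name() string { return "blog" }

func init() { Register(blogModule{}) }

func main() { fmt.Println(Names()) }
```

The dynamic loader would call the same `Register` after resolving the module symbol from a `.so` file, so the rest of the kernel sees one registry regardless of how a module arrived.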

# ADR-0022: Cache Implementation
## Status
Accepted
## Context
The platform needs caching for:
- Performance optimization (reduce database load)
- Frequently accessed data (user permissions, roles)
- Session data (optional)
- Query results
Options considered:
1. **Redis** - Industry standard, feature-rich
2. **In-memory cache** - Simple, no external dependency
3. **Memcached** - Simple, but fewer features than Redis
4. **No cache** - Simplest, but poor performance at scale
## Decision
Use **Redis** as the primary cache with **in-memory fallback**:
1. **Primary**: Redis for production
2. **Fallback**: In-memory cache for development/testing
3. **Interface abstraction**: `Cache` interface allows swapping implementations
4. **Use cases**: Permission lookups, role assignments, query caching
**Rationale:**
- Industry standard, widely supported
- Rich feature set (TTL, pub/sub, etc.)
- Can be shared across instances (multi-instance deployments)
- Good performance
- Easy to abstract behind interface
## Consequences
### Positive
- High performance
- Shared across instances
- Rich feature set
- Easy to scale horizontally
- Abstraction allows swapping implementations
### Negative
- Additional infrastructure dependency
- Network latency (minimal with proper setup)
- Need to handle Redis failures gracefully
### Implementation Notes
- Install: `github.com/redis/go-redis/v9`
- Create `pkg/infra/cache/cache.go` interface
- Implement `internal/infra/cache/redis_cache.go`
- Implement `internal/infra/cache/memory_cache.go` for fallback
- Use connection pooling
- Handle Redis failures gracefully (fallback or error)
- Configure TTLs appropriately (e.g., 5 minutes for permissions)

# ADR-0023: Event Bus Implementation
## Status
Accepted
## Context
The platform needs an event bus for:
- Module-to-module communication
- Decoupled event publishing
- Event sourcing (optional, future)
- Integration with external systems
Options considered:
1. **In-process channel-based bus** - Simple, for development/testing
2. **Kafka** - Production-grade, scalable
3. **RabbitMQ** - Alternative message broker
4. **Redis pub/sub** - Simple but less reliable
## Decision
Support **dual implementation** with **in-process primary, Kafka for production**:
1. **In-process bus (default)**:
- Channel-based implementation
- Used for development, testing, small deployments
- Simple, no external dependencies
2. **Kafka bus (production)**:
- Full Kafka integration via `segmentio/kafka-go`
- Producer/consumer groups
- Configurable via environment (switch implementation)
**Rationale:**
- In-process bus is simple for development
- Kafka provides production-grade reliability and scalability
- Interface abstraction allows swapping
- Modules don't need to know which implementation is in use
- Can start simple and scale up
## Consequences
### Positive
- Simple for development (no Kafka needed)
- Scalable for production (Kafka)
- Flexible (can choose implementation)
- Modules are decoupled from implementation
### Negative
- Two implementations to maintain
- Need to ensure interface covers both use cases
- Kafka adds infrastructure complexity
### Implementation Notes
- Create `pkg/eventbus/eventbus.go` interface
- Implement `internal/infra/bus/inprocess_bus.go` (channel-based)
- Implement `internal/infra/bus/kafka_bus.go` (Kafka)
- Select implementation via config
- Support both sync and async event publishing
- Handle errors gracefully (retry, dead letter queue)
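
A minimal sketch of the channel-based implementation (names are illustrative; the real `pkg/eventbus` interface will need context support, error returns, and unsubscribe so that it can also cover the Kafka implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// Event is the unit both implementations carry.
type Event struct {
	Topic   string
	Payload []byte
}

// Bus fans events out to per-topic subscriber channels.
type Bus struct {
	mu   sync.RWMutex
	subs map[string][]chan Event
}

func NewBus() *Bus { return &Bus{subs: map[string][]chan Event{}} }

// Subscribe returns a buffered channel receiving events for topic.
func (b *Bus) Subscribe(topic string) <-chan Event {
	ch := make(chan Event, 16)
	b.mu.Lock()
	b.subs[topic] = append(b.subs[topic], ch)
	b.mu.Unlock()
	return ch
}

// Publish delivers the event to all subscribers. Note the trade-off
// made visible here: a full buffer drops the event, whereas Kafka
// would persist it — one reason the shared interface must be designed
// against both implementations' semantics.
func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs[e.Topic] {
		select {
		case ch <- e:
		default: // subscriber too slow; drop
		}
	}
}

func main() {
	bus := NewBus()
	ch := bus.Subscribe("user.created")
	bus.Publish(Event{Topic: "user.created", Payload: []byte(`{"id":"u-1"}`)})
	fmt.Println(string((<-ch).Payload))
}
```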

# ADR-0024: Background Job Scheduler
## Status
Accepted
## Context
The platform needs background job processing for:
- Periodic tasks (cron jobs)
- Asynchronous processing
- Long-running operations
- Retry logic for failed jobs
Options considered:
1. **asynq (Redis-based)** - Simple, feature-rich
2. **cron + custom queue** - Build our own
3. **Kafka consumers** - Use event bus
4. **External service** - AWS SQS, etc.
## Decision
Use **asynq** (Redis-backed) for job scheduling:
1. **Cron jobs**: `github.com/robfig/cron/v3` for periodic tasks
2. **Job queue**: `github.com/hibiken/asynq` for async jobs
3. **Storage**: Redis (shared with cache)
4. **Features**: Retries, backoff, job status tracking
**Rationale:**
- Simple, Redis-backed (no new infrastructure)
- Good Go library support
- Built-in retry and backoff
- Job status tracking
- Easy to integrate
- Can scale horizontally (multiple workers)
## Consequences
### Positive
- Simple (uses existing Redis)
- Feature-rich (retries, backoff)
- Good performance
- Easy to scale
- Job status tracking
### Negative
- Tied to Redis (but we're already using it)
- Requires Redis to be available
### Implementation Notes
- Install: `github.com/hibiken/asynq` and `github.com/robfig/cron/v3`
- Create `pkg/scheduler/scheduler.go` interface
- Implement `internal/infra/scheduler/asynq_scheduler.go`
- Register jobs in `internal/infra/scheduler/job_registry.go`
- Start worker in fx lifecycle
- Configure retry policies (exponential backoff)
- Add job monitoring endpoint
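
The task-definition pattern can be sketched without a live Redis connection (the task type name and payload shape are hypothetical; with asynq, the pair would be wrapped in `asynq.NewTask` and enqueued through an `asynq.Client` with retry options):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Task types are namespaced strings, mirroring asynq's convention of
// pairing a type string with a JSON payload.
const TypeEmailWelcome = "email:welcome"

// EmailWelcomePayload is an illustrative payload, not a real module's.
type EmailWelcomePayload struct {
	UserID string `json:"user_id"`
}

// NewEmailWelcomeTask builds the (type, payload) pair that the worker's
// handler — registered against TypeEmailWelcome in the job registry —
// would unmarshal on the other side.
func NewEmailWelcomeTask(userID string) (string, []byte, error) {
	p, err := json.Marshal(EmailWelcomePayload{UserID: userID})
	if err != nil {
		return "", nil, err
	}
	return TypeEmailWelcome, p, nil
}

func main() {
	typ, payload, err := NewEmailWelcomeTask("u-42")
	if err != nil {
		panic(err)
	}
	fmt.Println(typ, string(payload))
}
```

Keeping payload construction in plain functions like this keeps the asynq dependency at the edges and makes the job logic unit-testable.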

# ADR-0025: Multi-tenancy Model
## Status
Accepted
## Context
The platform may need multi-tenancy support for SaaS deployments. Options:
1. **Shared database with tenant_id column** - Single DB, row-level isolation
2. **Schema-per-tenant** - Single DB, separate schemas
3. **Database-per-tenant** - Separate databases
Each has trade-offs for isolation, performance, and operational complexity.
## Decision
Use **shared database with tenant_id column** (optional feature):
1. **Model**: Single PostgreSQL database with `tenant_id` column on tenant-scoped tables
2. **Isolation**: Row-level via Ent interceptors (automatic filtering)
3. **Tenant resolution**: From header (`X-Tenant-ID`), subdomain, or JWT claim
4. **Optional**: Can be disabled for single-tenant deployments
**Rationale:**
- Simplest operational model (single database)
- Good performance (can index tenant_id)
- Easy to implement (Ent interceptors)
- Can migrate to schema-per-tenant later if needed
- Flexible (can support both single and multi-tenant)
## Consequences
### Positive
- Simple operations (single database)
- Good performance with proper indexing
- Easy to implement
- Flexible (optional feature)
### Negative
- Requires careful query design (ensure tenant_id filtering)
- Data isolation at application level (not database level)
- Potential for data leakage if bugs occur
### Implementation Notes
- Make tenant_id optional (nullable) for single-tenant mode
- Add Ent interceptor to automatically filter by tenant_id
- Resolve tenant from context via middleware
- Add tenant_id to JWT claims
- Document tenant isolation guarantees
- Consider adding tenant_id to all tenant-scoped tables

# ADR-0026: Error Reporting Service
## Status
Accepted
## Context
The platform needs error reporting for:
- Production error tracking
- Stack trace collection
- Error aggregation and analysis
- Integration with monitoring
Options considered:
1. **Sentry** - Popular, feature-rich
2. **Rollbar** - Alternative error tracking
3. **Custom solution** - Build our own
4. **Logs only** - No external service
## Decision
Use **Sentry** for error reporting (optional, configurable):
1. **Integration**: Via error bus sink
2. **Configuration**: Sentry DSN from config
3. **Context**: Include user ID, trace ID, module name
4. **Optional**: Can be disabled for development
**Rationale:**
- Industry standard error tracking
- Excellent Go SDK
- Rich features (release tracking, grouping, etc.)
- Good free tier
- Easy to integrate
## Consequences
### Positive
- Excellent error tracking
- Rich context and grouping
- Easy integration
- Good free tier
### Negative
- External dependency
- Additional cost at scale
- Privacy considerations (data sent to Sentry)
### Implementation Notes
- Install: `github.com/getsentry/sentry-go`
- Create Sentry sink for error bus
- Configure via environment variable
- Include context: user ID, trace ID, module name
- Set up release tracking
- Configure sampling for high-volume deployments
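
The sink abstraction can be sketched as follows (names are illustrative; a Sentry sink would implement `Sink` using the sentry-go SDK, and enabling or disabling it becomes pure configuration):

```go
package main

import (
	"fmt"
	"sync"
)

// Report carries the context the ADR requires per error.
type Report struct {
	Err     error
	UserID  string
	TraceID string
	Module  string
}

// Sink is the contract each destination (Sentry, logs, ...) implements.
type Sink interface {
	Capture(r Report)
}

// SinkFunc adapts a plain function to the Sink interface.
type SinkFunc func(Report)

func (fn SinkFunc) Capture(r Report) { fn(r) }

// Fanout delivers each report to every configured sink.
type Fanout struct {
	mu    sync.Mutex
	sinks []Sink
}

func (f *Fanout) Add(s Sink) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.sinks = append(f.sinks, s)
}

func (f *Fanout) Capture(r Report) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, s := range f.sinks {
		s.Capture(r)
	}
}

func main() {
	var bus Fanout
	// In development only a log sink is registered; in production the
	// Sentry sink would be added when a DSN is configured.
	bus.Add(SinkFunc(func(r Report) {
		fmt.Printf("module=%s trace=%s err=%v\n", r.Module, r.TraceID, r.Err)
	}))
	bus.Capture(Report{Err: fmt.Errorf("boom"), Module: "blog", TraceID: "t-1"})
}
```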

# ADR-0027: Rate Limiting Strategy
## Status
Accepted
## Context
The platform needs rate limiting to:
- Prevent abuse and DoS attacks
- Protect against brute-force attacks
- Ensure fair resource usage
- Comply with API usage policies
Rate limiting strategies:
1. **Per-user rate limiting** - Based on authenticated user
2. **Per-IP rate limiting** - Based on client IP
3. **Fixed rate limiting** - Global limits
4. **Distributed rate limiting** - Shared state across instances
## Decision
Implement **multi-level rate limiting**:
1. **Per-user rate limiting**: For authenticated requests (e.g., 100 req/min)
2. **Per-IP rate limiting**: For all requests (e.g., 1000 req/min)
3. **Storage**: Redis for distributed rate limiting
4. **Algorithm**: Token bucket or sliding window
**Rationale:**
- Multi-level provides defense in depth
- Per-user prevents abuse by authenticated users
- Per-IP protects against unauthenticated abuse
- Redis enables distributed rate limiting (multi-instance)
- Token bucket provides smooth rate limiting
## Consequences
### Positive
- Multi-layer protection
- Works with multiple instances
- Configurable per endpoint
- Standard approach
### Negative
- Requires Redis (or shared state)
- Additional latency (minimal)
- Need to handle Redis failures gracefully
### Implementation Notes
- Use `github.com/ulule/limiter/v3` library
- Configure limits in config file
- Store rate limit state in Redis
- Return `X-RateLimit-*` headers
- Handle Redis failures gracefully (fail open or closed based on config)
- Configure different limits for different endpoints

# ADR-0028: Testing Strategy
## Status
Accepted
## Context
The platform needs a comprehensive testing strategy:
- Unit tests for individual components
- Integration tests for full flows
- Contract tests for API compatibility
- Load tests for performance
Testing tools and approaches vary in complexity and coverage.
## Decision
Adopt a **multi-layered testing approach**:
1. **Unit tests**:
- Tool: Standard `testing` package + `testify`
- Coverage: >80% for core modules
- Mocks: `mockery` or `mockgen`
- Fast execution (< 1 second)
2. **Integration tests**:
- Tool: `testcontainers-go` for Docker-based services
- Coverage: End-to-end flows (auth, modules, etc.)
- Infrastructure: PostgreSQL, Redis, Kafka via testcontainers
- Tagged: `//go:build integration`
3. **Contract tests**:
- Tool: OpenAPI validator (`kin-openapi`)
- Coverage: API request/response validation
- Optional: Pact for service contracts
4. **Load tests**:
- Tool: k6 or vegeta
- Coverage: Critical endpoints (auth, API)
- Performance benchmarks
**Rationale:**
- Comprehensive coverage across layers
- Fast feedback with unit tests
- Realistic testing with integration tests
- API compatibility with contract tests
- Performance validation with load tests
## Consequences
### Positive
- High confidence in code quality
- Fast unit tests for quick feedback
- Realistic integration tests
- API compatibility guaranteed
### Negative
- Integration tests are slower
- Requires Docker for testcontainers
- More complex CI setup
### Implementation Notes
- Use `testify` for assertions: `require` and `assert`
- Generate mocks with `mockery` or `mockgen`
- Create test helpers in `internal/testutil/`
- Use test tags: `go test -tags=integration ./...`
- Run integration tests in separate CI job
- Document testing approach in `CONTRIBUTING.md`
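
The unit-test layer's table-driven style can be sketched with the stdlib (`slugify` is a stand-in function under test; in practice testify's `require`/`assert` calls replace the manual `t.Errorf`):

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// slugify is a trivial function used only to show the test shape.
func slugify(s string) string {
	return strings.ReplaceAll(strings.ToLower(strings.TrimSpace(s)), " ", "-")
}

// TestSlugify shows the table-driven pattern: one case table, one
// t.Run subtest per case, so failures name the exact scenario.
func TestSlugify(t *testing.T) {
	cases := []struct {
		name, in, want string
	}{
		{"lowercases", "Hello", "hello"},
		{"spaces to dashes", "a b", "a-b"},
		{"trims whitespace", "  x ", "x"},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := slugify(tc.in); got != tc.want {
				t.Errorf("slugify(%q) = %q, want %q", tc.in, got, tc.want)
			}
		})
	}
}

func main() { fmt.Println(slugify("Hello World")) }
```

Integration tests follow the same shape but live behind `//go:build integration` and start their dependencies via testcontainers.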

# Architecture Decision Records (ADRs)
This directory contains Architecture Decision Records (ADRs) for the Go Platform project.
## What are ADRs?
ADRs document important architectural decisions made during the project. They help:
- Track why decisions were made
- Understand the context and constraints
- Review decisions when requirements change
- Onboard new team members
## ADR Format
Each ADR follows this structure:
- **Status**: Proposed | Accepted | Rejected | Superseded
- **Context**: The situation that led to the decision
- **Decision**: What was decided
- **Consequences**: Positive and negative impacts
## ADR Index
### Phase 0: Project Setup & Foundation
- [ADR-0001: Go Module Path](./0001-go-module-path.md) - Module path: `git.dcentral.systems/toolz/goplt`
- [ADR-0002: Go Version](./0002-go-version.md) - Go 1.24.3
- [ADR-0003: Dependency Injection Framework](./0003-dependency-injection-framework.md) - uber-go/fx
- [ADR-0004: Configuration Management](./0004-configuration-management.md) - spf13/viper + cobra
- [ADR-0005: Logging Framework](./0005-logging-framework.md) - go.uber.org/zap
- [ADR-0006: HTTP Framework](./0006-http-framework.md) - gin-gonic/gin
- [ADR-0007: Project Directory Structure](./0007-project-directory-structure.md) - Standard Go layout with internal/pkg separation
- [ADR-0008: Error Handling Strategy](./0008-error-handling-strategy.md) - Wrapped errors with typed errors
- [ADR-0009: Context Key Types](./0009-context-key-types.md) - Typed context keys
- [ADR-0010: CI/CD Platform](./0010-ci-cd-platform.md) - GitHub Actions
- [ADR-0011: Code Generation Tools](./0011-code-generation-tools.md) - go generate workflow
- [ADR-0012: Logger Interface Design](./0012-logger-interface-design.md) - Logger interface abstraction
### Phase 1: Core Kernel & Infrastructure
- [ADR-0013: Database ORM Selection](./0013-database-orm.md) - entgo.io/ent
- [ADR-0014: Health Check Implementation](./0014-health-check-implementation.md) - Custom health check registry
- [ADR-0015: Error Bus Implementation](./0015-error-bus-implementation.md) - Channel-based error bus with pluggable sinks
- [ADR-0016: OpenTelemetry Observability Strategy](./0016-opentelemetry-observability.md) - OpenTelemetry for tracing, metrics, logs
### Phase 2: Authentication & Authorization
- [ADR-0017: JWT Token Strategy](./0017-jwt-token-strategy.md) - Short-lived access tokens + long-lived refresh tokens
- [ADR-0018: Password Hashing Algorithm](./0018-password-hashing.md) - argon2id
- [ADR-0019: Permission DSL Format](./0019-permission-dsl-format.md) - String-based format with code generation
- [ADR-0020: Audit Logging Storage](./0020-audit-logging-storage.md) - PostgreSQL append-only table with JSONB metadata
### Phase 3: Module Framework
- [ADR-0021: Module Loading Strategy](./0021-module-loading-strategy.md) - Static registration (primary) + dynamic plugin loading (optional)
### Phase 5: Infrastructure Adapters
- [ADR-0022: Cache Implementation](./0022-cache-implementation.md) - Redis with in-memory fallback
- [ADR-0023: Event Bus Implementation](./0023-event-bus-implementation.md) - In-process bus (default) + Kafka (production)
- [ADR-0024: Background Job Scheduler](./0024-job-scheduler.md) - asynq (Redis-backed) + cron
- [ADR-0025: Multi-tenancy Model](./0025-multitenancy-model.md) - Shared database with tenant_id column (optional)
### Phase 6: Observability & Production Readiness
- [ADR-0026: Error Reporting Service](./0026-error-reporting-service.md) - Sentry (optional, configurable)
- [ADR-0027: Rate Limiting Strategy](./0027-rate-limiting-strategy.md) - Multi-level (per-user + per-IP) with Redis
### Phase 7: Testing, Documentation & CI/CD
- [ADR-0028: Testing Strategy](./0028-testing-strategy.md) - Multi-layered (unit, integration, contract, load)
## Adding New ADRs
When making a new architectural decision:
1. Create a new file: `XXXX-short-title.md` (next sequential number)
2. Follow the ADR template
3. Update this README with the new entry
4. Set status to "Proposed" initially
5. Update to "Accepted" after review/approval
## References
- [ADR Template](https://adr.github.io/madr/)
- [Documenting Architecture Decisions](https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions)