- Add comprehensive 8-phase implementation plan (docs/plan.md) - Add 28 Architecture Decision Records (docs/adr/) covering all phases - Add task tracking system with 283+ task files (docs/stories/) - Add task generator script for automated task file creation - Add reference playbooks and requirements documentation This commit establishes the complete planning foundation for the Go Platform implementation, documenting all architectural decisions and providing detailed task breakdown for Phases 0-8.
Go Platform Implementation Plan
"Plug‑in‑friendly SaaS/Enterprise Platform – Go Edition"
This document outlines a complete, phased implementation plan for building the Go platform boilerplate based on the requirements from playbook.md and playbook-golang.md.
Executive Summary
This plan breaks down the implementation into 8 phases, each with specific deliverables and acceptance criteria. The approach prioritizes building a solid foundation (core kernel) before adding feature modules and advanced capabilities.
Total Estimated Timeline: 8-12 weeks (depending on team size and parallelization)
Key Principles:
- Clean/Hexagonal Architecture with clear separation between pkg/ (interfaces) and internal/ (implementations)
- Dependency Injection using uber-go/fx for lifecycle management
- Modular Monolith design that can evolve into microservices
- Plugin-first architecture supporting both static and dynamic module loading
- Security-by-Design with JWT auth, RBAC/ABAC, and audit logging
- Observability via OpenTelemetry, Prometheus, and structured logging
Phase 0: Project Setup & Foundation (Week 1)
Objectives
- Initialize repository structure
- Set up Go modules and basic tooling
- Create configuration management foundation
- Establish CI/CD skeleton
Tasks
0.1 Repository Bootstrap
- Initialize Go module: go mod init github.com/yourorg/platform
- Create directory structure:

```
platform/
├── cmd/
│   └── platform/          # Main entry point
├── internal/              # Private implementation code
│   ├── di/                # Dependency injection container
│   ├── registry/          # Module registry
│   ├── pluginloader/      # Plugin loader (optional)
│   └── infra/             # Infrastructure adapters
├── pkg/                   # Public interfaces (exported)
│   ├── config/            # ConfigProvider interface
│   ├── logger/            # Logger interface
│   ├── module/            # IModule interface
│   ├── auth/              # Auth interfaces
│   ├── perm/              # Permission DSL
│   └── infra/             # Infrastructure interfaces
├── modules/               # Feature modules
│   └── blog/              # Sample Blog module (Phase 4)
├── config/                # Configuration files
│   ├── default.yaml
│   ├── development.yaml
│   └── production.yaml
├── api/                   # OpenAPI specs
├── scripts/               # Build/test scripts
├── docs/                  # Documentation
├── ops/                   # Operations (Grafana dashboards, etc.)
├── .github/
│   └── workflows/
│       └── ci.yml
├── Dockerfile
├── docker-compose.yml
├── docker-compose.test.yml
└── go.mod
```

- Add .gitignore for Go projects
- Create initial README.md with project overview
0.2 Configuration System
- Install github.com/spf13/viper and github.com/spf13/cobra
- Create pkg/config/config.go interface:

```go
type ConfigProvider interface {
	Get(key string) any
	Unmarshal(v any) error
	GetString(key string) string
	GetInt(key string) int
	GetBool(key string) bool
}
```

- Implement internal/config/config.go using Viper:
  - Load config/default.yaml as baseline
  - Merge environment-specific YAML (development/production)
  - Apply environment variable overrides
  - Support secret manager integration (placeholder; the secret store itself is task 5.6)
- Create config/default.yaml with basic structure:

```yaml
environment: development
server:
  port: 8080
  host: "0.0.0.0"
database:
  driver: "postgres"
  dsn: ""
logging:
  level: "info"
  format: "json"
```

- Add internal/config/loader.go with LoadConfig() function
0.3 Logging Foundation
- Install go.uber.org/zap
- Create pkg/logger/logger.go interface:

```go
type Logger interface {
	Debug(msg string, fields ...Field)
	Info(msg string, fields ...Field)
	Warn(msg string, fields ...Field)
	Error(msg string, fields ...Field)
	With(fields ...Field) Logger
}
```

- Implement internal/logger/zap_logger.go:
  - Structured JSON logging
  - Configurable log levels
  - Request-scoped fields support
  - Export global logger via pkg/logger
- Add request ID middleware helper (Gin middleware)
0.4 Basic CI/CD Pipeline
- Create .github/workflows/ci.yml:
  - Go 1.22 setup
  - Module caching
  - Linting (golangci-lint or staticcheck)
  - Unit tests (basic skeleton)
  - Build binary
- Add Makefile with common commands:
  - make test - run tests
  - make lint - run linter
  - make build - build binary
  - make docker-build - build Docker image
0.5 Dependency Injection Setup
- Install go.uber.org/fx
- Create internal/di/container.go:
  - Initialize fx container
  - Register Config and Logger providers
  - Basic lifecycle hooks
- Create cmd/platform/main.go skeleton:
  - Load config
  - Initialize DI container
  - Start minimal HTTP server (placeholder)
Deliverables
- ✅ Repository structure in place
- ✅ Configuration system loads YAML files and env vars
- ✅ Structured logging works
- ✅ CI pipeline runs linting and builds binary
- ✅ Basic DI container initialized
Acceptance Criteria
- go build ./cmd/platform succeeds
- go test ./... runs (even if tests are empty)
- CI pipeline passes on empty commit
- Config loads from config/default.yaml
Phase 1: Core Kernel & Infrastructure (Week 2-3)
Objectives
- Implement dependency injection container
- Set up database (Ent ORM)
- Create health and metrics endpoints
- Implement error bus
- Add basic HTTP server with middleware
Tasks
1.1 Dependency Injection Container
- Extend internal/di/container.go:
  - Register all core services
  - Provide lifecycle management via fx
  - Support service overrides
- Create internal/di/providers.go:
  - ProvideConfig() - config provider
  - ProvideLogger() - logger
  - ProvideDatabase() - Ent client (after 1.2)
  - ProvideHealthCheckers() - health check registry
  - ProvideMetrics() - Prometheus registry
  - ProvideErrorBus() - error bus
- Add internal/di/core_module.go:
  - Export CoreModule fx.Option that provides all core services
1.2 Database Setup (Ent)
- Install entgo.io/ent/cmd/ent
- Initialize Ent schema: go run entgo.io/ent/cmd/ent init User Role Permission AuditLog
- Define core entities in internal/ent/schema/:
  - user.go: ID, email, password_hash, verified, created_at, updated_at
  - role.go: ID, name, description, created_at
  - permission.go: ID, name (string format: "module.resource.action")
  - audit_log.go: ID, actor_id, action, target_id, metadata (JSON), timestamp
  - role_permissions.go: Many-to-many relationship
  - user_roles.go: Many-to-many relationship
- Generate Ent code: go generate ./internal/ent
- Create internal/infra/database/client.go:
  - NewEntClient(dsn string) (*ent.Client, error)
  - Connection pooling configuration
  - Migration runner wrapper
- Add database config to config/default.yaml
1.3 Health & Metrics
- Install github.com/prometheus/client_golang/prometheus
- Install github.com/heptiolabs/healthcheck (optional, or custom)
- Create pkg/health/health.go interface:

```go
type HealthChecker interface {
	Check(ctx context.Context) error
}
```

- Implement internal/health/registry.go:
  - Registry of health checkers
  - /healthz endpoint (liveness)
  - /ready endpoint (readiness with DB check)
- Create internal/metrics/metrics.go:
  - HTTP request duration histogram
  - HTTP request counter
  - Database query duration (via Ent interceptor)
  - Error counter
- Add /metrics endpoint (Prometheus format)
- Register endpoints in main HTTP router
1.4 Error Bus
- Create pkg/errorbus/errorbus.go interface:

```go
type ErrorPublisher interface {
	Publish(err error)
}
```

- Implement internal/errorbus/channel_bus.go:
  - Channel-based error bus
  - Background goroutine consumes errors
  - Log all errors
  - Optional: Sentry integration (Phase 6)
- Add panic recovery middleware that publishes to error bus
- Register error bus in DI container
1.5 HTTP Server Foundation
- Install github.com/gin-gonic/gin
- Create internal/server/server.go:
  - Initialize Gin router
  - Add middleware:
    - Request ID generator
    - Structured logging
    - Panic recovery → error bus
    - Prometheus metrics
    - CORS (configurable)
  - Register core routes:
    - GET /healthz
    - GET /ready
    - GET /metrics
- Wire HTTP server into fx lifecycle:
  - Start on OnStart
  - Graceful shutdown on OnStop
- Update cmd/platform/main.go to use fx lifecycle
1.6 OpenTelemetry Setup
- Install OpenTelemetry packages:
  - go.opentelemetry.io/otel
  - go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
- Create internal/observability/tracer.go:
  - Initialize OTEL TracerProvider
  - Export to stdout (development) or OTLP (production)
- Add HTTP instrumentation middleware
- Add trace context propagation to requests
Deliverables
- ✅ DI container with all core services
- ✅ Database client with Ent schema
- ✅ Health and metrics endpoints functional
- ✅ Error bus captures and logs errors
- ✅ HTTP server with middleware stack
- ✅ Basic observability with OpenTelemetry
Acceptance Criteria
- GET /healthz returns 200
- GET /ready checks DB connectivity
- GET /metrics exposes Prometheus metrics
- Panic recovery logs errors via error bus
- Database migrations run on startup
- HTTP requests are traced with OpenTelemetry
Phase 2: Authentication & Authorization (Week 3-4)
Objectives
- Implement JWT authentication
- Create identity management (User CRUD)
- Build role and permission system
- Add authorization middleware
- Implement audit logging
Tasks
2.1 Authentication (JWT)
- Install github.com/golang-jwt/jwt/v5
- Create pkg/auth/auth.go interfaces:

```go
type Authenticator interface {
	GenerateToken(userID string, roles []string, tenantID string) (string, error)
	VerifyToken(token string) (*TokenClaims, error)
}

type TokenClaims struct {
	UserID    string
	Roles     []string
	TenantID  string
	ExpiresAt time.Time
}
```

- Implement internal/auth/jwt_auth.go:
  - Generate access tokens (short-lived, 15min)
  - Generate refresh tokens (long-lived, 7 days)
  - Verify token signature and expiration
  - Extract claims
- Create internal/auth/middleware.go:
  - Extract JWT from Authorization: Bearer <token> header
  - Verify token
  - Inject User into context.Context
  - Helper: auth.FromContext(ctx) *User
- Add login endpoint: POST /api/v1/auth/login
  - Validate credentials
  - Return access + refresh tokens
- Add refresh endpoint: POST /api/v1/auth/refresh
  - Validate refresh token
  - Issue new access token
2.2 Identity Management
- Create pkg/identity/identity.go interfaces:

```go
type UserRepository interface {
	FindByID(ctx context.Context, id string) (*User, error)
	FindByEmail(ctx context.Context, email string) (*User, error)
	Create(ctx context.Context, u *User) error
	Update(ctx context.Context, u *User) error
	Delete(ctx context.Context, id string) error
}

type UserService interface {
	Register(ctx context.Context, email, password string) (*User, error)
	VerifyEmail(ctx context.Context, token string) error
	ResetPassword(ctx context.Context, email string) error
	ChangePassword(ctx context.Context, userID, oldPassword, newPassword string) error
}
```

- Implement internal/identity/user_repo.go using Ent:
  - CRUD operations
  - Password hashing (bcrypt or argon2)
  - Email verification flow
- Implement internal/identity/user_service.go:
  - User registration with email verification
  - Password reset flow (token-based)
  - Password change
  - Email verification
- Add endpoints:
  - POST /api/v1/users - Register
  - GET /api/v1/users/:id - Get user
  - PUT /api/v1/users/:id - Update user
  - POST /api/v1/users/verify-email - Verify email
  - POST /api/v1/users/reset-password - Request reset
  - POST /api/v1/users/change-password - Change password
2.3 Roles & Permissions
- Create pkg/perm/perm.go:

```go
type Permission string

// Core permissions
var (
	SystemHealthCheck Permission = "system.health.check"
	UserCreate        Permission = "user.create"
	UserRead          Permission = "user.read"
	UserUpdate        Permission = "user.update"
	UserDelete        Permission = "user.delete"
	RoleCreate        Permission = "role.create"
	RoleRead          Permission = "role.read"
	RoleUpdate        Permission = "role.update"
	RoleDelete        Permission = "role.delete"
)
```

- Create pkg/perm/resolver.go interface:

```go
type PermissionResolver interface {
	HasPermission(ctx context.Context, userID string, perm Permission) (bool, error)
	GetUserPermissions(ctx context.Context, userID string) ([]Permission, error)
}
```

- Implement internal/perm/in_memory_resolver.go:
  - Load user roles from DB
  - Load role permissions from DB
  - Check if user has specific permission
  - Cache permission lookups (optional)
- Create pkg/auth/authz.go interface (Permission lives in pkg/perm, so it is package-qualified here):

```go
type Authorizer interface {
	Authorize(ctx context.Context, p perm.Permission) error
}
```

- Implement internal/auth/rbac_authorizer.go:
  - Extract user from context
  - Check permission via PermissionResolver
  - Return error if unauthorized
- Create authorization middleware:
  - Decorator pattern: RequirePermission(p perm.Permission) gin.HandlerFunc
  - Use with route registration
2.4 Role Management API
- Create internal/identity/role_repo.go:
  - CRUD for roles
  - Assign permissions to roles
  - Assign roles to users
- Add endpoints:
  - POST /api/v1/roles - Create role
  - GET /api/v1/roles - List roles
  - GET /api/v1/roles/:id - Get role
  - PUT /api/v1/roles/:id - Update role
  - DELETE /api/v1/roles/:id - Delete role
  - POST /api/v1/roles/:id/permissions - Assign permissions
  - POST /api/v1/users/:id/roles - Assign roles to user
2.5 Audit Logging
- Create pkg/audit/audit.go interface:

```go
type Auditor interface {
	Record(ctx context.Context, act AuditAction) error
}

type AuditAction struct {
	ActorID  string
	Action   string
	TargetID string
	Metadata map[string]any
}
```

- Implement internal/audit/ent_auditor.go:
  - Write to audit_log table
  - Capture actor from context
  - Include request ID, IP address, user agent
- Add audit middleware:
  - Intercept all authenticated requests
  - Record action (method + path)
  - Store in audit log
- Integrate with auth endpoints:
  - Log login attempts (success/failure)
  - Log password changes
  - Log role assignments
2.6 Seed Data
- Create internal/seed/seed.go:
  - Create default admin user (if it doesn't exist)
  - Create default roles (admin, user, guest)
  - Assign permissions to roles
- Script: go run cmd/seed/main.go
Deliverables
- ✅ JWT authentication with access/refresh tokens
- ✅ User CRUD with email verification
- ✅ Role and permission management
- ✅ Authorization middleware
- ✅ Audit logging for all actions
- ✅ Seed script for initial data
Acceptance Criteria
- User can register and login
- JWT tokens are validated on protected routes
- Users without permission get 403
- All actions are logged in audit table
- Admin can create roles and assign permissions
- Integration test: user without permission cannot access protected resource
Phase 3: Module Framework (Week 4-5)
Objectives
- Define module interface and registration system
- Implement static module registry
- Create permission code generation tool
- Build module loader (support both static and plugin modes)
- Add module discovery and initialization
Tasks
3.1 Module Interface
- Create pkg/module/module.go:

```go
type IModule interface {
	Name() string
	Version() string
	Dependencies() []string
	Init() fx.Option
	Migrations() []func(*ent.Client) error
}
```

- Create pkg/module/manifest.go:

```go
type Manifest struct {
	Name         string
	Version      string
	Dependencies []string
	Permissions  []string
	Routes       []Route
}
```

- Define module.yaml schema (used for code generation)
3.2 Static Module Registry
- Create internal/registry/registry.go:
  - Thread-safe module map
  - Register(m IModule) function
  - All() []IModule function
  - Get(name string) (IModule, error) function
- Add registration validation:
  - Check dependencies are satisfied
  - Check for duplicate names
  - Validate version compatibility
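A thread-safe registry with duplicate and dependency checks might be sketched as follows. The `Module` interface here is a trimmed stand-in for `IModule` (no fx or Ent types) so the example compiles on its own, and returning an error from `Register` plus the separate `Validate` step are assumptions for illustration. Version compatibility is not covered.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Module is a trimmed stand-in for the plan's IModule.
type Module interface {
	Name() string
	Dependencies() []string
}

// Registry is a mutex-guarded map of modules keyed by name.
type Registry struct {
	mu      sync.RWMutex
	modules map[string]Module
}

func NewRegistry() *Registry {
	return &Registry{modules: make(map[string]Module)}
}

// Register rejects a second module with the same name.
func (r *Registry) Register(m Module) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.modules[m.Name()]; exists {
		return fmt.Errorf("module %q already registered", m.Name())
	}
	r.modules[m.Name()] = m
	return nil
}

func (r *Registry) Get(name string) (Module, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	m, ok := r.modules[name]
	if !ok {
		return nil, fmt.Errorf("module %q not found", name)
	}
	return m, nil
}

// All returns the modules sorted by name so callers see a stable order.
func (r *Registry) All() []Module {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]Module, 0, len(r.modules))
	for _, m := range r.modules {
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name() < out[j].Name() })
	return out
}

// Validate checks that every declared dependency is itself registered.
func (r *Registry) Validate() error {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, m := range r.modules {
		for _, dep := range m.Dependencies() {
			if _, ok := r.modules[dep]; !ok {
				return fmt.Errorf("module %q depends on unregistered %q", m.Name(), dep)
			}
		}
	}
	return nil
}

// stubModule is a test double used only in this example.
type stubModule struct {
	name string
	deps []string
}

func (s stubModule) Name() string           { return s.name }
func (s stubModule) Dependencies() []string { return s.deps }

func main() {
	reg := NewRegistry()
	fmt.Println(reg.Register(stubModule{name: "core"}))
	fmt.Println(reg.Register(stubModule{name: "blog", deps: []string{"core"}}))
	fmt.Println(reg.Register(stubModule{name: "blog"}))
	fmt.Println(reg.Validate())
}
```

Validating dependencies after all registrations, rather than inside `Register`, avoids depending on the order in which package `init` functions run.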
3.3 Permission Code Generation
- Create scripts/generate-permissions.go:
  - Scan all modules/*/module.yaml files
  - Extract permissions from manifests
  - Generate pkg/perm/generated.go:

```go
// Code generated by generate-permissions. DO NOT EDIT.
var (
	BlogPostCreate Permission = "blog.post.create"
	BlogPostRead   Permission = "blog.post.read"
	// ...
)
```

- Add //go:generate directive to pkg/perm/perm.go
- Update Makefile with make generate command
3.4 Module Loader
- Create internal/pluginloader/loader.go:
  - Support static registration (preferred)
  - Optional: support Go plugin loading (.so files)
  - Scan modules/*/module.yaml for discovery
  - Load modules in dependency order
- Implement internal/pluginloader/static_loader.go:
  - Import modules via import _ "github.com/yourorg/blog/pkg" (side-effect registration; the path matches the package that holds the module's init(), as in 4.8)
  - Collect all registered modules
- Implement internal/pluginloader/plugin_loader.go (optional):
  - Scan ./plugins/*.so
  - Load via plugin.Open()
  - Extract Module symbol
  - Validate version compatibility
3.5 Module Initialization
- Create internal/module/initializer.go:
  - Collect all registered modules
  - Resolve dependency order (topological sort)
  - Initialize each module's Init() fx.Option
  - Merge all options into main fx container
- Run migrations:
  - Collect all module migrations
  - Run core migrations first
  - Run module migrations in dependency order
  - Handle migration errors gracefully
3.6 Module Lifecycle Hooks
- Extend pkg/module/module.go:

```go
type IModule interface {
	// ... existing methods
	OnStart(ctx context.Context) error // Optional
	OnStop(ctx context.Context) error  // Optional
}
```

- Integrate with fx.Lifecycle:
  - Call OnStart during app startup
  - Call OnStop during graceful shutdown
3.7 Module CLI Tool
- Create cmd/platformctl/main.go:
  - platformctl modules list - List all loaded modules
  - platformctl modules validate - Validate module dependencies
  - platformctl modules test <module> - Test module loading
- Add to Makefile: make install-cli
Deliverables
- ✅ Module interface and registration system
- ✅ Static module registry working
- ✅ Permission code generation tool
- ✅ Module loader with dependency resolution
- ✅ Module initialization in main app
- ✅ CLI tool for module management
Acceptance Criteria
- Modules can register via registry.Register()
- Permission constants are generated from module.yaml
- Modules load in correct dependency order
- Module migrations run on startup
- platformctl modules list shows all modules
- Integration test: load multiple modules and verify initialization
Phase 4: Sample Feature Module (Blog) (Week 5-6)
Objectives
- Create a complete sample module (Blog) to demonstrate the framework
- Show how to add routes, permissions, database entities, and services
- Provide reference implementation for future developers
Tasks
4.1 Blog Module Structure
- Create modules/blog/ directory:

```
modules/blog/
├── go.mod
├── module.yaml
├── internal/
│   ├── api/
│   │   └── handler.go
│   ├── domain/
│   │   ├── post.go
│   │   └── post_repo.go
│   └── service/
│       └── post_service.go
└── pkg/
    └── module.go
```

- Initialize go.mod:

```
cd modules/blog
go mod init github.com/yourorg/blog
```
4.2 Module Manifest
- Create modules/blog/module.yaml:

```yaml
name: blog
version: 0.1.0
dependencies:
  - core >= 1.0.0
permissions:
  - blog.post.create
  - blog.post.read
  - blog.post.update
  - blog.post.delete
routes:
  - method: POST
    path: /api/v1/blog/posts
    permission: blog.post.create
  - method: GET
    path: /api/v1/blog/posts/:id
    permission: blog.post.read
  - method: PUT
    path: /api/v1/blog/posts/:id
    permission: blog.post.update
  - method: DELETE
    path: /api/v1/blog/posts/:id
    permission: blog.post.delete
  - method: GET
    path: /api/v1/blog/posts
    permission: blog.post.read
```
4.3 Blog Domain Model
- Create modules/blog/internal/domain/post.go:

```go
type Post struct {
	ID        string
	Title     string
	Content   string
	AuthorID  string
	CreatedAt time.Time
	UpdatedAt time.Time
}
```

- Create Ent schema modules/blog/internal/ent/schema/post.go:
  - Fields: title, content, author_id (FK to user)
  - Indexes: author_id, created_at
- Generate Ent code for blog module
4.4 Blog Repository
- Create modules/blog/internal/domain/post_repo.go:

```go
type PostRepository interface {
	Create(ctx context.Context, p *Post) (*Post, error)
	FindByID(ctx context.Context, id string) (*Post, error)
	FindByAuthor(ctx context.Context, authorID string) ([]*Post, error)
	Update(ctx context.Context, p *Post) error
	Delete(ctx context.Context, id string) error
}
```

- Implement using Ent client (shared from core)
4.5 Blog Service
- Create modules/blog/internal/service/post_service.go:
  - Business logic for creating/updating posts
  - Validation (title length, content requirements)
  - Authorization checks (author can only update own posts)
  - Integration with audit system
4.6 Blog API Handlers
- Create modules/blog/internal/api/handler.go:
  - POST /api/v1/blog/posts - Create post
  - GET /api/v1/blog/posts/:id - Get post
  - GET /api/v1/blog/posts - List posts (with pagination)
  - PUT /api/v1/blog/posts/:id - Update post
  - DELETE /api/v1/blog/posts/:id - Delete post
- Use authorization middleware: grp.Use(auth.RequirePermission(perm.BlogPostCreate))
- Register handlers in module's Init()
4.7 Blog Module Implementation
- Create modules/blog/pkg/module.go:

```go
type BlogModule struct{}

func (b BlogModule) Name() string           { return "blog" }
func (b BlogModule) Version() string        { return "0.1.0" }
func (b BlogModule) Dependencies() []string { return nil }

func (b BlogModule) Init() fx.Option {
	return fx.Options(
		fx.Provide(NewPostRepo),
		fx.Provide(NewPostService),
		fx.Invoke(RegisterHandlers),
	)
}

func (b BlogModule) Migrations() []func(*ent.Client) error {
	return []func(*ent.Client) error{
		func(c *ent.Client) error {
			return c.Schema.Create(context.Background())
		},
	}
}

var Module BlogModule

func init() {
	registry.Register(Module)
}
```
4.8 Integration
- Update main go.mod to include blog module: replace github.com/yourorg/blog => ./modules/blog
- Import blog module in cmd/platform/main.go: import _ "github.com/yourorg/blog/pkg"
- Run permission generation: make generate
- Verify blog permissions are generated
4.9 Tests
- Create integration test modules/blog/internal/api/handler_test.go:
  - Test creating post with valid permission
  - Test creating post without permission (403)
  - Test updating own post vs other's post
  - Test pagination
- Add unit tests for service and repository
Deliverables
- ✅ Complete Blog module with CRUD operations
- ✅ Module registered and loaded by core
- ✅ Permissions generated and used
- ✅ Routes protected with authorization
- ✅ Database migrations run
- ✅ Integration tests passing
Acceptance Criteria
- Blog module loads on platform startup
- POST /api/v1/blog/posts requires blog.post.create permission
- User can create, read, update, delete posts
- Authorization enforced (users can only edit own posts)
- Integration test: full CRUD flow works
- Audit logs record all blog actions
Phase 5: Infrastructure Adapters (Week 6-7)
Objectives
- Implement infrastructure adapters (cache, queue, blob storage, email)
- Make adapters swappable via interfaces
- Add scheduler/background jobs system
- Implement event bus (in-process and Kafka)
Tasks
5.1 Cache (Redis)
- Install github.com/redis/go-redis/v9
- Create pkg/infra/cache/cache.go interface:

```go
type Cache interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
	Delete(ctx context.Context, key string) error
}
```

- Implement internal/infra/cache/redis_cache.go
- Add Redis config to config/default.yaml
- Register in DI container
- Add cache middleware for selected routes (optional)
5.2 Event Bus
- Create pkg/eventbus/eventbus.go interface:

```go
type EventBus interface {
	Publish(ctx context.Context, topic string, event Event) error
	Subscribe(topic string, handler EventHandler) error
}
```

- Implement internal/infra/bus/inprocess_bus.go:
  - Channel-based in-process bus
  - Used for testing and development
- Implement internal/infra/bus/kafka_bus.go:
  - Install github.com/segmentio/kafka-go
  - Producer for publishing
  - Consumer groups for subscribing
  - Error handling and retries
- Add Kafka config to config/default.yaml
- Register bus in DI container (switchable via config)
- Add core events:
  - platform.user.created
  - platform.user.updated
  - platform.role.assigned
  - platform.permission.granted
5.3 Blob Storage
- Install github.com/aws/aws-sdk-go-v2/service/s3
- Create pkg/infra/blob/blob.go interface:

```go
type BlobStore interface {
	Upload(ctx context.Context, key string, data []byte) error
	Download(ctx context.Context, key string) ([]byte, error)
	Delete(ctx context.Context, key string) error
	GetSignedURL(ctx context.Context, key string, ttl time.Duration) (string, error)
}
```

- Implement internal/infra/blob/s3_store.go
- Add S3 config to config/default.yaml
- Register in DI container
- Add file upload endpoint: POST /api/v1/files/upload
5.4 Email Notification
- Install github.com/go-mail/mail
- Create pkg/notification/notification.go interface:

```go
type Notifier interface {
	SendEmail(ctx context.Context, to, subject, body string) error
	SendSMS(ctx context.Context, to, message string) error
}
```

- Implement internal/infra/email/smtp_notifier.go:
  - SMTP configuration
  - HTML email support
  - Templates for common emails (verification, password reset)
- Add email config to config/default.yaml
- Integrate with identity service:
  - Send verification email on registration
  - Send password reset email
- Register in DI container
5.5 Scheduler & Background Jobs
- Install github.com/robfig/cron/v3 and github.com/hibiken/asynq
- Create pkg/scheduler/scheduler.go interface:

```go
type Scheduler interface {
	Cron(spec string, job JobFunc) error
	Enqueue(queue string, payload any) error
}
```

- Implement internal/infra/scheduler/asynq_scheduler.go:
  - Redis-backed job queue
  - Cron jobs for periodic tasks
  - Job retries and backoff
  - Job status tracking
- Create internal/infra/scheduler/job_registry.go:
  - Register jobs from modules
  - Start job processor on app startup
- Add example jobs:
  - Cleanup expired tokens (daily)
  - Send digest emails (weekly)
- Add job monitoring endpoint: GET /api/v1/jobs/status
5.6 Secret Store Integration
- Create pkg/infra/secret/secret.go interface:

```go
type SecretStore interface {
	GetSecret(ctx context.Context, key string) (string, error)
}
```

- Implement internal/infra/secret/vault_store.go (HashiCorp Vault):
  - Install github.com/hashicorp/vault/api
  - Support KV v2 secrets
- Implement internal/infra/secret/aws_secrets.go (AWS Secrets Manager):
  - Install github.com/aws/aws-sdk-go-v2/service/secretsmanager
- Integrate with config loader:
  - Overlay secrets on top of file/env config
  - Load secrets lazily (cache)
- Register in DI container (optional, via config)
5.7 Multi-tenancy Support (Optional)
- Create pkg/tenant/tenant.go interface:

```go
type TenantResolver interface {
	Resolve(ctx context.Context) (string, error)
}
```

- Implement internal/tenant/resolver.go:
  - Extract from header: X-Tenant-ID
  - Extract from subdomain
  - Extract from JWT claim
- Add tenant middleware:
  - Resolve tenant ID
  - Inject into context
  - Helper: tenant.FromContext(ctx) string
- Update Ent queries to filter by tenant_id:
  - Add interceptor to Ent client
  - Automatically add WHERE tenant_id = ? to queries
- Update User entity to include tenant_id
Deliverables
- ✅ Cache adapter (Redis) working
- ✅ Event bus (in-process and Kafka) functional
- ✅ Blob storage (S3) adapter
- ✅ Email notification system
- ✅ Scheduler and background jobs
- ✅ Secret store integration (optional)
- ✅ Multi-tenancy support (optional)
Acceptance Criteria
- Cache stores and retrieves data correctly
- Events are published and consumed
- Files can be uploaded and downloaded
- Email notifications are sent
- Background jobs run on schedule
- Integration test: full infrastructure stack works
Phase 6: Observability & Production Readiness (Week 7-8)
Objectives
- Enhance observability with full OpenTelemetry integration
- Add comprehensive error reporting (Sentry)
- Create Grafana dashboards
- Improve logging with request correlation
- Add rate limiting and security hardening
Tasks
6.1 OpenTelemetry Enhancement
- Complete OpenTelemetry setup:
- Export traces to Jaeger/OTLP collector
- Add database instrumentation (Ent interceptor)
- Add Kafka instrumentation
- Add Redis instrumentation
- Create custom spans:
- Module initialization spans
- Background job spans
- Event publishing spans
- Add trace context propagation:
- Include trace ID in logs
- Propagate across HTTP calls
- Include in error reports
6.2 Error Reporting (Sentry)
- Install github.com/getsentry/sentry-go
- Integrate with error bus:
- Send errors to Sentry
- Include trace ID in Sentry events
- Add user context (user ID, email)
- Add module context (module name)
- Add Sentry middleware:
- Capture panics
- Capture HTTP errors (4xx, 5xx)
- Configure Sentry DSN via config
6.3 Logging Enhancements
- Add request correlation:
- Generate unique request ID per request
- Include in all logs
- Return in response headers (X-Request-ID)
- Add structured fields:
- user_id from context
- tenant_id from context
- module name for module logs
- trace_id from OpenTelemetry
- Create log aggregation config:
- JSON format for production
- Human-readable for development
- Support for Loki/CloudWatch/ELK
6.4 Prometheus Metrics Expansion
- Add more metrics:
- Database connection pool stats
- Cache hit/miss ratio
- Event bus publish/consume rates
- Background job execution times
- Module-specific metrics (via module interface)
- Create metric labels:
- module label for module metrics
- tenant_id label (if multi-tenant)
- status label for error rates
6.5 Grafana Dashboards
- Create ops/grafana/dashboards/:
  - platform-overview.json - Overall health
  - http-metrics.json - HTTP request metrics
  - database-metrics.json - Database performance
  - module-metrics.json - Per-module metrics
  - error-rates.json - Error tracking
- Document dashboard setup in docs/operations.md
6.6 Rate Limiting
- Install github.com/ulule/limiter/v3
- Create rate limit middleware:
  - Per-user rate limiting
  - Per-IP rate limiting
  - Configurable limits per endpoint
- Add rate limit config:

```yaml
rate_limiting:
  enabled: true
  per_user: 100/minute
  per_ip: 1000/minute
```

- Return X-RateLimit-* headers
6.7 Security Hardening
- Add security headers middleware:
  - X-Content-Type-Options: nosniff
  - X-Frame-Options: DENY
  - X-XSS-Protection: 1; mode=block
  - Strict-Transport-Security (if HTTPS)
  - Content-Security-Policy
- Add request size limits:
  - Max body size (10MB default)
  - Max header size
- Add input validation:
  - Use github.com/go-playground/validator
  - Validate all request bodies
  - Sanitize user inputs
- Add SQL injection protection:
  - Use parameterized queries (Ent already does this)
  - Add linter rule to prevent raw SQL
6.8 Performance Optimization
- Add database connection pooling:
- Configure max connections
- Configure idle timeout
- Monitor pool stats
- Add query optimization:
- Add indexes for common queries
- Use database query logging (development)
- Add slow query detection
- Add response compression:
- Gzip middleware for large responses
- Add caching strategy:
- Cache frequently accessed data (user permissions, roles)
Deliverables
- ✅ Full OpenTelemetry integration
- ✅ Sentry error reporting
- ✅ Enhanced logging with correlation
- ✅ Comprehensive Prometheus metrics
- ✅ Grafana dashboards
- ✅ Rate limiting
- ✅ Security hardening
- ✅ Performance optimizations
Acceptance Criteria
- Traces are exported and visible in Jaeger
- Errors are reported to Sentry with context
- Logs include request IDs and trace IDs
- Metrics are exposed and scraped by Prometheus
- Rate limiting prevents abuse
- Security headers are present
- Performance meets SLA (< 100ms p95 for auth endpoints)
Phase 7: Testing, Documentation & CI/CD (Week 8-9)
Objectives
- Comprehensive test coverage (unit, integration, contract)
- Complete documentation
- Production-ready CI/CD pipeline
- Docker images and deployment guides
Tasks
7.1 Unit Tests
- Achieve >80% code coverage for core modules:
  - Config loader
  - Logger
  - Auth service
  - Permission resolver
  - Module registry
- Use github.com/stretchr/testify for assertions
- Use github.com/golang/mock or mockery for mocks
- Add test helpers:
  - testutil.NewTestDB() - In-memory SQLite for tests
  - testutil.NewTestUser() - Create test user
  - testutil.NewTestContext() - Context with user
7.2 Integration Tests
- Install
github.com/testcontainers/testcontainers-go - Create integration test suite:
- Full HTTP request flow
- Database operations
- Event bus publishing/consuming
- Background job execution
- Test scenarios:
- User registration → login → API access
- Role assignment → permission check
- Module loading and initialization
- Multi-module interaction
- Create `docker-compose.test.yml`:
- PostgreSQL
- Redis
- Kafka (optional)
- Add test tags: `//go:build integration`
7.3 Contract Tests
- Install `github.com/pact-foundation/pact-go` (optional)
- Create API contract tests:
- Verify API responses match OpenAPI spec
- Test backward compatibility
- Use OpenAPI validator:
- Install `github.com/getkin/kin-openapi`
- Validate request/response against OpenAPI spec
- Generate OpenAPI spec from code annotations
7.4 Load Testing
- Create `perf/` directory with k6 scripts:
- `perf/auth-load.js` - Login endpoint load test
- `perf/api-load.js` - General API load test
- Document performance benchmarks:
- Request latency (p50, p95, p99)
- Throughput (requests/second)
- Resource usage (CPU, memory)
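
To make the latency figures above concrete, here is a small sketch of how p50/p95/p99 can be computed from raw samples and checked against the 100 ms p95 target. It uses the nearest-rank definition, which is an assumption: load-testing tools report percentiles themselves and may interpolate differently. `Percentile` and `MeetsSLA` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Percentile returns the p-th percentile (0 < p <= 100) of samples using
// the nearest-rank method. It returns 0 for an empty input and does not
// modify the caller's slice.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	// Nearest rank: the smallest rank covering at least p percent of samples.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// MeetsSLA reports whether the p95 latency is strictly below limitMs.
func MeetsSLA(samplesMs []float64, limitMs float64) bool {
	return len(samplesMs) > 0 && Percentile(samplesMs, 95) < limitMs
}

func main() {
	// Hypothetical request latencies in milliseconds.
	latencies := []float64{12, 18, 25, 31, 40, 44, 52, 67, 88, 140}
	for _, p := range []float64{50, 95, 99} {
		fmt.Printf("p%.0f = %.0f ms\n", p, Percentile(latencies, p))
	}
	fmt.Println("meets 100 ms p95 SLA:", MeetsSLA(latencies, 100))
}
```

With only ten samples a single slow request decides the p95, which is one reason to record the sample count next to each documented benchmark.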
7.5 Documentation
- Create `README.md`:
- Quick start guide
- Architecture overview
- Installation instructions
- Development setup
- Create `docs/architecture.md`:
- System architecture diagram
- Module system explanation
- Extension points
- Create `docs/extension-points.md`:
- How to create a module
- Permission system
- Event bus usage
- Background jobs
- Create `docs/api.md`:
- API endpoints documentation
- Authentication flow
- Error codes
- Create `docs/operations.md`:
- Deployment guide
- Monitoring setup
- Troubleshooting
- Grafana dashboards
- Add code examples:
- `examples/` directory with sample modules
- Code comments and godoc
7.6 CI/CD Pipeline Enhancement
- Update `.github/workflows/ci.yml`:
- Run unit tests with coverage
- Run integration tests (with testcontainers)
- Run linters (golangci-lint, gosec)
- Generate coverage report
- Upload artifacts
- Add release workflow:
- Semantic versioning
- Tag releases
- Build and push Docker images
- Generate changelog
- Add security scanning:
- `gosec` for security issues
- Dependabot for dependency updates
- Trivy for container scanning
7.7 Docker Images
- Create multi-stage `Dockerfile`:

```dockerfile
# Build stage
FROM golang:1.22-alpine AS builder
# ... build commands

# Runtime stage
FROM gcr.io/distroless/static-debian12
# ... copy binary
```

- Create `docker-compose.yml` for development:
- Platform service
- PostgreSQL
- Redis
- Kafka (optional)
- Create `docker-compose.prod.yml` for production
- Add health checks to Dockerfile
- Document Docker usage in `docs/deployment.md`
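
One consideration for the Dockerfile health check: the distroless static runtime image ships without a shell or `curl`, so the check cannot shell out. A possible approach, sketched below, is a subcommand on the platform binary that probes its own health endpoint. The `healthcheck` subcommand name, the `/healthz` path, and port `8080` are assumptions, and `probeStub` exists only to demonstrate the probe without a running platform.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"time"
)

// probe performs one GET and returns an error unless the status is 200.
func probe(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("health probe: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("health probe: unexpected status %d", resp.StatusCode)
	}
	return nil
}

// probeStub starts a throwaway server that answers with the given status
// and probes it. Demo helper only.
func probeStub(status int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	}))
	defer srv.Close()
	return probe(srv.URL+"/healthz", 2*time.Second)
}

func main() {
	// Hypothetical subcommand: exit non-zero when the service is unhealthy,
	// which is what a container health check evaluates.
	if len(os.Args) > 1 && os.Args[1] == "healthcheck" {
		if err := probe("http://127.0.0.1:8080/healthz", 2*time.Second); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		return
	}
	fmt.Println("healthy stub:", probeStub(http.StatusOK))
	fmt.Println("failing stub:", probeStub(http.StatusServiceUnavailable))
}
```

The Dockerfile could then use the exec form, for example `HEALTHCHECK CMD ["/platform", "healthcheck"]`, where `/platform` stands for wherever the binary is copied in the runtime stage.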
7.8 Deployment Guides
- Create `docs/deployment/kubernetes.md`:
- Kubernetes manifests
- Helm chart (optional)
- Service definitions
- ConfigMap and Secret management
- Create `docs/deployment/docker.md`:
- Docker Compose deployment
- Environment variables
- Volume mounts
- Create `docs/deployment/cloud.md`:
- AWS/GCP/Azure deployment notes
- Managed service integration
- Load balancer configuration
7.9 Developer Experience
- Create `Makefile` with common tasks:

```bash
make dev           # Start dev environment
make test          # Run tests
make lint          # Run linters
make generate      # Generate code
make docker-build  # Build Docker image
make migrate       # Run migrations
```

- Add development scripts:
- `scripts/dev.sh` - Start all services
- `scripts/test.sh` - Run test suite
- `scripts/seed.sh` - Seed test data
- Create `.env.example` with all config variables
- Add pre-commit hooks (optional):
- Run linter
- Run tests
- Check formatting
Deliverables
- ✅ >80% test coverage
- ✅ Integration test suite
- ✅ Complete documentation
- ✅ Production CI/CD pipeline
- ✅ Docker images and deployment guides
- ✅ Developer tooling and scripts
Acceptance Criteria
- All tests pass in CI
- Code coverage >80%
- Documentation is complete and accurate
- Docker images build and run successfully
- Deployment guides are tested
- New developers can set up environment in <30 minutes
Phase 8: Advanced Features & Polish (Week 9-10, Optional)
Objectives
- Add advanced features (OIDC, GraphQL, API Gateway)
- Performance optimization
- Additional sample modules
- Final polish and bug fixes
Tasks
8.1 OpenID Connect (OIDC) Support
- Install `github.com/coreos/go-oidc`
- Implement OIDC provider:
- Discovery endpoint
- JWKS endpoint
- Token endpoint
- UserInfo endpoint
- Add OIDC client support:
- Validate tokens from external IdP
- Map claims to internal user
- Document OIDC setup in `docs/auth.md`
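
The "map claims to internal user" step could take a shape like the sketch below. It assumes the ID token has already been signature-verified (for example by go-oidc's ID token verifier) and only covers turning the claims JSON into a local user. The `User` type, the `groups` claim, and the group-to-role table are illustrative assumptions, not part of the plan.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// IDTokenClaims holds the subset of claims this sketch consumes.
type IDTokenClaims struct {
	Subject       string   `json:"sub"`
	Email         string   `json:"email"`
	EmailVerified bool     `json:"email_verified"`
	Name          string   `json:"name"`
	Groups        []string `json:"groups"` // not a standard OIDC claim; assumed here
}

// User is a hypothetical internal user record.
type User struct {
	ExternalID  string
	Email       string
	DisplayName string
	Roles       []string
}

// groupToRole is an illustrative mapping from IdP groups to platform roles.
var groupToRole = map[string]string{
	"platform-admins":  "admin",
	"platform-editors": "editor",
}

// MapClaims converts verified claims JSON into an internal user. It
// rejects tokens without a subject or without a verified email.
func MapClaims(raw []byte) (*User, error) {
	var c IDTokenClaims
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, fmt.Errorf("decode claims: %w", err)
	}
	if c.Subject == "" {
		return nil, errors.New("missing sub claim")
	}
	if c.Email == "" || !c.EmailVerified {
		return nil, errors.New("email missing or not verified")
	}
	u := &User{
		ExternalID:  c.Subject,
		Email:       strings.ToLower(c.Email),
		DisplayName: c.Name,
		Roles:       []string{"user"}, // every mapped user gets the base role
	}
	for _, g := range c.Groups {
		if role, ok := groupToRole[g]; ok {
			u.Roles = append(u.Roles, role)
		}
	}
	return u, nil
}

func main() {
	raw := []byte(`{"sub":"idp|42","email":"Ada@Example.com","email_verified":true,
		"name":"Ada","groups":["platform-admins","unrelated"]}`)
	u, err := MapClaims(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *u)
}
```

Keying the internal account on `sub` rather than on email avoids merging accounts when an address is reassigned at the identity provider.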
8.2 GraphQL API (Optional)
- Install `github.com/99designs/gqlgen`
- Create GraphQL schema:
- User queries
- Blog queries
- Mutations
- Implement resolvers:
- Use existing services
- Add authorization checks
- Add GraphQL endpoint: `POST /graphql`
8.3 API Gateway Features
- Add request/response transformation
- Add API key authentication
- Add request routing rules
- Add API versioning support
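
As an illustration of the API key authentication item, the following sketch checks a request header against a set of known keys using a constant-time comparison. The `X-API-Key` header name, the in-memory key list, and the helper names are assumptions; a real deployment would more likely look keys up as hashes in storage and attach the key's owner to the request context.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// APIKeyAuth (hypothetical name) rejects requests whose X-API-Key header
// does not match one of validKeys.
func APIKeyAuth(validKeys []string, next http.Handler) http.Handler {
	// Hash the keys once so every comparison runs over equal-length input.
	hashes := make([][32]byte, len(validKeys))
	for i, k := range validKeys {
		hashes[i] = sha256.Sum256([]byte(k))
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := sha256.Sum256([]byte(r.Header.Get("X-API-Key")))
		match := 0
		for _, h := range hashes {
			// Check every key without an early exit to keep timing uniform.
			match |= subtle.ConstantTimeCompare(got[:], h[:])
		}
		if match != 1 {
			http.Error(w, "invalid API key", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one in-memory request with the given key and returns
// the resulting HTTP status code.
func statusFor(key string) int {
	h := APIKeyAuth([]string{"demo-key-1", "demo-key-2"},
		http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			fmt.Fprintln(w, "ok")
		}))
	req := httptest.NewRequest(http.MethodGet, "/api/v1/posts", nil)
	if key != "" {
		req.Header.Set("X-API-Key", key)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("valid key:  ", statusFor("demo-key-2"))
	fmt.Println("wrong key:  ", statusFor("nope"))
	fmt.Println("missing key:", statusFor(""))
}
```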
8.4 Additional Sample Modules
- Create `modules/notification/`:
- Email templates
- Notification preferences
- Notification history
- Create `modules/analytics/`:
- Event tracking
- Analytics dashboard API
- Export functionality
8.5 Performance Optimization
- Add database query caching
- Optimize N+1 queries
- Add response caching (Redis)
- Implement connection pooling optimizations
- Add database read replicas support
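
The N+1 item can be illustrated without a database. The sketch below loads the authors for a list of posts in one batched lookup instead of one lookup per post. The `Post` and `Author` types and the in-memory `authorStore` are stand-ins invented for the example; with Ent, the same effect would typically come from eager loading the edge in the query.

```go
package main

import "fmt"

// Post and Author are hypothetical domain types for this example.
type Post struct {
	ID       int
	AuthorID int
	Title    string
}

type Author struct {
	ID   int
	Name string
}

// authorStore is an in-memory stand-in for a database table. The queries
// counter records how many round trips were made.
type authorStore struct {
	rows    map[int]Author
	queries int
}

// GetByIDs models a single `WHERE id IN (...)` query.
func (s *authorStore) GetByIDs(ids []int) map[int]Author {
	s.queries++
	out := make(map[int]Author, len(ids))
	for _, id := range ids {
		if a, ok := s.rows[id]; ok {
			out[id] = a
		}
	}
	return out
}

// AuthorsForPosts collects the distinct author IDs and fetches them in
// one batch, so the query count no longer grows with len(posts).
func AuthorsForPosts(s *authorStore, posts []Post) map[int]Author {
	seen := make(map[int]struct{}, len(posts))
	ids := make([]int, 0, len(posts))
	for _, p := range posts {
		if _, dup := seen[p.AuthorID]; dup {
			continue
		}
		seen[p.AuthorID] = struct{}{}
		ids = append(ids, p.AuthorID)
	}
	return s.GetByIDs(ids)
}

// demo returns the number of queries issued and the author of post 3.
func demo() (queries int, authorOfThird string) {
	store := &authorStore{rows: map[int]Author{1: {1, "Ada"}, 2: {2, "Grace"}}}
	posts := []Post{{1, 1, "Intro"}, {2, 2, "Modules"}, {3, 1, "Events"}, {4, 2, "Jobs"}}
	authors := AuthorsForPosts(store, posts)
	return store.queries, authors[posts[2].AuthorID].Name
}

func main() {
	q, name := demo()
	fmt.Printf("queries=%d authorOfThirdPost=%s\n", q, name)
}
```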
8.6 Internationalization (i18n)
- Install i18n library
- Add locale detection:
- From Accept-Language header
- From user preferences
- Create message catalogs
- Add translation support for error messages
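
Locale detection from the two sources above can be sketched with the standard library alone. The precedence (stored user preference first, then `Accept-Language` by descending q-value, then a fallback) is an assumption about the intended behaviour, and the parser is deliberately simplified; `golang.org/x/text/language` provides a complete Accept-Language parser and matcher.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

type weighted struct {
	tag string
	q   float64
}

// parseAcceptLanguage returns language tags ordered by descending q-value.
// Wildcards, empty entries and q=0 entries are dropped.
func parseAcceptLanguage(header string) []weighted {
	var out []weighted
	for _, part := range strings.Split(header, ",") {
		fields := strings.Split(strings.TrimSpace(part), ";")
		tag := strings.ToLower(strings.TrimSpace(fields[0]))
		if tag == "" || tag == "*" {
			continue
		}
		q := 1.0 // a tag without an explicit weight counts as q=1
		for _, f := range fields[1:] {
			f = strings.TrimSpace(f)
			if strings.HasPrefix(f, "q=") {
				if v, err := strconv.ParseFloat(strings.TrimPrefix(f, "q="), 64); err == nil {
					q = v
				}
			}
		}
		if q > 0 {
			out = append(out, weighted{tag, q})
		}
	}
	// Stable sort keeps header order among equally weighted tags.
	sort.SliceStable(out, func(i, j int) bool { return out[i].q > out[j].q })
	return out
}

// match tries an exact match first, then the base language ("de-AT" -> "de").
func match(tag string, supported []string) (string, bool) {
	tag = strings.ToLower(tag)
	base := strings.SplitN(tag, "-", 2)[0]
	for _, candidate := range []string{tag, base} {
		for _, s := range supported {
			if strings.ToLower(s) == candidate {
				return s, true
			}
		}
	}
	return "", false
}

// DetectLocale (hypothetical name) picks the locale for a request.
func DetectLocale(userPref, acceptLanguage string, supported []string, fallback string) string {
	if userPref != "" {
		if loc, ok := match(userPref, supported); ok {
			return loc
		}
	}
	for _, w := range parseAcceptLanguage(acceptLanguage) {
		if loc, ok := match(w.tag, supported); ok {
			return loc
		}
	}
	return fallback
}

func main() {
	supported := []string{"en", "de"}
	fmt.Println(DetectLocale("", "fr-CH, fr;q=0.9, de;q=0.8, en;q=0.7", supported, "en"))
	fmt.Println(DetectLocale("en", "de", supported, "en"))
}
```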
8.7 Final Polish
- Code review and refactoring
- Bug fixes
- Performance profiling
- Security audit
- Documentation review
Deliverables
- ✅ OIDC support (optional)
- ✅ GraphQL API (optional)
- ✅ Additional sample modules
- ✅ Performance optimizations
- ✅ Final polish
Implementation Checklist Summary
Phase 0: Setup ✅
- Repository structure
- Configuration system
- Logging foundation
- Basic CI/CD
- DI setup
Phase 1: Core Kernel ✅
- DI container
- Database (Ent)
- Health & metrics
- Error bus
- HTTP server
- OpenTelemetry
Phase 2: Auth & Authorization ✅
- JWT authentication
- Identity management
- Roles & permissions
- Authorization middleware
- Audit logging
Phase 3: Module Framework ✅
- Module interface
- Static registry
- Permission generation
- Module loader
- Module initialization
Phase 4: Sample Module (Blog) ✅
- Blog module structure
- Domain model
- Repository & service
- API handlers
- Integration tests
Phase 5: Infrastructure ✅
- Cache (Redis)
- Event bus
- Blob storage
- Email notification
- Scheduler/jobs
- Multi-tenancy (optional)
Phase 6: Observability ✅
- OpenTelemetry
- Sentry integration
- Enhanced logging
- Prometheus metrics
- Grafana dashboards
- Rate limiting
- Security hardening
Phase 7: Testing & Docs ✅
- Unit tests (>80% coverage)
- Integration tests
- Documentation
- CI/CD pipeline
- Docker images
- Deployment guides
Phase 8: Advanced Features (Optional) ✅
- OIDC support
- GraphQL API
- Additional modules
- Performance optimization
Risk Mitigation
Technical Risks
| Risk | Impact | Mitigation |
|---|---|---|
| Circular import issues | High | Strict separation: interfaces in pkg/, implementations in internal/ |
| Plugin version mismatch | Medium | Prefer static registration; document version requirements |
| Database migration conflicts | Medium | Central migration orchestrator, dependency ordering |
| Performance bottlenecks | Low | Load testing in Phase 7, profiling, caching strategy |
| Security vulnerabilities | High | Security audit, gosec scanning, input validation |
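
The "dependency ordering" mitigation for migration conflicts amounts to a topological sort over module dependencies. The sketch below shows one way an orchestrator could derive a run order and detect cycles; the module names and the shape of the dependency map are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// MigrationOrder returns module names ordered so that every module comes
// after the modules it depends on. deps maps a module to its dependencies.
// Ties are broken alphabetically so the result is deterministic.
func MigrationOrder(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int)       // unmet dependencies per module
	dependents := make(map[string][]string) // reverse edges
	for mod, reqs := range deps {
		if _, ok := indegree[mod]; !ok {
			indegree[mod] = 0
		}
		for _, r := range reqs {
			if _, ok := indegree[r]; !ok {
				indegree[r] = 0
			}
			indegree[mod]++
			dependents[r] = append(dependents[r], mod)
		}
	}

	var ready []string
	for mod, n := range indegree {
		if n == 0 {
			ready = append(ready, mod)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		mod := ready[0]
		ready = ready[1:]
		order = append(order, mod)
		for _, d := range dependents[mod] {
			indegree[d]--
			if indegree[d] == 0 {
				ready = append(ready, d)
			}
		}
		sort.Strings(ready)
	}

	// Any module never released still has unmet dependencies: a cycle.
	if len(order) != len(indegree) {
		return nil, fmt.Errorf("dependency cycle detected among modules")
	}
	return order, nil
}

func main() {
	order, err := MigrationOrder(map[string][]string{
		"core":  nil,
		"auth":  {"core"},
		"audit": {"core"},
		"blog":  {"auth", "core"},
	})
	fmt.Println(order, err) // prints "[core audit auth blog] <nil>"
}
```

Failing fast on a cycle at startup is usually preferable to discovering it halfway through a migration run.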
Process Risks
| Risk | Impact | Mitigation |
|---|---|---|
| Scope creep | Medium | Stick to phased approach, defer optional features to Phase 8 |
| Incomplete documentation | Medium | Documentation as part of each phase, not afterthought |
| Testing gaps | High | Test coverage requirements, integration tests early |
Success Criteria
The platform is considered complete when:
- ✅ All core modules are implemented and tested
- ✅ Blog module serves as working reference
- ✅ Test coverage >80%
- ✅ Documentation is complete
- ✅ CI/CD pipeline is production-ready
- ✅ Docker images build and run
- ✅ Integration tests pass
- ✅ Security audit passes
- ✅ Performance meets SLA (p95 latency < 100 ms for auth endpoints)
- ✅ New developer can set up in <30 minutes
Next Steps After Implementation
- Gather Feedback: Share with team, collect requirements
- Iterate: Add features based on feedback
- Scale: Optimize for production load
- Extend: Add more modules as needed
- Community: Open source (if applicable), gather contributors
References
- playbook.md - General platform playbook
- playbook-golang.md - Go-specific playbook
- Go Modules Documentation
- Ent Documentation
- Uber FX Documentation
- OpenTelemetry Go
Document Version: 1.0
Status: Ready for Implementation