# Go Platform Implementation Plan

**"Plug‑in‑friendly SaaS/Enterprise Platform – Go Edition"**

> This document outlines a complete, phased implementation plan for building the Go platform boilerplate based on the requirements from `playbook.md` and `playbook-golang.md`.

---

## Executive Summary

This plan breaks down the implementation into **8 phases**, each with specific deliverables and acceptance criteria. The approach prioritizes building a solid foundation (core kernel) before adding feature modules and advanced capabilities.

**Total Estimated Timeline:** 8-12 weeks (depending on team size and parallelization)

**Key Principles:**

- **Clean/Hexagonal Architecture** with clear separation between `pkg/` (interfaces) and `internal/` (implementations)
- **Dependency Injection** using `uber-go/fx` for lifecycle management
- **Modular Monolith** design that can evolve into microservices
- **Plugin-first** architecture supporting both static and dynamic module loading
- **Security-by-Design** with JWT auth, RBAC/ABAC, and audit logging
- **Observability** via OpenTelemetry, Prometheus, and structured logging

---

## Phase 0: Project Setup & Foundation (Week 1)

### Objectives

- Initialize repository structure
- Set up Go modules and basic tooling
- Create configuration management foundation
- Establish CI/CD skeleton

### Tasks

#### 0.1 Repository Bootstrap

- [ ] Initialize Go module: `go mod init github.com/yourorg/platform`
- [ ] Create directory structure:
  ```
  platform/
  ├── cmd/
  │   └── platform/          # Main entry point
  ├── internal/              # Private implementation code
  │   ├── di/                # Dependency injection container
  │   ├── registry/          # Module registry
  │   ├── pluginloader/      # Plugin loader (optional)
  │   └── infra/             # Infrastructure adapters
  ├── pkg/                   # Public interfaces (exported)
  │   ├── config/            # ConfigProvider interface
  │   ├── logger/            # Logger interface
  │   ├── module/            # IModule interface
  │   ├── auth/              # Auth interfaces
  │   ├── perm/              # Permission DSL
  │   └── infra/             # Infrastructure interfaces
  ├── modules/               # Feature modules
  │   └── blog/              # Sample Blog module (Phase 4)
  ├── config/                # Configuration files
  │   ├── default.yaml
  │   ├── development.yaml
  │   └── production.yaml
  ├── api/                   # OpenAPI specs
  ├── scripts/               # Build/test scripts
  ├── docs/                  # Documentation
  ├── ops/                   # Operations (Grafana dashboards, etc.)
  ├── .github/
  │   └── workflows/
  │       └── ci.yml
  ├── Dockerfile
  ├── docker-compose.yml
  ├── docker-compose.test.yml
  └── go.mod
  ```
- [ ] Add `.gitignore` for Go projects
- [ ] Create initial `README.md` with project overview

#### 0.2 Configuration System

- [ ] Install `github.com/spf13/viper` and `github.com/spf13/cobra`
- [ ] Create `pkg/config/config.go` interface:
  ```go
  type ConfigProvider interface {
      Get(key string) any
      Unmarshal(v any) error
      GetString(key string) string
      GetInt(key string) int
      GetBool(key string) bool
  }
  ```
- [ ] Implement `internal/config/config.go` using Viper:
  - Load `config/default.yaml` as baseline
  - Merge environment-specific YAML (development/production)
  - Apply environment variable overrides
  - Support secret manager integration (placeholder for Phase 6)
- [ ] Create `config/default.yaml` with basic structure:
  ```yaml
  environment: development
  server:
    port: 8080
    host: "0.0.0.0"
  database:
    driver: "postgres"
    dsn: ""
  logging:
    level: "info"
    format: "json"
  ```
- [ ] Add `internal/config/loader.go` with `LoadConfig()` function

#### 0.3 Logging Foundation

- [ ] Install `go.uber.org/zap`
- [ ] Create `pkg/logger/logger.go` interface:
  ```go
  type Logger interface {
      Debug(msg string, fields ...Field)
      Info(msg string, fields ...Field)
      Warn(msg string, fields ...Field)
      Error(msg string, fields ...Field)
      With(fields ...Field) Logger
  }
  ```
- [ ] Implement `internal/logger/zap_logger.go`:
  - Structured JSON logging
  - Configurable log levels
  - Request-scoped fields support
  - Export global logger via `pkg/logger`
- [ ] Add request ID middleware helper (Gin middleware)

#### 0.4 Basic CI/CD Pipeline

- [ ] Create `.github/workflows/ci.yml`:
  - Go 1.22 setup
  - Module caching
  - Linting (golangci-lint or staticcheck)
  - Unit tests (basic skeleton)
  - Build binary
- [ ] Add `Makefile` with common commands:
  - `make test` - run tests
  - `make lint` - run linter
  - `make build` - build binary
  - `make docker-build` - build Docker image

#### 0.5 Dependency Injection Setup

- [ ] Install `go.uber.org/fx`
- [ ] Create `internal/di/container.go`:
  - Initialize fx container
  - Register Config and Logger providers
  - Basic lifecycle hooks
- [ ] Create `cmd/platform/main.go` skeleton:
  - Load config
  - Initialize DI container
  - Start minimal HTTP server (placeholder)

### Deliverables

- ✅ Repository structure in place
- ✅ Configuration system loads YAML files and env vars
- ✅ Structured logging works
- ✅ CI pipeline runs linting and builds binary
- ✅ Basic DI container initialized

### Acceptance Criteria

- `go build ./cmd/platform` succeeds
- `go test ./...` runs (even if tests are empty)
- CI pipeline passes on empty commit
- Config loads from `config/default.yaml`

---

## Phase 1: Core Kernel & Infrastructure (Week 2-3)

### Objectives

- Implement dependency injection container
- Set up database (Ent ORM)
- Create health and metrics endpoints
- Implement error bus
- Add basic HTTP server with middleware

### Tasks

#### 1.1 Dependency Injection Container

- [ ] Extend `internal/di/container.go`:
  - Register all core services
  - Provide lifecycle management via fx
  - Support service overrides
- [ ] Create `internal/di/providers.go`:
  - `ProvideConfig()` - config provider
  - `ProvideLogger()` - logger
  - `ProvideDatabase()` - Ent client (after 1.2)
  - `ProvideHealthCheckers()` - health check registry
  - `ProvideMetrics()` - Prometheus registry
  - `ProvideErrorBus()` - error bus
- [ ] Add `internal/di/core_module.go`:
  - Export `CoreModule` fx.Option that provides all core services

#### 1.2 Database Setup (Ent)

- [ ] Install `entgo.io/ent/cmd/ent`
- [ ] Initialize Ent schema (`ent new` replaces the deprecated `ent init`; `--target` places the schema under `internal/ent/schema/`):
  ```bash
  go run -mod=mod entgo.io/ent/cmd/ent new --target internal/ent/schema User Role Permission AuditLog
  ```
- [ ] Define core entities in `internal/ent/schema/`:
  - `user.go`: ID, email, password_hash, verified, created_at, updated_at
  - `role.go`: ID, name, description, created_at
  - `permission.go`: ID, name (string format: "module.resource.action")
  - `audit_log.go`: ID, actor_id, action, target_id, metadata (JSON), timestamp
  - `role_permissions.go`: Many-to-many relationship
  - `user_roles.go`: Many-to-many relationship
- [ ] Generate Ent code: `go generate ./internal/ent`
- [ ] Create `internal/infra/database/client.go`:
  - `NewEntClient(dsn string) (*ent.Client, error)`
  - Connection pooling configuration
  - Migration runner wrapper
- [ ] Add database config to `config/default.yaml`

#### 1.3 Health & Metrics

- [ ] Install `github.com/prometheus/client_golang/prometheus`
- [ ] Install `github.com/heptiolabs/healthcheck` (optional, or custom)
- [ ] Create `pkg/health/health.go` interface:
  ```go
  type HealthChecker interface {
      Check(ctx context.Context) error
  }
  ```
- [ ] Implement `internal/health/registry.go`:
  - Registry of health checkers
  - `/healthz` endpoint (liveness)
  - `/ready` endpoint (readiness with DB check)
- [ ] Create `internal/metrics/metrics.go`:
  - HTTP request duration histogram
  - HTTP request counter
  - Database query duration (via Ent interceptor)
  - Error counter
- [ ] Add `/metrics` endpoint (Prometheus format)
- [ ] Register endpoints in main HTTP router

#### 1.4 Error Bus

- [ ] Create `pkg/errorbus/errorbus.go` interface:
  ```go
  type ErrorPublisher interface {
      Publish(err error)
  }
  ```
- [ ] Implement `internal/errorbus/channel_bus.go`:
  - Channel-based error bus
  - Background goroutine consumes errors
  - Log all errors
  - Optional: Sentry integration (Phase 6)
- [ ] Add panic recovery middleware that publishes to error bus
- [ ] Register error bus in DI container

#### 1.5 HTTP Server Foundation

- [ ] Install `github.com/gin-gonic/gin`
- [ ] Create `internal/server/server.go`:
  - Initialize Gin router
  - Add middleware:
    - Request ID generator
    - Structured logging
    - Panic recovery → error bus
    - Prometheus metrics
    - CORS (configurable)
  - Register core routes:
    - `GET /healthz`
    - `GET /ready`
    - `GET /metrics`
- [ ] Wire HTTP server into fx lifecycle:
  - Start on `OnStart`
  - Graceful shutdown on `OnStop`
- [ ] Update `cmd/platform/main.go` to use fx lifecycle

#### 1.6 OpenTelemetry Setup

- [ ] Install OpenTelemetry packages:
  - `go.opentelemetry.io/otel`
  - `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp`
- [ ] Create `internal/observability/tracer.go`:
  - Initialize OTEL TracerProvider
  - Export to stdout (development) or OTLP (production)
- [ ] Add HTTP instrumentation middleware
- [ ] Add trace context propagation to requests

### Deliverables

- ✅ DI container with all core services
- ✅ Database client with Ent schema
- ✅ Health and metrics endpoints functional
- ✅ Error bus captures and logs errors
- ✅ HTTP server with middleware stack
- ✅ Basic observability with OpenTelemetry

### Acceptance Criteria

- `GET /healthz` returns 200
- `GET /ready` checks DB connectivity
- `GET /metrics` exposes Prometheus metrics
- Panic recovery logs errors via error bus
- Database migrations run on startup
- HTTP requests are traced with OpenTelemetry

---

## Phase 2: Authentication & Authorization (Week 3-4)

### Objectives

- Implement JWT authentication
- Create identity management (User CRUD)
- Build role and permission system
- Add authorization middleware
- Implement audit logging

### Tasks

#### 2.1 Authentication (JWT)

- [ ] Install `github.com/golang-jwt/jwt/v5`
- [ ] Create `pkg/auth/auth.go` interfaces:
  ```go
  type Authenticator interface {
      GenerateToken(userID string, roles []string, tenantID string) (string, error)
      VerifyToken(token string) (*TokenClaims, error)
  }

  type TokenClaims struct {
      UserID    string
      Roles     []string
      TenantID  string
      ExpiresAt time.Time
  }
  ```
- [ ] Implement `internal/auth/jwt_auth.go`:
  - Generate access tokens (short-lived, 15min)
  - Generate refresh tokens (long-lived, 7 days)
  - Verify token signature and expiration
  - Extract claims
- [ ] Create `internal/auth/middleware.go`:
  - Extract JWT from `Authorization: Bearer <token>` header
  - Verify token
  - Inject `User` into `context.Context`
  - Helper: `auth.FromContext(ctx) *User`
- [ ] Add login endpoint: `POST /api/v1/auth/login`
  - Validate credentials
  - Return access + refresh tokens
- [ ] Add refresh endpoint: `POST /api/v1/auth/refresh`
  - Validate refresh token
  - Issue new access token

#### 2.2 Identity Management

- [ ] Create `pkg/identity/identity.go` interfaces:
  ```go
  type UserRepository interface {
      FindByID(ctx context.Context, id string) (*User, error)
      FindByEmail(ctx context.Context, email string) (*User, error)
      Create(ctx context.Context, u *User) error
      Update(ctx context.Context, u *User) error
      Delete(ctx context.Context, id string) error
  }

  type UserService interface {
      Register(ctx context.Context, email, password string) (*User, error)
      VerifyEmail(ctx context.Context, token string) error
      ResetPassword(ctx context.Context, email string) error
      ChangePassword(ctx context.Context, userID, oldPassword, newPassword string) error
  }
  ```
- [ ] Implement `internal/identity/user_repo.go` using Ent:
  - CRUD operations
  - Password hashing (bcrypt or argon2)
  - Email verification flow
- [ ] Implement `internal/identity/user_service.go`:
  - User registration with email verification
  - Password reset flow (token-based)
  - Password change
  - Email verification
- [ ] Add endpoints:
  - `POST /api/v1/users` - Register
  - `GET /api/v1/users/:id` - Get user
  - `PUT /api/v1/users/:id` - Update user
  - `POST /api/v1/users/verify-email` - Verify email
  - `POST /api/v1/users/reset-password` - Request reset
  - `POST /api/v1/users/change-password` - Change password

#### 2.3 Roles & Permissions

- [ ] Create `pkg/perm/perm.go`:
  ```go
  type Permission string

  // Core permissions
  var (
      SystemHealthCheck Permission = "system.health.check"
      UserCreate        Permission = "user.create"
      UserRead          Permission = "user.read"
      UserUpdate        Permission = "user.update"
      UserDelete        Permission = "user.delete"
      RoleCreate        Permission = "role.create"
      RoleRead          Permission = "role.read"
      RoleUpdate        Permission = "role.update"
      RoleDelete        Permission = "role.delete"
  )
  ```
- [ ] Create `pkg/perm/resolver.go` interface:
  ```go
  type PermissionResolver interface {
      HasPermission(ctx context.Context, userID string, perm Permission) (bool, error)
      GetUserPermissions(ctx context.Context, userID string) ([]Permission, error)
  }
  ```
- [ ] Implement `internal/perm/in_memory_resolver.go`:
  - Load user roles from DB
  - Load role permissions from DB
  - Check if user has specific permission
  - Cache permission lookups (optional)
- [ ] Create `pkg/auth/authz.go` interface:
  ```go
  // Permission lives in pkg/perm, so it is package-qualified here.
  type Authorizer interface {
      Authorize(ctx context.Context, p perm.Permission) error
  }
  ```
- [ ] Implement `internal/auth/rbac_authorizer.go`:
  - Extract user from context
  - Check permission via PermissionResolver
  - Return error if unauthorized
- [ ] Create authorization middleware:
  - Decorator pattern: `RequirePermission(p perm.Permission) gin.HandlerFunc`
  - Use with route registration

#### 2.4 Role Management API

- [ ] Create `internal/identity/role_repo.go`:
  - CRUD for roles
  - Assign permissions to roles
  - Assign roles to users
- [ ] Add endpoints:
  - `POST /api/v1/roles` - Create role
  - `GET /api/v1/roles` - List roles
  - `GET /api/v1/roles/:id` - Get role
  - `PUT /api/v1/roles/:id` - Update role
  - `DELETE /api/v1/roles/:id` - Delete role
  - `POST /api/v1/roles/:id/permissions` - Assign permissions
  - `POST /api/v1/users/:id/roles` - Assign roles to user

#### 2.5 Audit Logging

- [ ] Create `pkg/audit/audit.go` interface:
  ```go
  type Auditor interface {
      Record(ctx context.Context, act AuditAction) error
  }

  type AuditAction struct {
      ActorID  string
      Action   string
      TargetID string
      Metadata map[string]any
  }
  ```
- [ ] Implement `internal/audit/ent_auditor.go`:
  - Write to `audit_log` table
  - Capture actor from context
  - Include request ID, IP address, user agent
- [ ] Add audit middleware:
  - Intercept all authenticated requests
  - Record action (method + path)
  - Store in audit log
- [ ] Integrate with auth endpoints:
  - Log login attempts (success/failure)
  - Log password changes
  - Log role assignments

#### 2.6 Seed Data

- [ ] Create `internal/seed/seed.go`:
  - Create default admin user (if it doesn't exist)
  - Create default roles (admin, user, guest)
  - Assign permissions to roles
  - Script: `go run cmd/seed/main.go`

### Deliverables

- ✅ JWT authentication with access/refresh tokens
- ✅ User CRUD with email verification
- ✅ Role and permission management
- ✅ Authorization middleware
- ✅ Audit logging for all actions
- ✅ Seed script for initial data

### Acceptance Criteria

- User can register and login
- JWT tokens are validated on protected routes
- Users without permission get 403
- All actions are logged in audit table
- Admin can create roles and assign permissions
- Integration test: user without permission cannot access protected resource

---

## Phase 3: Module Framework (Week 4-5)

### Objectives

- Define module interface and registration system
- Implement static module registry
- Create permission code generation tool
- Build module loader (support both static and plugin modes)
- Add module discovery and initialization

### Tasks

#### 3.1 Module Interface

- [ ] Create `pkg/module/module.go`:
  ```go
  type IModule interface {
      Name() string
      Version() string
      Dependencies() []string
      Init() fx.Option
      Migrations() []func(*ent.Client) error
  }
  ```
- [ ] Create `pkg/module/manifest.go`:
  ```go
  type Manifest struct {
      Name         string
      Version      string
      Dependencies []string
      Permissions  []string
      Routes       []Route
  }
  ```
- [ ] Define `module.yaml` schema (used for code generation)

#### 3.2 Static Module Registry

- [ ] Create `internal/registry/registry.go`:
  - Thread-safe module map
  - `Register(m IModule)` function
  - `All() []IModule` function
  - `Get(name string) (IModule, error)` function
- [ ] Add registration validation:
  - Check dependencies are satisfied
  - Check for duplicate names
  - Validate version compatibility
#### 3.3 Permission Code Generation - [ ] Create `scripts/generate-permissions.go`: - Scan all `modules/*/module.yaml` files - Extract permissions from manifests - Generate `pkg/perm/generated.go`: ```go // Code generated by generate-permissions. DO NOT EDIT. var ( BlogPostCreate Permission = "blog.post.create" BlogPostRead Permission = "blog.post.read" // ... ) ``` - [ ] Add `//go:generate` directive to `pkg/perm/perm.go` - [ ] Update `Makefile` with `make generate` command #### 3.4 Module Loader - [ ] Create `internal/pluginloader/loader.go`: - Support static registration (preferred) - Optional: support Go plugin loading (`.so` files) - Scan `modules/*/module.yaml` for discovery - Load modules in dependency order - [ ] Implement `internal/pluginloader/static_loader.go`: - Import modules via `import _ "github.com/yourorg/blog"` (side-effect registration) - Collect all registered modules - [ ] Implement `internal/pluginloader/plugin_loader.go` (optional): - Scan `./plugins/*.so` - Load via `plugin.Open()` - Extract `Module` symbol - Validate version compatibility #### 3.5 Module Initialization - [ ] Create `internal/module/initializer.go`: - Collect all registered modules - Resolve dependency order (topological sort) - Initialize each module's `Init()` fx.Option - Merge all options into main fx container - [ ] Run migrations: - Collect all module migrations - Run core migrations first - Run module migrations in dependency order - Handle migration errors gracefully #### 3.6 Module Lifecycle Hooks - [ ] Extend `pkg/module/module.go`: ```go type IModule interface { // ... 
existing methods OnStart(ctx context.Context) error // Optional OnStop(ctx context.Context) error // Optional } ``` - [ ] Integrate with fx.Lifecycle: - Call `OnStart` during app startup - Call `OnStop` during graceful shutdown #### 3.7 Module CLI Tool - [ ] Create `cmd/platformctl/main.go`: - `platformctl modules list` - List all loaded modules - `platformctl modules validate` - Validate module dependencies - `platformctl modules test ` - Test module loading - [ ] Add to `Makefile`: `make install-cli` ### Deliverables - ✅ Module interface and registration system - ✅ Static module registry working - ✅ Permission code generation tool - ✅ Module loader with dependency resolution - ✅ Module initialization in main app - ✅ CLI tool for module management ### Acceptance Criteria - Modules can register via `registry.Register()` - Permission constants are generated from `module.yaml` - Modules load in correct dependency order - Module migrations run on startup - `platformctl modules list` shows all modules - Integration test: load multiple modules and verify initialization --- ## Phase 4: Sample Feature Module (Blog) (Week 5-6) ### Objectives - Create a complete sample module (Blog) to demonstrate the framework - Show how to add routes, permissions, database entities, and services - Provide reference implementation for future developers ### Tasks #### 4.1 Blog Module Structure - [ ] Create `modules/blog/` directory: ``` modules/blog/ ├── go.mod ├── module.yaml ├── internal/ │ ├── api/ │ │ └── handler.go │ ├── domain/ │ │ ├── post.go │ │ └── post_repo.go │ └── service/ │ └── post_service.go └── pkg/ └── module.go ``` - [ ] Initialize `go.mod`: ```bash cd modules/blog go mod init github.com/yourorg/blog ``` #### 4.2 Module Manifest - [ ] Create `modules/blog/module.yaml`: ```yaml name: blog version: 0.1.0 dependencies: - core >= 1.0.0 permissions: - blog.post.create - blog.post.read - blog.post.update - blog.post.delete routes: - method: POST path: /api/v1/blog/posts 
permission: blog.post.create - method: GET path: /api/v1/blog/posts/:id permission: blog.post.read - method: PUT path: /api/v1/blog/posts/:id permission: blog.post.update - method: DELETE path: /api/v1/blog/posts/:id permission: blog.post.delete - method: GET path: /api/v1/blog/posts permission: blog.post.read ``` #### 4.3 Blog Domain Model - [ ] Create `modules/blog/internal/domain/post.go`: ```go type Post struct { ID string Title string Content string AuthorID string CreatedAt time.Time UpdatedAt time.Time } ``` - [ ] Create Ent schema `modules/blog/internal/ent/schema/post.go`: - Fields: title, content, author_id (FK to user) - Indexes: author_id, created_at - [ ] Generate Ent code for blog module #### 4.4 Blog Repository - [ ] Create `modules/blog/internal/domain/post_repo.go`: ```go type PostRepository interface { Create(ctx context.Context, p *Post) (*Post, error) FindByID(ctx context.Context, id string) (*Post, error) FindByAuthor(ctx context.Context, authorID string) ([]*Post, error) Update(ctx context.Context, p *Post) error Delete(ctx context.Context, id string) error } ``` - [ ] Implement using Ent client (shared from core) #### 4.5 Blog Service - [ ] Create `modules/blog/internal/service/post_service.go`: - Business logic for creating/updating posts - Validation (title length, content requirements) - Authorization checks (author can only update own posts) - Integration with audit system #### 4.6 Blog API Handlers - [ ] Create `modules/blog/internal/api/handler.go`: - `POST /api/v1/blog/posts` - Create post - `GET /api/v1/blog/posts/:id` - Get post - `GET /api/v1/blog/posts` - List posts (with pagination) - `PUT /api/v1/blog/posts/:id` - Update post - `DELETE /api/v1/blog/posts/:id` - Delete post - [ ] Use authorization middleware: ```go grp.Use(auth.RequirePermission(perm.BlogPostCreate)) ``` - [ ] Register handlers in module's `Init()` #### 4.7 Blog Module Implementation - [ ] Create `modules/blog/pkg/module.go`: ```go type BlogModule struct{} func (b 
BlogModule) Name() string { return "blog" } func (b BlogModule) Version() string { return "0.1.0" } func (b BlogModule) Dependencies() []string { return nil } func (b BlogModule) Init() fx.Option { return fx.Options( fx.Provide(NewPostRepo), fx.Provide(NewPostService), fx.Invoke(RegisterHandlers), ) } func (b BlogModule) Migrations() []func(*ent.Client) error { return []func(*ent.Client) error{ func(c *ent.Client) error { return c.Schema.Create(context.Background()) }, } } var Module BlogModule func init() { registry.Register(Module) } ``` #### 4.8 Integration - [ ] Update main `go.mod` to include blog module: ```go replace github.com/yourorg/blog => ./modules/blog ``` - [ ] Import blog module in `cmd/platform/main.go`: ```go import _ "github.com/yourorg/blog/pkg" ``` - [ ] Run permission generation: `make generate` - [ ] Verify blog permissions are generated #### 4.9 Tests - [ ] Create integration test `modules/blog/internal/api/handler_test.go`: - Test creating post with valid permission - Test creating post without permission (403) - Test updating own post vs other's post - Test pagination - [ ] Add unit tests for service and repository ### Deliverables - ✅ Complete Blog module with CRUD operations - ✅ Module registered and loaded by core - ✅ Permissions generated and used - ✅ Routes protected with authorization - ✅ Database migrations run - ✅ Integration tests passing ### Acceptance Criteria - Blog module loads on platform startup - `POST /api/v1/blog/posts` requires `blog.post.create` permission - User can create, read, update, delete posts - Authorization enforced (users can only edit own posts) - Integration test: full CRUD flow works - Audit logs record all blog actions --- ## Phase 5: Infrastructure Adapters (Week 6-7) ### Objectives - Implement infrastructure adapters (cache, queue, blob storage, email) - Make adapters swappable via interfaces - Add scheduler/background jobs system - Implement event bus (in-process and Kafka) ### Tasks #### 5.1 Cache 
(Redis) - [ ] Install `github.com/redis/go-redis/v9` - [ ] Create `pkg/infra/cache/cache.go` interface: ```go type Cache interface { Get(ctx context.Context, key string) ([]byte, error) Set(ctx context.Context, key string, value []byte, ttl time.Duration) error Delete(ctx context.Context, key string) error } ``` - [ ] Implement `internal/infra/cache/redis_cache.go` - [ ] Add Redis config to `config/default.yaml` - [ ] Register in DI container - [ ] Add cache middleware for selected routes (optional) #### 5.2 Event Bus - [ ] Create `pkg/eventbus/eventbus.go` interface: ```go type EventBus interface { Publish(ctx context.Context, topic string, event Event) error Subscribe(topic string, handler EventHandler) error } ``` - [ ] Implement `internal/infra/bus/inprocess_bus.go`: - Channel-based in-process bus - Used for testing and development - [ ] Implement `internal/infra/bus/kafka_bus.go`: - Install `github.com/segmentio/kafka-go` - Producer for publishing - Consumer groups for subscribing - Error handling and retries - [ ] Add Kafka config to `config/default.yaml` - [ ] Register bus in DI container (switchable via config) - [ ] Add core events: - `platform.user.created` - `platform.user.updated` - `platform.role.assigned` - `platform.permission.granted` #### 5.3 Blob Storage - [ ] Install `github.com/aws/aws-sdk-go-v2/service/s3` - [ ] Create `pkg/infra/blob/blob.go` interface: ```go type BlobStore interface { Upload(ctx context.Context, key string, data []byte) error Download(ctx context.Context, key string) ([]byte, error) Delete(ctx context.Context, key string) error GetSignedURL(ctx context.Context, key string, ttl time.Duration) (string, error) } ``` - [ ] Implement `internal/infra/blob/s3_store.go` - [ ] Add S3 config to `config/default.yaml` - [ ] Register in DI container - [ ] Add file upload endpoint: `POST /api/v1/files/upload` #### 5.4 Email Notification - [ ] Install `github.com/go-mail/mail` - [ ] Create `pkg/notification/notification.go` interface: ```go 
type Notifier interface { SendEmail(ctx context.Context, to, subject, body string) error SendSMS(ctx context.Context, to, message string) error } ``` - [ ] Implement `internal/infra/email/smtp_notifier.go`: - SMTP configuration - HTML email support - Templates for common emails (verification, password reset) - [ ] Add email config to `config/default.yaml` - [ ] Integrate with identity service: - Send verification email on registration - Send password reset email - [ ] Register in DI container #### 5.5 Scheduler & Background Jobs - [ ] Install `github.com/robfig/cron/v3` and `github.com/hibiken/asynq` - [ ] Create `pkg/scheduler/scheduler.go` interface: ```go type Scheduler interface { Cron(spec string, job JobFunc) error Enqueue(queue string, payload any) error } ``` - [ ] Implement `internal/infra/scheduler/asynq_scheduler.go`: - Redis-backed job queue - Cron jobs for periodic tasks - Job retries and backoff - Job status tracking - [ ] Create `internal/infra/scheduler/job_registry.go`: - Register jobs from modules - Start job processor on app startup - [ ] Add example jobs: - Cleanup expired tokens (daily) - Send digest emails (weekly) - [ ] Add job monitoring endpoint: `GET /api/v1/jobs/status` #### 5.6 Secret Store Integration - [ ] Create `pkg/infra/secret/secret.go` interface: ```go type SecretStore interface { GetSecret(ctx context.Context, key string) (string, error) } ``` - [ ] Implement `internal/infra/secret/vault_store.go` (HashiCorp Vault): - Install `github.com/hashicorp/vault/api` - Support KV v2 secrets - [ ] Implement `internal/infra/secret/aws_secrets.go` (AWS Secrets Manager): - Install `github.com/aws/aws-sdk-go-v2/service/secretsmanager` - [ ] Integrate with config loader: - Overlay secrets on top of file/env config - Load secrets lazily (cache) - [ ] Register in DI container (optional, via config) #### 5.7 Multi-tenancy Support (Optional) - [ ] Create `pkg/tenant/tenant.go` interface: ```go type TenantResolver interface { Resolve(ctx 
context.Context) (string, error) } ``` - [ ] Implement `internal/tenant/resolver.go`: - Extract from header: `X-Tenant-ID` - Extract from subdomain - Extract from JWT claim - [ ] Add tenant middleware: - Resolve tenant ID - Inject into context - Helper: `tenant.FromContext(ctx) string` - [ ] Update Ent queries to filter by tenant_id: - Add interceptor to Ent client - Automatically add `WHERE tenant_id = ?` to queries - [ ] Update User entity to include tenant_id ### Deliverables - ✅ Cache adapter (Redis) working - ✅ Event bus (in-process and Kafka) functional - ✅ Blob storage (S3) adapter - ✅ Email notification system - ✅ Scheduler and background jobs - ✅ Secret store integration (optional) - ✅ Multi-tenancy support (optional) ### Acceptance Criteria - Cache stores and retrieves data correctly - Events are published and consumed - Files can be uploaded and downloaded - Email notifications are sent - Background jobs run on schedule - Integration test: full infrastructure stack works --- ## Phase 6: Observability & Production Readiness (Week 7-8) ### Objectives - Enhance observability with full OpenTelemetry integration - Add comprehensive error reporting (Sentry) - Create Grafana dashboards - Improve logging with request correlation - Add rate limiting and security hardening ### Tasks #### 6.1 OpenTelemetry Enhancement - [ ] Complete OpenTelemetry setup: - Export traces to Jaeger/OTLP collector - Add database instrumentation (Ent interceptor) - Add Kafka instrumentation - Add Redis instrumentation - [ ] Create custom spans: - Module initialization spans - Background job spans - Event publishing spans - [ ] Add trace context propagation: - Include trace ID in logs - Propagate across HTTP calls - Include in error reports #### 6.2 Error Reporting (Sentry) - [ ] Install `github.com/getsentry/sentry-go` - [ ] Integrate with error bus: - Send errors to Sentry - Include trace ID in Sentry events - Add user context (user ID, email) - Add module context (module name) - [ ] 
Add Sentry middleware: - Capture panics - Capture HTTP errors (4xx, 5xx) - [ ] Configure Sentry DSN via config #### 6.3 Logging Enhancements - [ ] Add request correlation: - Generate unique request ID per request - Include in all logs - Return in response headers (`X-Request-ID`) - [ ] Add structured fields: - `user_id` from context - `tenant_id` from context - `module` name for module logs - `trace_id` from OpenTelemetry - [ ] Create log aggregation config: - JSON format for production - Human-readable for development - Support for Loki/CloudWatch/ELK #### 6.4 Prometheus Metrics Expansion - [ ] Add more metrics: - Database connection pool stats - Cache hit/miss ratio - Event bus publish/consume rates - Background job execution times - Module-specific metrics (via module interface) - [ ] Create metric labels: - `module` label for module metrics - `tenant_id` label (if multi-tenant) - `status` label for error rates #### 6.5 Grafana Dashboards - [ ] Create `ops/grafana/dashboards/`: - `platform-overview.json` - Overall health - `http-metrics.json` - HTTP request metrics - `database-metrics.json` - Database performance - `module-metrics.json` - Per-module metrics - `error-rates.json` - Error tracking - [ ] Document dashboard setup in `docs/operations.md` #### 6.6 Rate Limiting - [ ] Install `github.com/ulule/limiter/v3` - [ ] Create rate limit middleware: - Per-user rate limiting - Per-IP rate limiting - Configurable limits per endpoint - [ ] Add rate limit config: ```yaml rate_limiting: enabled: true per_user: 100/minute per_ip: 1000/minute ``` - [ ] Return `X-RateLimit-*` headers #### 6.7 Security Hardening - [ ] Add security headers middleware: - `X-Content-Type-Options: nosniff` - `X-Frame-Options: DENY` - `X-XSS-Protection: 1; mode=block` - `Strict-Transport-Security` (if HTTPS) - `Content-Security-Policy` - [ ] Add request size limits: - Max body size (10MB default) - Max header size - [ ] Add input validation: - Use `github.com/go-playground/validator` - 
  - Validate all request bodies
  - Sanitize user inputs
- [ ] Add SQL injection protection:
  - Use parameterized queries (Ent already does this)
  - Add linter rule to prevent raw SQL

#### 6.8 Performance Optimization
- [ ] Add database connection pooling:
  - Configure max connections
  - Configure idle timeout
  - Monitor pool stats
- [ ] Add query optimization:
  - Add indexes for common queries
  - Use database query logging (development)
  - Add slow query detection
- [ ] Add response compression:
  - Gzip middleware for large responses
- [ ] Add caching strategy:
  - Cache frequently accessed data (user permissions, roles)

### Deliverables
- ✅ Full OpenTelemetry integration
- ✅ Sentry error reporting
- ✅ Enhanced logging with correlation
- ✅ Comprehensive Prometheus metrics
- ✅ Grafana dashboards
- ✅ Rate limiting
- ✅ Security hardening
- ✅ Performance optimizations

### Acceptance Criteria
- Traces are exported and visible in Jaeger
- Errors are reported to Sentry with context
- Logs include request IDs and trace IDs
- Metrics are exposed and scraped by Prometheus
- Rate limiting prevents abuse
- Security headers are present
- Performance meets SLA (< 100ms p95 for auth endpoints)

---

## Phase 7: Testing, Documentation & CI/CD (Week 8-9)

### Objectives
- Comprehensive test coverage (unit, integration, contract)
- Complete documentation
- Production-ready CI/CD pipeline
- Docker images and deployment guides

### Tasks

#### 7.1 Unit Tests
- [ ] Achieve >80% code coverage for core modules:
  - Config loader
  - Logger
  - Auth service
  - Permission resolver
  - Module registry
- [ ] Use `github.com/stretchr/testify` for assertions
- [ ] Use `go.uber.org/mock` (the maintained fork of the archived `github.com/golang/mock`) or `mockery` for mocks
- [ ] Add test helpers:
  - `testutil.NewTestDB()` - In-memory SQLite for tests
  - `testutil.NewTestUser()` - Create test user
  - `testutil.NewTestContext()` - Context with user

#### 7.2 Integration Tests
- [ ] Install `github.com/testcontainers/testcontainers-go`
- [ ] Create integration test suite:
  - Full HTTP request flow
  - Database operations
  - Event bus publishing/consuming
  - Background job execution
- [ ] Test scenarios:
  - User registration → login → API access
  - Role assignment → permission check
  - Module loading and initialization
  - Multi-module interaction
- [ ] Create `docker-compose.test.yml`:
  - PostgreSQL
  - Redis
  - Kafka (optional)
- [ ] Add a build tag to integration tests: `//go:build integration`

#### 7.3 Contract Tests
- [ ] Install `github.com/pact-foundation/pact-go` (optional)
- [ ] Create API contract tests:
  - Verify API responses match OpenAPI spec
  - Test backward compatibility
- [ ] Use OpenAPI validator:
  - Install `github.com/getkin/kin-openapi`
  - Validate request/response against OpenAPI spec
  - Generate OpenAPI spec from code annotations

#### 7.4 Load Testing
- [ ] Create `perf/` directory with k6 scripts:
  - `perf/auth-load.js` - Login endpoint load test
  - `perf/api-load.js` - General API load test
- [ ] Document performance benchmarks:
  - Request latency (p50, p95, p99)
  - Throughput (requests/second)
  - Resource usage (CPU, memory)

#### 7.5 Documentation
- [ ] Create `README.md`:
  - Quick start guide
  - Architecture overview
  - Installation instructions
  - Development setup
- [ ] Create `docs/architecture.md`:
  - System architecture diagram
  - Module system explanation
  - Extension points
- [ ] Create `docs/extension-points.md`:
  - How to create a module
  - Permission system
  - Event bus usage
  - Background jobs
- [ ] Create `docs/api.md`:
  - API endpoints documentation
  - Authentication flow
  - Error codes
- [ ] Create `docs/operations.md`:
  - Deployment guide
  - Monitoring setup
  - Troubleshooting
  - Grafana dashboards
- [ ] Add code examples:
  - `examples/` directory with sample modules
  - Code comments and godoc

#### 7.6 CI/CD Pipeline Enhancement
- [ ] Update `.github/workflows/ci.yml`:
  - Run unit tests with coverage
  - Run integration tests (with testcontainers)
  - Run linters (golangci-lint, gosec)
  - Generate coverage report
  - Upload artifacts
- [ ] Add release workflow:
  - Semantic versioning
  - Tag releases
  - Build and push Docker images
  - Generate changelog
- [ ] Add security scanning:
  - `gosec` for security issues
  - Dependabot for dependency updates
  - Trivy for container scanning

#### 7.7 Docker Images
- [ ] Create multi-stage `Dockerfile`:
  ```dockerfile
  # Build stage
  FROM golang:1.22-alpine AS builder
  # ... build commands

  # Runtime stage
  FROM gcr.io/distroless/static-debian12
  # ... copy binary
  ```
- [ ] Create `docker-compose.yml` for development:
  - Platform service
  - PostgreSQL
  - Redis
  - Kafka (optional)
- [ ] Create `docker-compose.prod.yml` for production
- [ ] Add health checks to Dockerfile
- [ ] Document Docker usage in `docs/deployment/docker.md`

#### 7.8 Deployment Guides
- [ ] Create `docs/deployment/kubernetes.md`:
  - Kubernetes manifests
  - Helm chart (optional)
  - Service definitions
  - ConfigMap and Secret management
- [ ] Create `docs/deployment/docker.md`:
  - Docker Compose deployment
  - Environment variables
  - Volume mounts
- [ ] Create `docs/deployment/cloud.md`:
  - AWS/GCP/Azure deployment notes
  - Managed service integration
  - Load balancer configuration

#### 7.9 Developer Experience
- [ ] Create `Makefile` with common tasks:
  ```makefile
  make dev           # Start dev environment
  make test          # Run tests
  make lint          # Run linters
  make generate      # Generate code
  make docker-build  # Build Docker image
  make migrate       # Run migrations
  ```
- [ ] Add development scripts:
  - `scripts/dev.sh` - Start all services
  - `scripts/test.sh` - Run test suite
  - `scripts/seed.sh` - Seed test data
- [ ] Create `.env.example` with all config variables
- [ ] Add pre-commit hooks (optional):
  - Run linter
  - Run tests
  - Check formatting

### Deliverables
- ✅ >80% test coverage
- ✅ Integration test suite
- ✅ Complete documentation
- ✅ Production CI/CD pipeline
- ✅ Docker images and deployment guides
- ✅ Developer tooling and scripts

### Acceptance Criteria
- All tests pass in CI
- Code coverage >80%
- Documentation is complete and accurate
- Docker images build and run successfully
- Deployment guides are tested
- New developers can set up environment in <30 minutes

---

## Phase 8: Advanced Features & Polish (Week 9-10, Optional)

### Objectives
- Add advanced features (OIDC, GraphQL, API Gateway)
- Performance optimization
- Additional sample modules
- Final polish and bug fixes

### Tasks

#### 8.1 OpenID Connect (OIDC) Support
- [ ] Install `github.com/coreos/go-oidc/v3` (a client-side library for discovery and token verification; it does not supply the provider endpoints listed below)
- [ ] Implement OIDC provider:
  - Discovery endpoint
  - JWKS endpoint
  - Token endpoint
  - UserInfo endpoint
- [ ] Add OIDC client support:
  - Validate tokens from external IdP
  - Map claims to internal user
- [ ] Document OIDC setup in `docs/auth.md`

#### 8.2 GraphQL API (Optional)
- [ ] Install `github.com/99designs/gqlgen`
- [ ] Create GraphQL schema:
  - User queries
  - Blog queries
  - Mutations
- [ ] Implement resolvers:
  - Use existing services
  - Add authorization checks
- [ ] Add GraphQL endpoint: `POST /graphql`

#### 8.3 API Gateway Features
- [ ] Add request/response transformation
- [ ] Add API key authentication
- [ ] Add request routing rules
- [ ] Add API versioning support

#### 8.4 Additional Sample Modules
- [ ] Create `modules/notification/`:
  - Email templates
  - Notification preferences
  - Notification history
- [ ] Create `modules/analytics/`:
  - Event tracking
  - Analytics dashboard API
  - Export functionality

#### 8.5 Performance Optimization
- [ ] Add database query caching
- [ ] Optimize N+1 queries
- [ ] Add response caching (Redis)
- [ ] Implement connection pooling optimizations
- [ ] Add database read replicas support

#### 8.6 Internationalization (i18n)
- [ ] Install i18n library
- [ ] Add locale detection:
  - From Accept-Language header
  - From user preferences
- [ ] Create message catalogs
- [ ] Add translation support for error messages

#### 8.7 Final Polish
- [ ] Code review and refactoring
- [ ] Bug fixes
- [ ] Performance profiling
- [ ] Security audit
- [ ] Documentation review

### Deliverables
- ✅ OIDC support (optional)
- ✅ GraphQL API (optional)
- ✅ Additional sample modules
- ✅ Performance optimizations
- ✅ Final polish

---

## Implementation Checklist Summary

### Phase 0: Setup ✅
- [ ] Repository structure
- [ ] Configuration system
- [ ] Logging foundation
- [ ] Basic CI/CD
- [ ] DI setup

### Phase 1: Core Kernel ✅
- [ ] DI container
- [ ] Database (Ent)
- [ ] Health & metrics
- [ ] Error bus
- [ ] HTTP server
- [ ] OpenTelemetry

### Phase 2: Auth & Authorization ✅
- [ ] JWT authentication
- [ ] Identity management
- [ ] Roles & permissions
- [ ] Authorization middleware
- [ ] Audit logging

### Phase 3: Module Framework ✅
- [ ] Module interface
- [ ] Static registry
- [ ] Permission generation
- [ ] Module loader
- [ ] Module initialization

### Phase 4: Sample Module (Blog) ✅
- [ ] Blog module structure
- [ ] Domain model
- [ ] Repository & service
- [ ] API handlers
- [ ] Integration tests

### Phase 5: Infrastructure ✅
- [ ] Cache (Redis)
- [ ] Event bus
- [ ] Blob storage
- [ ] Email notification
- [ ] Scheduler/jobs
- [ ] Multi-tenancy (optional)

### Phase 6: Observability ✅
- [ ] OpenTelemetry
- [ ] Sentry integration
- [ ] Enhanced logging
- [ ] Prometheus metrics
- [ ] Grafana dashboards
- [ ] Rate limiting
- [ ] Security hardening

### Phase 7: Testing & Docs ✅
- [ ] Unit tests (>80% coverage)
- [ ] Integration tests
- [ ] Documentation
- [ ] CI/CD pipeline
- [ ] Docker images
- [ ] Deployment guides

### Phase 8: Advanced Features (Optional) ✅
- [ ] OIDC support
- [ ] GraphQL API
- [ ] Additional modules
- [ ] Performance optimization

---

## Risk Mitigation

### Technical Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| **Circular import issues** | High | Strict separation: interfaces in `pkg/`, implementations in `internal/` |
| **Plugin version mismatch** | Medium | Prefer static registration; document version requirements |
| **Database migration conflicts** | Medium | Central migration orchestrator, dependency ordering |
| **Performance bottlenecks** | Low | Load testing in Phase 7, profiling, caching strategy |
| **Security vulnerabilities** | High | Security audit, gosec scanning, input validation |

### Process Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| **Scope creep** | Medium | Stick to phased approach, defer optional features to Phase 8 |
| **Incomplete documentation** | Medium | Documentation as part of each phase, not an afterthought |
| **Testing gaps** | High | Test coverage requirements, integration tests early |

---

## Success Criteria

The platform is considered complete when:

1. ✅ All core modules are implemented and tested
2. ✅ Blog module serves as working reference
3. ✅ Test coverage >80%
4. ✅ Documentation is complete
5. ✅ CI/CD pipeline is production-ready
6. ✅ Docker images build and run
7. ✅ Integration tests pass
8. ✅ Security audit passes
9. ✅ Performance meets SLA (<100ms p95 for auth)
10. ✅ New developer can set up in <30 minutes

---

## Next Steps After Implementation

1. **Gather Feedback**: Share with team, collect requirements
2. **Iterate**: Add features based on feedback
3. **Scale**: Optimize for production load
4. **Extend**: Add more modules as needed
5. **Community**: Open source (if applicable), gather contributors

---

## References

- [playbook.md](./playbook.md) - General platform playbook
- [playbook-golang.md](./playbook-golang.md) - Go-specific playbook
- [Go Modules Documentation](https://go.dev/doc/modules)
- [Ent Documentation](https://entgo.io/docs/getting-started)
- [Uber FX Documentation](https://github.com/uber-go/fx)
- [OpenTelemetry Go](https://opentelemetry.io/docs/instrumentation/go/)

---

**Document Version:** 1.0

**Status:** Ready for Implementation