docs: Align documentation with true microservices architecture

Transform all documentation from a modular monolith to a true microservices
architecture in which core services are independently deployable.

Key Changes:
- Core Kernel: Infrastructure only (no business logic)
- Core Services: Auth, Identity, Authz, Audit as separate microservices
  - Each service has own entry point (cmd/{service}/)
  - Each service has own gRPC server and database schema
  - Services register with Consul for service discovery
- API Gateway: Moved from Epic 8 to Epic 1 as core infrastructure
  - Single entry point for all external traffic
  - Handles routing, JWT validation, rate limiting, CORS
- Service Discovery: Consul as primary mechanism (ADR-0033)
- Database Pattern: Per-service connections with schema isolation

Documentation Updates:
- Updated all 9 architecture documents
- Updated 4 ADRs and created 2 new ADRs (API Gateway, Service Discovery)
- Rewrote Epic 1: Core Kernel & Infrastructure (infrastructure only)
- Rewrote Epic 2: Core Services (Auth, Identity, Authz, Audit as services)
- Updated stories in Epics 3-8 for service architecture
- Updated plan.md, playbook.md, requirements.md, index.md
- Updated all epic READMEs and story files

New ADRs:
- ADR-0032: API Gateway Strategy
- ADR-0033: Service Discovery Implementation (Consul)

New Stories:
- Epic 1.7: Service Client Interfaces
- Epic 1.8: API Gateway Implementation
2025-11-06 08:47:27 +01:00
parent cab7cadf9e
commit 38a251968c
47 changed files with 3190 additions and 1613 deletions


@@ -15,9 +15,13 @@ This plan breaks down the implementation into **8 epics**, each with specific de
**Key Principles:**
- **Hexagonal Architecture** with clear separation between `pkg/` (interfaces) and `internal/` (implementations)
- **Dependency Injection** using `uber-go/fx` for lifecycle management
- **Microservices Architecture** - each module is an independent service from day one
- **Microservices Architecture** - each service is independently deployable from day one
- Core Kernel: Infrastructure only (config, logger, DI, health, metrics, observability)
- Core Services: Auth, Identity, Authz, Audit as separate microservices
- API Gateway: Single entry point for all external traffic (Epic 1)
- **Service Client Interfaces** - all inter-service communication via gRPC/HTTP
- **Service Discovery** - all services register and discover via service registry
- **Database Isolation** - each service has its own database connection pool and schema
- **Plugin-first** architecture supporting both static and dynamic module loading
- **Security-by-Design** with JWT auth, RBAC/ABAC, and audit logging
- **Observability** via OpenTelemetry, Prometheus, and structured logging
@@ -171,69 +175,71 @@ This plan breaks down the implementation into **8 epics**, each with specific de
## Epic 1: Core Kernel & Infrastructure (Week 2-3)
### Objectives
- Extend DI container to support all core services
- Implement database layer with Ent ORM
- Extend DI container to support core kernel infrastructure only (no business services)
- Implement API Gateway as core infrastructure component
- Create service client interfaces for microservices architecture
- Build health monitoring and metrics system
- Create error handling and error bus
- Establish HTTP server with comprehensive middleware stack
- Establish HTTP/gRPC server foundation for services
- Integrate OpenTelemetry for distributed tracing
- Create service client interfaces for microservices architecture
- Basic service registry implementation
**Note:** This epic focuses on infrastructure only. Business services (Auth, Identity, Authz, Audit) are implemented in Epic 2 as separate microservices.
### Stories
#### 1.1 Enhanced Dependency Injection Container
**Goal:** Extend the DI container to provide all core infrastructure services with proper lifecycle management.
**Goal:** Extend the DI container to provide core kernel infrastructure services only (no business logic services).
**Deliverables:**
- Extended `internal/di/container.go` with:
- Registration of all core services
- Registration of core kernel services only
- Lifecycle management via FX
- Service override support for testing
- `internal/di/providers.go` with provider functions:
- `ProvideConfig()` - configuration provider
- `ProvideLogger()` - logger service
- `ProvideDatabase()` - Ent database client
- `ProvideHealthCheckers()` - health check registry
- `ProvideMetrics()` - Prometheus metrics registry
- `ProvideErrorBus()` - error bus service
- `internal/di/core_module.go` exporting `CoreModule` fx.Option that provides all core services
- `ProvideTracer()` - OpenTelemetry tracer
- `ProvideServiceRegistry()` - service registry interface
- `internal/di/core_module.go` exporting `CoreModule` fx.Option that provides all core kernel services
**Note:** Database, Auth, Identity, Authz, Audit are NOT in core kernel - they are separate services implemented in Epic 2.
**Acceptance Criteria:**
- All core services are provided via DI container
- All core kernel services are provided via DI container
- Services are initialized in correct dependency order
- Lifecycle hooks work for all services
- Services can be overridden for testing
- DI container compiles without errors
- No business logic services in core kernel
#### 1.2 Database Layer with Ent ORM
**Goal:** Set up a complete database layer using Ent ORM with core domain entities, migrations, and connection management.
#### 1.2 Database Client Foundation
**Goal:** Set up database client foundation for services. Each service will have its own database connection and schema.
**Deliverables:**
- Ent schema initialization and core entities:
- `User` entity: ID, email, password_hash, verified, created_at, updated_at
- `Role` entity: ID, name, description, created_at
- `Permission` entity: ID, name (format: "module.resource.action")
- `AuditLog` entity: ID, actor_id, action, target_id, metadata (JSON), timestamp
- Many-to-many relationships: `role_permissions` and `user_roles`
- Generated Ent code with proper type safety
- Database client in `internal/infra/database/client.go`:
- `NewEntClient(dsn string) (*ent.Client, error)`
- Database client wrapper in `internal/infra/database/client.go`:
- `NewEntClient(dsn string, schema string) (*ent.Client, error)` - supports schema isolation
- Connection pooling configuration (max connections, idle timeout)
- Migration runner wrapper
- Database health check integration
- Per-service connection pool management
- Database configuration in `config/default.yaml` with:
- Connection string (DSN)
- Connection pool settings
- Migration settings
- Connection pool settings per service
- Schema isolation configuration
- Database client factory for creating service-specific clients
**Note:** Core domain entities (User, Role, Permission, AuditLog) are implemented in Epic 2 as part of their respective services (Identity, Authz, Audit).
**Acceptance Criteria:**
- Ent schema compiles and generates code successfully
- Database client connects to PostgreSQL
- Core entities can be created and queried
- Migrations run successfully on startup
- Database client connects to PostgreSQL with schema support
- Connection pooling is configured correctly
- Database health check works
- All entities have proper indexes and relationships
- Multiple services can connect to same database instance with different schemas
- Each service manages its own connection pool
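One way to get schema isolation on a shared PostgreSQL instance is to derive a per-service DSN that pins `search_path` to the service's schema. The following is a minimal sketch, not the plan's actual client: `ServiceDSN` is a hypothetical helper, it assumes URL-style DSNs, and it assumes the driver forwards `search_path` as a runtime parameter. The credentials and database name are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// schemaName accepts a conservative subset of PostgreSQL identifiers.
var schemaName = regexp.MustCompile(`^[a-z_][a-z0-9_]*$`)

// ServiceDSN returns a copy of baseDSN whose search_path is pinned to schema.
// Hypothetical helper: assumes a URL-style DSN and a driver that passes
// search_path through to the server as a runtime parameter.
func ServiceDSN(baseDSN, schema string) (string, error) {
	if !schemaName.MatchString(schema) {
		return "", fmt.Errorf("invalid schema name %q", schema)
	}
	u, err := url.Parse(baseDSN)
	if err != nil {
		return "", fmt.Errorf("parse DSN: %w", err)
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return "", fmt.Errorf("unsupported DSN scheme %q", u.Scheme)
	}
	q := u.Query()
	q.Set("search_path", schema)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Placeholder connection string for illustration only.
	base := "postgres://app:secret@localhost:5432/platform?sslmode=disable"
	for _, schema := range []string{"auth_schema", "identity_schema"} {
		dsn, err := ServiceDSN(base, schema)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(dsn)
	}
}
```

Each service would then open its own pool from its own DSN, so pool limits and schemas stay independent even when the database instance is shared.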
#### 1.3 Health Monitoring and Metrics System
**Goal:** Implement comprehensive health checks and Prometheus metrics for monitoring platform health and performance.
@@ -289,13 +295,13 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- Error context (request ID, user ID) is preserved
- Background error consumer works correctly
#### 1.5 HTTP Server Foundation with Middleware Stack
**Goal:** Create a production-ready HTTP server with comprehensive middleware for security, observability, and error handling.
#### 1.5 HTTP/gRPC Server Foundation
**Goal:** Create HTTP and gRPC server foundation that services can use. Each service will have its own server instance.
**Deliverables:**
- HTTP server in `internal/server/server.go`:
- Gin router initialization
- Comprehensive middleware stack:
- HTTP server foundation in `internal/server/http.go`:
- Gin router initialization helper
- Common middleware stack:
- Request ID generator (unique per request)
- Structured logging middleware (logs all requests)
- Panic recovery → error bus
@@ -303,28 +309,54 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- CORS support (configurable)
- Request timeout handling
- Response compression
- Core route registration:
- `GET /healthz` - liveness probe
- `GET /ready` - readiness probe
- `GET /metrics` - Prometheus metrics
- FX lifecycle integration:
- HTTP server starts on `OnStart` hook
- Graceful shutdown on `OnStop` hook (drains connections)
- Port configuration from config
- Integration with main application entry point
- Server lifecycle management
- gRPC server foundation in `internal/server/grpc.go`:
- gRPC server initialization
- Interceptor support (logging, tracing, metrics)
- Lifecycle management
- FX lifecycle integration for both HTTP and gRPC servers
**Note:** Services (Auth, Identity, etc.) will use these foundations to create their own server instances in Epic 2.
**Acceptance Criteria:**
- HTTP server starts successfully
- HTTP server foundation is reusable by services
- gRPC server foundation is reusable by services
- All middleware executes in correct order
- Request IDs are generated and logged
- Metrics are collected for all requests
- Panics are recovered and handled
- Graceful shutdown works correctly
- Server is configurable via config system
- CORS is configurable per environment
- Servers are configurable via config system
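The request ID and panic recovery middleware from this story can be sketched with the standard library alone. This is an illustration under assumptions, not the plan's Gin-based implementation: it uses `net/http` instead of Gin, the `X-Request-ID` header name is assumed, and the `report` callback stands in for the error bus.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

const requestIDHeader = "X-Request-ID" // assumed header name

// newRequestID returns 16 random bytes, hex-encoded.
func newRequestID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "unknown"
	}
	return hex.EncodeToString(b)
}

// RequestID reuses an incoming request ID or generates one, and echoes it
// on the response so callers can correlate logs.
func RequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(requestIDHeader)
		if id == "" {
			id = newRequestID()
			r.Header.Set(requestIDHeader, id)
		}
		w.Header().Set(requestIDHeader, id)
		next.ServeHTTP(w, r)
	})
}

// Recover converts a panic into a 500 response and hands it to report,
// which stands in for the error bus.
func Recover(report func(reqID string, v any), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				report(r.Header.Get(requestIDHeader), v)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// newDemoHandler wires the middleware around two illustrative routes.
// RequestID is outermost so Recover can report the ID.
func newDemoHandler(report func(reqID string, v any)) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/boom", func(http.ResponseWriter, *http.Request) { panic("boom") })
	return RequestID(Recover(report, mux))
}

// probe sends one GET through h and returns the status and request ID.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Header().Get(requestIDHeader)
}

func main() {
	h := newDemoHandler(func(id string, v any) {
		log.Printf("request_id=%s panic=%v", id, v)
	})
	for _, p := range []string{"/healthz", "/boom"} {
		code, id := probe(h, p)
		fmt.Println(p, code, id)
	}
}
```

The ordering matters: because the request ID middleware wraps the recovery middleware, a recovered panic can still be reported with the ID of the request that caused it.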
#### 1.6 OpenTelemetry Distributed Tracing
**Goal:** Integrate OpenTelemetry for distributed tracing across the platform to enable observability in production.
**Goal:** Integrate OpenTelemetry for distributed tracing across all services to enable observability in production.
**Deliverables:**
- OpenTelemetry setup in `internal/observability/tracer.go`:
- TracerProvider initialization
- Export to stdout (development mode)
- Export to OTLP collector (production mode)
- Trace context propagation
- HTTP instrumentation middleware:
- Automatic span creation for HTTP requests
- Trace context propagation via headers
- Span attributes (method, path, status code, etc.)
- gRPC instrumentation:
- gRPC interceptor for automatic span creation
- Trace context propagation via gRPC metadata
- Database instrumentation:
- Ent interceptor for database queries
- Query spans with timing and parameters
- Integration with logger (include trace ID in logs)
**Acceptance Criteria:**
- HTTP requests create OpenTelemetry spans
- gRPC calls create OpenTelemetry spans
- Database queries are traced
- Trace context propagates across service boundaries
- Trace IDs are included in logs
- Traces export correctly to configured backend
- Tracing works in both development and production modes
- Tracing has minimal performance impact
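Trace context propagation over HTTP headers and gRPC metadata typically relies on the W3C Trace Context `traceparent` value. The sketch below parses and re-emits that header by hand to show what crosses a service boundary. It is illustrative only: it assumes the W3C format is the propagator in use, and in practice the OpenTelemetry SDK handles this rather than hand-written code. `TraceContext` and its methods are hypothetical names.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceContext holds the fields of a version-00 traceparent header.
type TraceContext struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters
	Sampled bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n || s != strings.ToLower(s) {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

// ParseTraceparent parses "00-<trace-id>-<parent-id>-<flags>".
func ParseTraceparent(h string) (TraceContext, error) {
	parts := strings.Split(strings.TrimSpace(h), "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceContext{}, errors.New("unsupported traceparent")
	}
	traceID, spanID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || traceID == strings.Repeat("0", 32) {
		return TraceContext{}, errors.New("invalid trace ID")
	}
	if !isLowerHex(spanID, 16) || spanID == strings.Repeat("0", 16) {
		return TraceContext{}, errors.New("invalid span ID")
	}
	if !isLowerHex(flags, 2) {
		return TraceContext{}, errors.New("invalid trace flags")
	}
	b, _ := hex.DecodeString(flags) // already validated above
	return TraceContext{TraceID: traceID, SpanID: spanID, Sampled: b[0]&0x01 == 1}, nil
}

// Header renders the context back into a traceparent value.
func (tc TraceContext) Header() string {
	flags := "00"
	if tc.Sampled {
		flags = "01"
	}
	return fmt.Sprintf("00-%s-%s-%s", tc.TraceID, tc.SpanID, flags)
}

func main() {
	in := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	tc, err := ParseTraceparent(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A downstream call keeps the trace ID and substitutes its own span ID.
	tc.SpanID = "b7ad6b7169203331"
	fmt.Println(tc.Header())
}
```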
#### 1.7 Service Client Interfaces
**Goal:** Create service client interfaces for all core services to enable microservices communication.
@@ -334,7 +366,6 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- `IdentityServiceClient` - User and identity operations
- `AuthServiceClient` - Authentication operations
- `AuthzServiceClient` - Authorization operations
- `PermissionServiceClient` - Permission resolution
- `AuditServiceClient` - Audit logging
- Service client factory in `internal/services/factory.go`:
- Create gRPC clients (primary)
@@ -351,69 +382,89 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- Configuration supports protocol selection
- All inter-service communication goes through service clients
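Protocol selection in a client factory could look roughly like the sketch below. Only the `AuthServiceClient` name comes from the plan; the method set, the `Factory` type, and the in-memory `staticClient` are assumptions made for illustration, and no real gRPC or HTTP transport is wired in.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// AuthServiceClient is the interface name used in the plan; this one-method
// shape is an assumption for the sketch.
type AuthServiceClient interface {
	ValidateToken(ctx context.Context, token string) (userID string, err error)
}

// Constructor builds a client for one transport and target address.
type Constructor func(target string) (AuthServiceClient, error)

// Factory picks a constructor by the configured protocol name.
type Factory struct {
	constructors map[string]Constructor
}

func NewFactory() *Factory {
	return &Factory{constructors: map[string]Constructor{}}
}

// Register associates a protocol name (for example "grpc") with a constructor.
func (f *Factory) Register(protocol string, c Constructor) {
	f.constructors[protocol] = c
}

// AuthClient returns a client for the given protocol, or an error if no
// constructor was registered for it.
func (f *Factory) AuthClient(protocol, target string) (AuthServiceClient, error) {
	c, ok := f.constructors[protocol]
	if !ok {
		return nil, fmt.Errorf("unsupported protocol %q", protocol)
	}
	return c(target)
}

// staticClient is an in-memory stand-in for a real transport client.
type staticClient struct{ tokens map[string]string }

func (s staticClient) ValidateToken(_ context.Context, token string) (string, error) {
	id, ok := s.tokens[token]
	if !ok {
		return "", errors.New("invalid token")
	}
	return id, nil
}

// newDemoFactory registers only a "grpc" constructor backed by staticClient.
func newDemoFactory() *Factory {
	f := NewFactory()
	f.Register("grpc", func(string) (AuthServiceClient, error) {
		return staticClient{tokens: map[string]string{"t-1": "user-1"}}, nil
	})
	return f
}

// validate calls the client with a background context.
func validate(c AuthServiceClient, token string) (string, error) {
	return c.ValidateToken(context.Background(), token)
}

func main() {
	f := newDemoFactory()
	c, err := f.AuthClient("grpc", "auth-service:9090") // placeholder target
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(validate(c, "t-1"))
}
```

Because callers depend only on the interface, a test can register a fake constructor the same way `newDemoFactory` does, without touching the network.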
#### 1.8 API Gateway Implementation
**Goal:** Implement API Gateway as core infrastructure component that routes all external traffic to backend services.
**Deliverables:**
- OpenTelemetry setup in `internal/observability/tracer.go`:
- TracerProvider initialization
- Export to stdout (development mode)
- Export to OTLP collector (production mode)
- Trace context propagation
- HTTP instrumentation middleware:
- Automatic span creation for HTTP requests
- Trace context propagation via headers
- Span attributes (method, path, status code, etc.)
- Database instrumentation:
- Ent interceptor for database queries
- Query spans with timing and parameters
- Integration with logger (include trace ID in logs)
- API Gateway service entry point: `cmd/api-gateway/main.go`
- Gateway implementation in `services/gateway/internal/`:
- Request routing to backend services via service discovery
- JWT token validation via Auth Service client
- Permission checking via Authz Service client (optional, for route-level auth)
- Rate limiting middleware (per-user and per-IP)
- CORS support
- Request/response transformation
- Load balancing across service instances
- Gateway configuration in `config/default.yaml`:
- Route definitions (path → service mapping)
- Rate limiting configuration
- CORS configuration
- Integration with service registry for service discovery
- Health check endpoint for gateway
**Acceptance Criteria:**
- HTTP requests create OpenTelemetry spans
- Database queries are traced
- Trace context propagates across service boundaries
- Trace IDs are included in logs
- Traces export correctly to configured backend
- Tracing works in both development and production modes
- Tracing has minimal performance impact
- API Gateway routes requests to backend services correctly
- JWT validation works via Auth Service
- Rate limiting works correctly
- CORS is configurable and works
- Service discovery integration works
- Gateway is independently deployable
- Gateway has health check endpoint
- All external traffic goes through gateway
### Deliverables
- ✅ DI container with all core services
- ✅ Database client with Ent schema
- ✅ DI container with core kernel services only
- ✅ Database client foundation (per-service connections)
- ✅ Health and metrics endpoints functional
- ✅ Error bus captures and logs errors
- ✅ HTTP server with middleware stack
- ✅ HTTP/gRPC server foundation for services
- ✅ Basic observability with OpenTelemetry
- ✅ Service client interfaces for microservices
- ✅ API Gateway service (core infrastructure)
- ✅ Basic service registry implementation
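A basic service registry can start as an interface with an in-memory implementation, leaving a Consul-backed one for later. The sketch below makes several assumptions: the `ServiceRegistry` method set, the `Instance` type, and the instance names and addresses are hypothetical, and health checking and TTLs are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Instance describes one running copy of a service.
type Instance struct {
	ID   string
	Name string
	Addr string
}

// ServiceRegistry is an assumed shape for the registry interface.
type ServiceRegistry interface {
	Register(inst Instance) error
	Deregister(id string) error
	Discover(name string) ([]Instance, error)
}

// memoryRegistry keeps instances in a map guarded by a mutex.
type memoryRegistry struct {
	mu   sync.RWMutex
	byID map[string]Instance
}

func NewMemoryRegistry() ServiceRegistry {
	return &memoryRegistry{byID: map[string]Instance{}}
}

func (r *memoryRegistry) Register(inst Instance) error {
	if inst.ID == "" || inst.Name == "" || inst.Addr == "" {
		return errors.New("incomplete instance")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byID[inst.ID] = inst
	return nil
}

func (r *memoryRegistry) Deregister(id string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.byID[id]; !ok {
		return fmt.Errorf("unknown instance %q", id)
	}
	delete(r.byID, id)
	return nil
}

// Discover returns all instances of name, sorted by ID for determinism.
func (r *memoryRegistry) Discover(name string) ([]Instance, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	var out []Instance
	for _, inst := range r.byID {
		if inst.Name == name {
			out = append(out, inst)
		}
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("no instances of %q", name)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out, nil
}

func main() {
	reg := NewMemoryRegistry()
	_ = reg.Register(Instance{ID: "auth-1", Name: "auth-service", Addr: "10.0.0.1:9090"})
	_ = reg.Register(Instance{ID: "auth-2", Name: "auth-service", Addr: "10.0.0.2:9090"})
	fmt.Println(reg.Discover("auth-service"))
}
```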
### Acceptance Criteria
- `GET /healthz` returns 200
- `GET /ready` checks DB connectivity
- `GET /healthz` returns 200 for all services
- `GET /ready` checks service health
- `GET /metrics` exposes Prometheus metrics
- Panic recovery logs errors via error bus
- Database migrations run on startup
- HTTP requests are traced with OpenTelemetry
- HTTP/gRPC requests are traced with OpenTelemetry
- API Gateway routes requests to backend services
- Service client interfaces are defined
- No business logic services in Epic 1
---
## Epic 2: Authentication & Authorization (Week 3-4)
## Epic 2: Core Services (Authentication & Authorization) (Week 3-5)
### Objectives
- Implement complete JWT-based authentication system
- Build comprehensive identity management with user lifecycle
- Create role-based access control (RBAC) system
- Implement authorization middleware and permission checks
- Separate Auth, Identity, Authz, and Audit into independent microservices
- Each service has its own entry point, database connection, and gRPC server
- Implement complete JWT-based authentication system (Auth Service)
- Build comprehensive identity management with user lifecycle (Identity Service)
- Create role-based access control (RBAC) system (Authz Service)
- Implement audit logging system (Audit Service)
- All services communicate via service clients (gRPC/HTTP)
- All services register with service registry
**Note:** This epic transforms the monolithic core into separate, independently deployable services.
- Add comprehensive audit logging for security compliance
- Provide database seeding for initial setup
### Stories
#### 2.1 JWT Authentication System
**Goal:** Implement a complete JWT-based authentication system with access tokens, refresh tokens, and secure token management.
#### 2.1 Auth Service - JWT Authentication
**Goal:** Implement Auth Service as an independent microservice with complete JWT-based authentication system, access tokens, refresh tokens, and secure token management.
**Deliverables:**
- Authentication interfaces in `pkg/auth/auth.go`:
- `Authenticator` interface for token generation and verification
- `TokenClaims` struct with user ID, roles, tenant ID, expiration
- JWT implementation in `internal/auth/jwt_auth.go`:
- Auth Service entry point: `cmd/auth-service/main.go`
- Service implementation in `services/auth/internal/`:
- gRPC server for Auth Service
- HTTP endpoints (optional, for compatibility)
- Authentication interfaces in `pkg/services/auth.go`:
- `AuthServiceClient` interface for authentication operations
- Service client implementation (gRPC/HTTP)
- JWT implementation in `services/auth/internal/jwt_auth.go`:
- Generate short-lived access tokens (15 minutes)
- Generate long-lived refresh tokens (7 days)
- Token signature verification
@@ -424,29 +475,41 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- Verify token validity
- Inject authenticated user into request context
- Helper function: `auth.FromContext(ctx) *User`
- Authentication endpoints:
- `POST /api/v1/auth/login` - Authenticate user and return tokens
- `POST /api/v1/auth/refresh` - Refresh access token using refresh token
- Password validation against stored hashes
- Integration with DI container and HTTP server
- gRPC service definition: `services/auth/api/auth.proto`
- Authentication endpoints (gRPC):
- `Login(email, password)` - Authenticate user and return tokens
- `RefreshToken(refresh_token)` - Refresh access token using refresh token
- `ValidateToken(token)` - Validate JWT token (used by API Gateway)
- Password validation against stored hashes (via Identity Service)
- Database connection: Own connection pool and schema (`auth_schema`)
- Service registration: Register with service registry
- Integration with Identity Service: Use `IdentityServiceClient` for user lookup
**Acceptance Criteria:**
- Users can login and receive access and refresh tokens
- Auth Service is independently deployable
- Service has own entry point (`cmd/auth-service/`)
- gRPC server starts and serves authentication requests
- Users can login via gRPC and receive access and refresh tokens
- Access tokens expire after configured duration
- Refresh tokens can be used to obtain new access tokens
- Token validation works (used by API Gateway)
- Invalid tokens are rejected with appropriate errors
- Authenticated user is available in request context
- Login attempts are logged
- Token secrets are configurable
- Service registers with service registry
- Service uses Identity Service client for user lookup
- Service has own database connection and schema
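To make the token lifecycle concrete, the sketch below issues and validates an HS256-signed JWT with a 15-minute lifetime using only the standard library. It is an illustration under assumptions rather than the Auth Service implementation: a production service would more likely use a maintained JWT library, the `Claims` struct is reduced to subject and expiry, and `Issue`, `Validate`, and the demo secret are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a reduced claim set for the sketch.
type Claims struct {
	Subject   string `json:"sub"`
	ExpiresAt int64  `json:"exp"`
}

var b64 = base64.RawURLEncoding

// encodedHeader is the only header this sketch issues or accepts, which
// pins the algorithm to HS256.
var encodedHeader = b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))

// sign returns the base64url HMAC-SHA256 of signingInput.
func sign(secret []byte, signingInput string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return b64.EncodeToString(mac.Sum(nil))
}

// Issue creates a token for subject that expires ttl after now.
func Issue(secret []byte, subject string, ttl time.Duration, now time.Time) (string, error) {
	payload, err := json.Marshal(Claims{Subject: subject, ExpiresAt: now.Add(ttl).Unix()})
	if err != nil {
		return "", err
	}
	input := encodedHeader + "." + b64.EncodeToString(payload)
	return input + "." + sign(secret, input), nil
}

// Validate checks the header, signature, and expiry, in that order.
func Validate(secret []byte, token string, now time.Time) (Claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 || parts[0] != encodedHeader {
		return Claims{}, errors.New("malformed or unsupported token")
	}
	want := sign(secret, parts[0]+"."+parts[1])
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return Claims{}, errors.New("invalid signature")
	}
	payload, err := b64.DecodeString(parts[1])
	if err != nil {
		return Claims{}, fmt.Errorf("decode payload: %w", err)
	}
	var c Claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return Claims{}, fmt.Errorf("decode claims: %w", err)
	}
	if now.Unix() >= c.ExpiresAt {
		return Claims{}, errors.New("token expired")
	}
	return c, nil
}

// demoValidate issues a 15-minute token at a fixed instant and validates it
// elapsedSeconds later using verifySecret.
func demoValidate(verifySecret string, elapsedSeconds int) (Claims, error) {
	issuedAt := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	token, err := Issue([]byte("demo-secret"), "user-1", 15*time.Minute, issuedAt)
	if err != nil {
		return Claims{}, err
	}
	return Validate([]byte(verifySecret), token, issuedAt.Add(time.Duration(elapsedSeconds)*time.Second))
}

func main() {
	fmt.Println(demoValidate("demo-secret", 60))   // within lifetime
	fmt.Println(demoValidate("demo-secret", 1000)) // past the 15-minute expiry
	fmt.Println(demoValidate("other-secret", 60))  // wrong key
}
```

Passing `now` explicitly keeps expiry checks deterministic in tests, which is useful for the "access tokens expire after configured duration" criterion.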
#### 2.2 Identity Management System
**Goal:** Build a complete user identity management system with registration, email verification, password management, and user CRUD operations.
#### 2.2 Identity Service - User Management
**Goal:** Implement Identity Service as an independent microservice with complete user identity management, registration, email verification, password management, and user CRUD operations.
**Deliverables:**
- Identity interfaces in `pkg/identity/identity.go`:
- `UserRepository` interface for user data access
- `UserService` interface for user business logic
- User repository implementation in `internal/identity/user_repo.go`:
- Identity Service entry point: `cmd/identity-service/main.go`
- Service implementation in `services/identity/internal/`:
- gRPC server for Identity Service
- HTTP endpoints (optional, for compatibility)
- Identity interfaces in `pkg/services/identity.go`:
- `IdentityServiceClient` interface for user operations
- Service client implementation (gRPC/HTTP)
- User repository implementation in `services/identity/internal/user_repo.go`:
- CRUD operations using Ent
- Password hashing (bcrypt or argon2)
- Email uniqueness validation
@@ -457,24 +520,34 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- Password reset flow (token-based, time-limited)
- Password change with old password verification
- User profile updates
- User management API endpoints:
- `POST /api/v1/users` - Register new user
- `GET /api/v1/users/:id` - Get user profile (authorized)
- `PUT /api/v1/users/:id` - Update user profile (authorized)
- `DELETE /api/v1/users/:id` - Delete user (admin only)
- `POST /api/v1/users/verify-email` - Verify email with token
- `POST /api/v1/users/reset-password` - Request password reset
- `POST /api/v1/users/change-password` - Change password
- Integration with email notification system (Epic 5)
- gRPC service definition: `services/identity/api/identity.proto`
- User management endpoints (gRPC):
- `CreateUser(user)` - Register new user
- `GetUser(id)` - Get user profile
- `UpdateUser(id, user)` - Update user profile
- `DeleteUser(id)` - Delete user (admin only)
- `GetUserByEmail(email)` - Get user by email
- `VerifyEmail(token)` - Verify email with token
- `RequestPasswordReset(email)` - Request password reset
- `ResetPassword(token, new_password)` - Reset password
- `ChangePassword(user_id, old_password, new_password)` - Change password
- Database connection: Own connection pool and schema (`identity_schema`)
- Ent schema: User entity in `services/identity/ent/schema/user.go`
- Service registration: Register with service registry
- Integration with email notification system (Epic 5) via event bus
**Acceptance Criteria:**
- Users can register with email and password
- Identity Service is independently deployable
- Service has own entry point (`cmd/identity-service/`)
- gRPC server starts and serves user management requests
- Users can register via gRPC with email and password
- Passwords are securely hashed
- Email verification tokens are generated and validated
- Password reset flow works end-to-end
- Users can update their profiles
- User operations require proper authentication
- All user actions are audited
- Users can update their profiles via gRPC
- Service registers with service registry
- Service has own database connection and schema
- User entity is properly defined in Ent schema
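Email verification and password reset both depend on time-limited, single-purpose tokens. One common approach, sketched below, is to send the user a random token and store only its SHA-256 hash and expiry. This is a sketch under assumptions, not the Identity Service design: `StoredToken`, `NewToken`, and the 30-minute lifetime are hypothetical, and persistence and one-time-use enforcement are omitted.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"errors"
	"fmt"
	"time"
)

// StoredToken is what the service would persist: a hash, never the token.
type StoredToken struct {
	Hash      [sha256.Size]byte
	ExpiresAt time.Time
}

// NewToken returns a random URL-safe token for the user and the record to
// store for later verification.
func NewToken(ttl time.Duration, now time.Time) (string, StoredToken, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", StoredToken{}, err
	}
	plain := base64.RawURLEncoding.EncodeToString(raw)
	stored := StoredToken{Hash: sha256.Sum256([]byte(plain)), ExpiresAt: now.Add(ttl)}
	return plain, stored, nil
}

// Verify checks expiry first, then compares hashes in constant time.
func (s StoredToken) Verify(plain string, now time.Time) error {
	if !now.Before(s.ExpiresAt) {
		return errors.New("token expired")
	}
	h := sha256.Sum256([]byte(plain))
	if subtle.ConstantTimeCompare(h[:], s.Hash[:]) != 1 {
		return errors.New("token mismatch")
	}
	return nil
}

// demoVerify issues a 30-minute token at a fixed instant and verifies it
// elapsedMinutes later, optionally after tampering with it.
func demoVerify(elapsedMinutes int, tamper bool) error {
	issuedAt := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	plain, stored, err := NewToken(30*time.Minute, issuedAt)
	if err != nil {
		return err
	}
	if tamper {
		plain += "x"
	}
	return stored.Verify(plain, issuedAt.Add(time.Duration(elapsedMinutes)*time.Minute))
}

func main() {
	fmt.Println("fresh:", demoVerify(5, false))
	fmt.Println("expired:", demoVerify(45, false))
	fmt.Println("tampered:", demoVerify(5, true))
}
```

Storing only the hash means a leaked database row cannot be replayed as a valid verification or reset link.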
#### 2.3 Role-Based Access Control (RBAC) System
**Goal:** Implement a complete RBAC system with permissions, role management, and authorization middleware.
@@ -1449,11 +1522,13 @@ This plan breaks down the implementation into **8 epics**, each with specific de
## Epic 8: Advanced Features & Polish (Week 9-10, Optional)
### Objectives
- Add advanced features (OIDC, GraphQL, API Gateway)
- Add advanced features (OIDC, GraphQL)
- Performance optimization
- Additional sample modules
- Additional sample feature services
- Final polish and bug fixes
**Note:** API Gateway is now in Epic 1 (Story 1.8) as core infrastructure, not an advanced feature.
### Tasks
#### 8.1 OpenID Connect (OIDC) Support
@@ -1479,30 +1554,26 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- Add authorization checks
- [ ] Add GraphQL endpoint: `POST /graphql`
#### 8.3 API Gateway Features
- [ ] Add request/response transformation
- [ ] Add API key authentication
- [ ] Add request routing rules
- [ ] Add API versioning support
#### 8.4 Additional Sample Modules
- [ ] Create `modules/notification/`:
#### 8.3 Additional Sample Feature Services
- [ ] Create Notification Service (`cmd/notification-service/`):
- Service entry point, gRPC server
- Email templates
- Notification preferences
- Notification history
- [ ] Create `modules/analytics/`:
- [ ] Create Analytics Service (`cmd/analytics-service/`):
- Service entry point, gRPC server
- Event tracking
- Analytics dashboard API
- Export functionality
#### 8.5 Performance Optimization
#### 8.4 Performance Optimization
- [ ] Add database query caching
- [ ] Optimize N+1 queries
- [ ] Add response caching (Redis)
- [ ] Implement connection pooling optimizations
- [ ] Add database read replicas support
#### 8.6 Internationalization (i18n)
#### 8.5 Internationalization (i18n)
- [ ] Install i18n library
- [ ] Add locale detection:
- From Accept-Language header
@@ -1510,7 +1581,7 @@ This plan breaks down the implementation into **8 epics**, each with specific de
- [ ] Create message catalogs
- [ ] Add translation support for error messages
#### 8.7 Final Polish
#### 8.6 Final Polish
- [ ] Code review and refactoring
- [ ] Bug fixes
- [ ] Performance profiling
@@ -1520,7 +1591,7 @@ This plan breaks down the implementation into **8 epics**, each with specific de
### Deliverables
- ✅ OIDC support (optional)
- ✅ GraphQL API (optional)
- ✅ Additional sample modules
- ✅ Additional sample feature services (Notification, Analytics)
- ✅ Performance optimizations
- ✅ Final polish