Transform all documentation from modular monolith to true microservices
architecture where core services are independently deployable.
Key Changes:
- Core Kernel: Infrastructure only (no business logic)
- Core Services: Auth, Identity, Authz, Audit as separate microservices
  - Each service has own entry point (cmd/{service}/)
  - Each service has own gRPC server and database schema
  - Services register with Consul for service discovery
- API Gateway: Moved from Epic 8 to Epic 1 as core infrastructure
  - Single entry point for all external traffic
  - Handles routing, JWT validation, rate limiting, CORS
- Service Discovery: Consul as primary mechanism (ADR-0033)
- Database Pattern: Per-service connections with schema isolation
Documentation Updates:
- Updated all 9 architecture documents
- Updated 4 ADRs and created 2 new ADRs (API Gateway, Service Discovery)
- Rewrote Epic 1: Core Kernel & Infrastructure (infrastructure only)
- Rewrote Epic 2: Core Services (Auth, Identity, Authz, Audit as services)
- Updated Epics 3-8 stories for service architecture
- Updated plan.md, playbook.md, requirements.md, index.md
- Updated all epic READMEs and story files
New ADRs:
- ADR-0032: API Gateway Strategy
- ADR-0033: Service Discovery Implementation (Consul)
New Stories:
- Epic 1.7: Service Client Interfaces
- Epic 1.8: API Gateway Implementation
System Behavior Overview
Purpose
This document provides a high-level explanation of how the Go Platform behaves end-to-end, focusing on system-level operations, flows, and interactions rather than implementation details.
Overview
The Go Platform is a microservices-based system where each service is independently deployable from day one. Services communicate via gRPC (primary) or HTTP (fallback) through service clients, share infrastructure components (PostgreSQL instance, Redis, Kafka), and are orchestrated through service discovery and dependency injection. All external traffic enters through the API Gateway.
Key Concepts
- Services: Independent processes that can be deployed and scaled separately
- Service Clients: Abstraction layer for inter-service communication
- Service Registry: Central registry for service discovery
- Event Bus: Asynchronous communication channel for events
- DI Container: Dependency injection container managing service lifecycle
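To make the service client concept concrete, here is a minimal sketch of what such an abstraction could look like. The names `IdentityClient`, `User`, `ErrUserNotFound`, and `NewInMemoryIdentityClient` are hypothetical and do not come from the platform's codebase. The in-memory implementation stands in for a gRPC-backed one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// User is a hypothetical DTO returned by the Identity service.
type User struct {
	ID    string
	Email string
}

// ErrUserNotFound is returned when no user matches the requested ID.
var ErrUserNotFound = errors.New("user not found")

// IdentityClient is a hypothetical service client interface. Callers depend
// on it rather than on a concrete gRPC or HTTP transport.
type IdentityClient interface {
	GetUser(ctx context.Context, id string) (User, error)
}

// inMemoryIdentityClient stands in for a gRPC-backed client.
type inMemoryIdentityClient struct {
	users map[string]User
}

func (c *inMemoryIdentityClient) GetUser(_ context.Context, id string) (User, error) {
	u, ok := c.users[id]
	if !ok {
		return User{}, ErrUserNotFound
	}
	return u, nil
}

// NewInMemoryIdentityClient builds a client preloaded with the given users.
func NewInMemoryIdentityClient(users ...User) IdentityClient {
	m := make(map[string]User, len(users))
	for _, u := range users {
		m[u.ID] = u
	}
	return &inMemoryIdentityClient{users: m}
}

func main() {
	client := NewInMemoryIdentityClient(User{ID: "u1", Email: "a@example.com"})
	u, err := client.GetUser(context.Background(), "u1")
	fmt.Println(u.Email, err)
}
```

Because callers only see the interface, the transport (gRPC or the HTTP fallback) can be swapped without touching business code.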
Service Bootstrap Sequence
Each service (API Gateway, Auth, Identity, Authz, Audit, and feature services) follows a well-defined startup sequence. Services bootstrap independently.
Individual Service Startup
sequenceDiagram
participant Main
participant Config
participant Logger
participant DI
participant ServiceImpl
participant ServiceRegistry
participant DB
participant HTTP
participant gRPC
Main->>Config: Load configuration
Config-->>Main: Config ready
Main->>Logger: Initialize logger
Logger-->>Main: Logger ready
Main->>DI: Create DI container
DI->>DI: Register core kernel services
DI-->>Main: DI container ready
Main->>ServiceImpl: Register service implementation
ServiceImpl->>DI: Register service dependencies
ServiceImpl->>DB: Connect to database
DB-->>ServiceImpl: Connection ready
Main->>DB: Run migrations
DB-->>Main: Migrations complete
Main->>ServiceRegistry: Register service
ServiceRegistry->>ServiceRegistry: Register with Consul/K8s
ServiceRegistry-->>Main: Service registered
Main->>gRPC: Start gRPC server
Main->>HTTP: Start HTTP server (if needed)
HTTP-->>Main: HTTP server ready
gRPC-->>Main: gRPC server ready
Main->>DI: Start lifecycle
DI->>DI: Execute OnStart hooks
DI-->>Main: Service started
Platform Startup (All Services)
sequenceDiagram
participant Docker
participant Gateway
participant AuthSvc
participant IdentitySvc
participant AuthzSvc
participant AuditSvc
participant BlogSvc
participant Registry
participant DB
Docker->>DB: Start PostgreSQL
Docker->>Registry: Start Consul
DB-->>Docker: Database ready
Registry-->>Docker: Registry ready
par Service Startup (in parallel)
Docker->>Gateway: Start API Gateway
Gateway->>Registry: Register
Gateway->>Gateway: Start HTTP server
Gateway-->>Docker: Gateway ready
and
Docker->>AuthSvc: Start Auth Service
AuthSvc->>DB: Connect
AuthSvc->>Registry: Register
AuthSvc->>AuthSvc: Start gRPC server
AuthSvc-->>Docker: Auth Service ready
and
Docker->>IdentitySvc: Start Identity Service
IdentitySvc->>DB: Connect
IdentitySvc->>Registry: Register
IdentitySvc->>IdentitySvc: Start gRPC server
IdentitySvc-->>Docker: Identity Service ready
and
Docker->>AuthzSvc: Start Authz Service
AuthzSvc->>DB: Connect
AuthzSvc->>Registry: Register
AuthzSvc->>AuthzSvc: Start gRPC server
AuthzSvc-->>Docker: Authz Service ready
and
Docker->>AuditSvc: Start Audit Service
AuditSvc->>DB: Connect
AuditSvc->>Registry: Register
AuditSvc->>AuditSvc: Start gRPC server
AuditSvc-->>Docker: Audit Service ready
and
Docker->>BlogSvc: Start Blog Service
BlogSvc->>DB: Connect
BlogSvc->>Registry: Register
BlogSvc->>BlogSvc: Start gRPC server
BlogSvc-->>Docker: Blog Service ready
end
Docker->>Docker: All services ready
Service Bootstrap Phases (Per Service)
- Configuration Loading: Load YAML files, environment variables, and secrets
- Foundation Services: Initialize core kernel (logger, config, DI container)
- Service Implementation: Register service-specific implementations
- Database Connection: Connect to database with own connection pool
- Database Migrations: Run service-specific migrations
- Service Registration: Register service with service registry
- Server Startup: Start gRPC server (and HTTP if needed)
- Lifecycle Hooks: Execute OnStart hooks
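The phases above can be sketched as an ordered list of steps that aborts on the first failure. This is an illustration only: `Phase`, `Bootstrap`, and the placeholder step functions are hypothetical names, and the real services may wire these phases through the DI container instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase is one named bootstrap step. The Run functions used below are
// placeholders, not the platform's real initializers.
type Phase struct {
	Name string
	Run  func() error
}

// Bootstrap runs phases in order and stops at the first failure. It returns
// the names of the phases that completed successfully.
func Bootstrap(phases []Phase) ([]string, error) {
	done := make([]string, 0, len(phases))
	for _, p := range phases {
		if err := p.Run(); err != nil {
			return done, fmt.Errorf("bootstrap phase %q: %w", p.Name, err)
		}
		done = append(done, p.Name)
	}
	return done, nil
}

// ok and fail are placeholder phase bodies for the demonstration.
func ok() error   { return nil }
func fail() error { return errors.New("connection refused") }

func main() {
	phases := []Phase{
		{"configuration", ok},
		{"foundation", ok},
		{"migrations", fail}, // simulated failure
		{"registration", ok}, // never reached
	}
	done, err := Bootstrap(phases)
	fmt.Println(done, err)
}
```

Stopping at the first failed phase means a service never registers itself or opens a port while a prerequisite is missing.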
Platform Startup Order
- Infrastructure: Start PostgreSQL, Redis, Kafka, Consul
- Core Services: Start Auth, Identity, Authz, Audit services (can start in parallel)
- API Gateway: Start API Gateway (depends on service registry)
- Feature Services: Start Blog, Billing, etc. (can start in parallel)
- Health Checks: All services report healthy to registry
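One way to picture this ordering is as a series of waves: each wave starts only after the previous one is up, while services inside a wave start concurrently. The sketch below assumes hypothetical `Wave` and `StartPlatform` names and a no-op start function. In practice an orchestrator such as Docker Compose or Kubernetes would play this role.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Wave is a group of services that may start concurrently.
type Wave struct {
	Name     string
	Services []string
}

// StartPlatform starts each wave in order; services inside a wave start
// concurrently. It returns, per wave, the sorted names that were started.
func StartPlatform(waves []Wave, start func(service string)) [][]string {
	started := make([][]string, 0, len(waves))
	for _, w := range waves {
		var (
			mu    sync.Mutex
			wg    sync.WaitGroup
			names []string
		)
		for _, svc := range w.Services {
			wg.Add(1)
			go func(s string) {
				defer wg.Done()
				start(s)
				mu.Lock()
				names = append(names, s)
				mu.Unlock()
			}(svc)
		}
		wg.Wait() // the next wave begins only after this one is up
		sort.Strings(names)
		started = append(started, names)
	}
	return started
}

func main() {
	waves := []Wave{
		{"infrastructure", []string{"postgres", "redis", "kafka", "consul"}},
		{"core", []string{"auth", "identity", "authz", "audit"}},
		{"gateway", []string{"api-gateway"}},
		{"features", []string{"blog", "billing"}},
	}
	for i, names := range StartPlatform(waves, func(string) {}) {
		fmt.Println(waves[i].Name, names)
	}
}
```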
Request Processing Pipeline
Every HTTP request flows through the API Gateway first, then to backend services. The pipeline ensures security, observability, and proper error handling.
graph TD
Start([HTTP Request]) --> Gateway[API Gateway]
Gateway --> RateLimit[Rate Limiting]
RateLimit -->|Allowed| Auth[Validate JWT via Auth Service]
RateLimit -->|Exceeded| Error0[429 Too Many Requests]
Auth -->|Valid Token| Authz[Check Permission via Authz Service]
Auth -->|Invalid Token| Error1[401 Unauthorized]
Authz -->|Authorized| UserRateLimit[Per-User Rate Limiting]
Authz -->|Unauthorized| Error2[403 Forbidden]
UserRateLimit -->|Within Limits| Tracing[OpenTelemetry Tracing]
UserRateLimit -->|Rate Limited| Error3[429 Too Many Requests]
Tracing --> Handler[Request Handler]
Handler --> Service[Domain Service]
Service --> Cache{Cache Check}
Cache -->|Hit| Return[Return Cached Data]
Cache -->|Miss| Repo[Repository]
Repo --> DB[(Database)]
DB --> Repo
Repo --> Service
Service --> CacheStore[Update Cache]
Service --> EventBus[Publish Events]
Service --> Audit[Audit Logging]
Service --> Metrics[Update Metrics]
Service --> Handler
Handler --> Tracing
Tracing --> Response[HTTP Response]
Error0 --> Response
Error1 --> Response
Error2 --> Response
Error3 --> Response
Return --> Response
style Auth fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
style Authz fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
style Service fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
Request Processing Stages
- Authentication: Extract and validate JWT token, add user to context
- Authorization: Check user permissions for requested resource
- Rate Limiting: Enforce per-user and per-IP rate limits
- Tracing: Start/continue distributed trace
- Handler Processing: Execute request handler
- Service Logic: Execute business logic
- Data Access: Query database or cache
- Side Effects: Publish events, audit logs, update metrics
- Response: Return HTTP response with tracing context
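The first three stages map naturally onto HTTP middleware. The following is a simplified sketch, not the gateway's actual code: the token comparison stands in for JWT validation via the Auth service, the `/admin` prefix rule stands in for a call to the Authz service, and the rate limiter uses a fixed request budget rather than a time window. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// Middleware wraps a handler with one pipeline stage.
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares so that the first one listed runs first.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// Authenticate rejects requests that lack the expected bearer token.
func Authenticate(token string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.Header.Get("Authorization") != "Bearer "+token {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// Authorize denies every path under /admin (placeholder permission rule).
func Authorize() Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if strings.HasPrefix(r.URL.Path, "/admin") {
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// RateLimit allows at most budget requests in total (no time window).
func RateLimit(budget int) Middleware {
	var mu sync.Mutex
	used := 0
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			mu.Lock()
			used++
			over := used > budget
			mu.Unlock()
			if over {
				http.Error(w, "too many requests", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// newPipeline wires the stages in the order listed above.
func newPipeline(budget int) http.Handler {
	handler := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return Chain(handler, Authenticate("secret"), Authorize(), RateLimit(budget))
}

// statusFor sends one GET request through h and returns the status code.
func statusFor(h http.Handler, path, auth string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := newPipeline(1)
	fmt.Println(statusFor(h, "/posts", ""))              // 401
	fmt.Println(statusFor(h, "/admin", "Bearer secret")) // 403
	fmt.Println(statusFor(h, "/posts", "Bearer secret")) // 200
	fmt.Println(statusFor(h, "/posts", "Bearer secret")) // 429
}
```

Each stage short-circuits with its own status code, so rejected requests never reach the handler and, in this ordering, do not consume rate-limit budget.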
Event-Driven Interactions
The platform uses an event bus for asynchronous communication between services, enabling loose coupling and scalability.
sequenceDiagram
participant Publisher
participant EventBus
participant Kafka
participant Subscriber1
participant Subscriber2
Publisher->>EventBus: Publish(event)
EventBus->>EventBus: Serialize event
EventBus->>EventBus: Add metadata (trace_id, user_id)
EventBus->>Kafka: Send to topic
Kafka-->>EventBus: Acknowledged
Kafka->>Subscriber1: Deliver event
Kafka->>Subscriber2: Deliver event
Subscriber1->>Subscriber1: Process event
Subscriber1->>Subscriber1: Update state
Subscriber1->>Subscriber1: Emit new events (optional)
Subscriber2->>Subscriber2: Process event
Subscriber2->>Subscriber2: Update state
Note over Subscriber1,Subscriber2: Events processed asynchronously
Event Processing Flow
- Event Publishing: Service publishes event to event bus
- Event Serialization: Event is serialized with metadata
- Event Distribution: Event bus distributes to Kafka topic
- Event Consumption: Subscribers consume events from Kafka
- Event Processing: Each subscriber processes event independently
- State Updates: Subscribers update their own state
- Cascade Events: Subscribers may publish new events
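The publish, serialize, and distribute steps can be illustrated with an in-memory stand-in for the Kafka-backed bus. The `Envelope` layout and the `Bus` API below are assumptions made for this sketch. Only the `trace_id` and `user_id` metadata fields come from the diagram above. Delivery is synchronous here, whereas Kafka consumers would receive events asynchronously.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wire format: the serialized payload plus the
// metadata fields named in the diagram (trace_id, user_id).
type Envelope struct {
	Topic   string          `json:"topic"`
	TraceID string          `json:"trace_id"`
	UserID  string          `json:"user_id"`
	Payload json.RawMessage `json:"payload"`
}

// Bus is an in-memory stand-in for the Kafka-backed event bus.
type Bus struct {
	subscribers map[string][]func(Envelope)
}

// NewBus returns an empty bus.
func NewBus() *Bus {
	return &Bus{subscribers: map[string][]func(Envelope){}}
}

// Subscribe registers fn to receive every event published on topic.
func (b *Bus) Subscribe(topic string, fn func(Envelope)) {
	b.subscribers[topic] = append(b.subscribers[topic], fn)
}

// Publish serializes payload, wraps it with metadata, and hands the envelope
// to each subscriber of the topic in registration order.
func (b *Bus) Publish(topic, traceID, userID string, payload any) error {
	raw, err := json.Marshal(payload)
	if err != nil {
		return fmt.Errorf("serialize event: %w", err)
	}
	env := Envelope{Topic: topic, TraceID: traceID, UserID: userID, Payload: raw}
	for _, fn := range b.subscribers[topic] {
		fn(env)
	}
	return nil
}

func main() {
	bus := NewBus()
	bus.Subscribe("user.created", func(e Envelope) {
		fmt.Printf("audit: %s by %s (trace %s)\n", e.Payload, e.UserID, e.TraceID)
	})
	bus.Subscribe("user.created", func(Envelope) {
		fmt.Println("mailer: queue welcome email")
	})
	if err := bus.Publish("user.created", "trace-1", "admin", map[string]string{"id": "42"}); err != nil {
		fmt.Println("publish failed:", err)
	}
}
```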
Background Job Processing
Background jobs are scheduled and processed asynchronously, enabling long-running tasks and scheduled operations.
sequenceDiagram
participant Scheduler
participant JobQueue
participant Worker
participant Service
participant DB
participant EventBus
Scheduler->>JobQueue: Enqueue job
JobQueue->>JobQueue: Store job definition
Worker->>JobQueue: Poll for jobs
JobQueue-->>Worker: Job definition
Worker->>Worker: Start job execution
Worker->>Service: Execute job logic
Service->>DB: Update data
Service->>EventBus: Publish events
Service-->>Worker: Job complete
Worker->>JobQueue: Mark job complete
alt Job fails
Worker->>JobQueue: Mark job failed
JobQueue->>JobQueue: Schedule retry
end
Background Job Flow
- Job Scheduling: Jobs scheduled via cron or programmatically
- Job Enqueueing: Job definition stored in job queue
- Job Polling: Workers poll queue for available jobs
- Job Execution: Worker executes job logic
- Job Completion: Job marked as complete or failed
- Job Retry: Failed jobs retried with exponential backoff
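As an illustration of the retry step, the delay before each retry can be computed by doubling a base delay up to a cap. The function name `Backoff`, the 1-based attempt numbering, and the specific base and cap values are assumptions for this sketch. The document does not specify the platform's actual parameters or whether jitter is applied.

```go
package main

import (
	"fmt"
	"time"
)

// Backoff returns the delay before retry number attempt (1-based). The delay
// starts at base, doubles on each further attempt, and never exceeds limit.
func Backoff(attempt int, base, limit time.Duration) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	d := base
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= limit {
			return limit // stop early so the value cannot overflow
		}
	}
	if d > limit {
		return limit
	}
	return d
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Println(attempt, Backoff(attempt, time.Second, 10*time.Second))
	}
	// Delays: 1s, 2s, 4s, 8s, then capped at 10s.
}
```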
Error Recovery and Resilience
The platform implements multiple layers of error handling to ensure system resilience.
graph TD
Error[Error Occurs] --> Handler{Error Handler}
Handler -->|Business Error| BusinessError[Business Error Handler]
Handler -->|System Error| SystemError[System Error Handler]
Handler -->|Panic| PanicHandler[Panic Recovery]
BusinessError --> ErrorBus[Error Bus]
SystemError --> ErrorBus
PanicHandler --> ErrorBus
ErrorBus --> Logger[Logger]
ErrorBus --> Sentry[Sentry]
ErrorBus --> Metrics[Metrics]
BusinessError --> Response[HTTP Response]
SystemError --> Response
PanicHandler --> Response
Response --> Client[Client]
style Error fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
style ErrorBus fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
Error Handling Layers
- Panic Recovery: Middleware catches panics and prevents crashes
- Error Classification: Errors classified as business or system errors
- Error Bus: Central error bus collects all errors
- Error Logging: Errors logged with full context
- Error Reporting: Critical errors reported to Sentry
- Error Metrics: Errors tracked in metrics
- Error Response: Appropriate HTTP response returned
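These layers can be sketched as a single wrapper around a request handler. The example below is an assumption-laden illustration: `BusinessError`, `Wrap`, and the `report` callback (standing in for the error bus) are hypothetical names, and mapping business errors to 400 and everything else to 500 is one possible policy, not the platform's documented one.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// BusinessError marks an error caused by the caller's input or state.
type BusinessError struct{ Msg string }

func (e *BusinessError) Error() string { return e.Msg }

// StatusFor classifies err: business errors map to 400, all others to 500.
func StatusFor(err error) int {
	var be *BusinessError
	if errors.As(err, &be) {
		return http.StatusBadRequest
	}
	return http.StatusInternalServerError
}

// ErrorHandler is a request handler that may return an error.
type ErrorHandler func(w http.ResponseWriter, r *http.Request) error

// Wrap recovers panics, passes every failure to report (the error bus
// stand-in), and writes a response matching the error class.
func Wrap(h ErrorHandler, report func(error)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				report(fmt.Errorf("panic: %v", rec))
				http.Error(w, "internal error", http.StatusInternalServerError)
			}
		}()
		if err := h(w, r); err != nil {
			report(err)
			status := StatusFor(err)
			msg := "internal error" // do not leak system error details
			if status == http.StatusBadRequest {
				msg = err.Error()
			}
			http.Error(w, msg, status)
		}
	})
}

// Placeholder handlers that fail in the three ways described above.
func failBusiness(http.ResponseWriter, *http.Request) error {
	return &BusinessError{Msg: "title is required"}
}
func failSystem(http.ResponseWriter, *http.Request) error {
	return errors.New("database unavailable")
}
func panics(http.ResponseWriter, *http.Request) error { panic("nil map write") }

// serve runs one request through the wrapped handler and returns the status.
func serve(h ErrorHandler, report func(error)) int {
	rec := httptest.NewRecorder()
	Wrap(h, report).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	report := func(err error) { fmt.Println("error bus:", err) }
	fmt.Println(serve(failBusiness, report)) // 400
	fmt.Println(serve(failSystem, report))   // 500
	fmt.Println(serve(panics, report))       // 500
}
```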
System Shutdown Sequence
The platform implements graceful shutdown to ensure data consistency and proper resource cleanup.
sequenceDiagram
participant Signal
participant Main
participant HTTP
participant gRPC
participant ServiceRegistry
participant DI
participant Workers
participant DB
Signal->>Main: SIGTERM/SIGINT
Main->>HTTP: Stop accepting requests
HTTP->>HTTP: Wait for active requests
HTTP-->>Main: HTTP server stopped
Main->>gRPC: Stop accepting connections
gRPC->>gRPC: Wait for active calls
gRPC-->>Main: gRPC server stopped
Main->>ServiceRegistry: Deregister service
ServiceRegistry->>ServiceRegistry: Remove from registry
ServiceRegistry-->>Main: Service deregistered
Main->>Workers: Stop workers
Workers->>Workers: Finish current jobs
Workers-->>Main: Workers stopped
Main->>DI: Stop lifecycle
DI->>DI: Execute OnStop hooks
DI->>DI: Close connections
DI->>DB: Close DB connections
DI-->>Main: Services stopped
Main->>Main: Exit
Shutdown Phases
- Signal Reception: Receive SIGTERM or SIGINT
- Stop Accepting Requests: HTTP and gRPC servers stop accepting new requests
- Wait for Active Requests: Wait for in-flight requests to complete
- Service Deregistration: Remove service from service registry
- Worker Shutdown: Stop background workers gracefully
- Lifecycle Hooks: Execute OnStop hooks for all services
- Resource Cleanup: Close database connections, release resources
- Application Exit: Exit application cleanly
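The phases above can be expressed as ordered steps that share one deadline. The sketch below uses hypothetical names (`Step`, `Shutdown`) and placeholder step bodies. A real service would typically trigger it after receiving SIGTERM or SIGINT, for example via `signal.NotifyContext` from `os/signal`. It assumes Go 1.20 or later for `errors.Join`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Step is one named shutdown action that should respect ctx's deadline.
type Step struct {
	Name string
	Stop func(ctx context.Context) error
}

// Shutdown runs every step in order under one shared deadline. A failing
// step is recorded but does not prevent later steps from running, so
// remaining resources are still released.
func Shutdown(timeout time.Duration, steps []Step) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	var (
		ran  []string
		errs []error
	)
	for _, s := range steps {
		ran = append(ran, s.Name)
		if err := s.Stop(ctx); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.Name, err))
		}
	}
	return ran, errors.Join(errs...)
}

// noop and failing are placeholder step bodies for the demonstration.
func noop(context.Context) error    { return nil }
func failing(context.Context) error { return errors.New("registry unreachable") }

func main() {
	steps := []Step{
		{"http server", noop},
		{"grpc server", noop},
		{"deregister", failing}, // simulated failure; later steps still run
		{"workers", noop},
		{"lifecycle hooks", noop},
		{"database", noop},
	}
	ran, err := Shutdown(5*time.Second, steps)
	fmt.Println(ran)
	fmt.Println("errors:", err)
}
```

Continuing past a failed step is a deliberate choice in this sketch: an unreachable registry should not leave database connections open.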
Health Check and Monitoring Flow
Health checks and metrics provide visibility into system health and performance.
graph TD
HealthEndpoint["/healthz"] --> HealthRegistry[Health Registry]
HealthRegistry --> CheckDB[Check Database]
HealthRegistry --> CheckCache[Check Cache]
HealthRegistry --> CheckEventBus[Check Event Bus]
CheckDB -->|Healthy| Aggregate[Aggregate Results]
CheckCache -->|Healthy| Aggregate
CheckEventBus -->|Healthy| Aggregate
Aggregate -->|All Healthy| Response200[200 OK]
Aggregate -->|Unhealthy| Response503[503 Service Unavailable]
MetricsEndpoint["/metrics"] --> MetricsRegistry[Metrics Registry]
MetricsRegistry --> Prometheus[Prometheus Format]
Prometheus --> ResponseMetrics[Metrics Response]
style HealthRegistry fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
style MetricsRegistry fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
Health Check Components
- Liveness Check: Service is running (process health)
- Readiness Check: Service is ready to accept requests (dependency health)
- Dependency Checks: Database, cache, event bus connectivity
- Metrics Collection: Request counts, durations, error rates
- Metrics Export: Prometheus-formatted metrics
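The aggregation step for the readiness check can be sketched as follows. `Check` and `Aggregate` are hypothetical names, and the result strings are illustrative. The 200 and 503 outcomes follow the diagram above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Check probes one dependency and returns nil when it is healthy.
type Check func() error

// Aggregate runs all checks and returns the HTTP status for the health
// endpoint together with a per-dependency result map.
func Aggregate(checks map[string]Check) (int, map[string]string) {
	status := http.StatusOK
	results := make(map[string]string, len(checks))
	for name, check := range checks {
		if err := check(); err != nil {
			status = http.StatusServiceUnavailable
			results[name] = "unhealthy: " + err.Error()
			continue
		}
		results[name] = "healthy"
	}
	return status, results
}

// healthy and down are placeholder probes for the demonstration.
func healthy() error { return nil }
func down() error    { return errors.New("connection refused") }

func main() {
	code, results := Aggregate(map[string]Check{
		"database":  healthy,
		"cache":     down,
		"event_bus": healthy,
	})
	fmt.Println(code)
	fmt.Println(results)
}
```

A single failing dependency is enough to return 503, which lets the service registry or load balancer take the instance out of rotation until it recovers.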
Integration Points
The behavior described in this document connects to the following areas:
- Service Orchestration: How services coordinate during startup and operation
- Module Integration Patterns: How modules integrate during bootstrap
- Operational Scenarios: Specific operational flows and use cases
- Data Flow Patterns: Detailed data flow through the system
- Architecture Overview: System architecture and component relationships
Related Documentation
- Architecture Overview - System architecture
- Service Orchestration - Service coordination
- Module Integration Patterns - Module integration
- Operational Scenarios - Common operational flows
- Component Relationships - Component dependencies