Transform all documentation from modular monolith to true microservices
architecture where core services are independently deployable.

Key Changes:
- Core Kernel: Infrastructure only (no business logic)
- Core Services: Auth, Identity, Authz, Audit as separate microservices
  - Each service has own entry point (cmd/{service}/)
  - Each service has own gRPC server and database schema
  - Services register with Consul for service discovery
- API Gateway: Moved from Epic 8 to Epic 1 as core infrastructure
  - Single entry point for all external traffic
  - Handles routing, JWT validation, rate limiting, CORS
- Service Discovery: Consul as primary mechanism (ADR-0033)
- Database Pattern: Per-service connections with schema isolation

Documentation Updates:
- Updated all 9 architecture documents
- Updated 4 ADRs and created 2 new ADRs (API Gateway, Service Discovery)
- Rewrote Epic 1: Core Kernel & Infrastructure (infrastructure only)
- Rewrote Epic 2: Core Services (Auth, Identity, Authz, Audit as services)
- Updated Epic 3-8 stories for service architecture
- Updated plan.md, playbook.md, requirements.md, index.md
- Updated all epic READMEs and story files

New ADRs:
- ADR-0032: API Gateway Strategy
- ADR-0033: Service Discovery Implementation (Consul)

New Stories:
- Epic 1.7: Service Client Interfaces
- Epic 1.8: API Gateway Implementation

# Service Orchestration

## Purpose

This document explains how services work together in the Go Platform's microservices architecture, focusing on service lifecycle management, discovery, communication patterns, and failure handling.

## Overview

The Go Platform consists of multiple independent services that communicate via service clients (gRPC/HTTP) and share infrastructure components. Services are discovered and registered through a service registry (Consul), enabling dynamic service location and health monitoring.

## Key Concepts

- **Service**: Independent process providing specific functionality
- **Service Registry**: Central registry for service discovery (Consul - primary, Kubernetes as alternative)
- **Service Client**: Abstraction for inter-service communication
- **Service Discovery**: Process of locating services by name
- **Service Health**: Health status of a service (healthy, unhealthy, degraded)

## Service Lifecycle Management

Services follow a well-defined lifecycle from startup to shutdown.

```mermaid
stateDiagram-v2
    [*] --> Starting: Service starts
    Starting --> Registering: Initialize services
    Registering --> StartingServer: Register with service registry
    StartingServer --> Running: Start HTTP/gRPC servers
    Running --> Healthy: Health checks pass
    Running --> Unhealthy: Health checks fail
    Unhealthy --> Running: Health checks recover
    Healthy --> Degrading: Dependency issues
    Degrading --> Healthy: Dependencies recover
    Degrading --> Unhealthy: Critical failure
    Running --> ShuttingDown: Receive shutdown signal
    ShuttingDown --> Deregistering: Stop accepting requests
    Deregistering --> Stopped: Deregister from registry
    Stopped --> [*]
```

### Lifecycle States

1. **Starting**: Service is initializing and loading configuration
2. **Registering**: Service registers with the service registry
3. **Starting Server**: HTTP and gRPC servers are starting
4. **Running**: Service is running and processing requests
5. **Healthy**: All health checks are passing
6. **Unhealthy**: Health checks are failing
7. **Degrading**: Service is operational but with degraded functionality
8. **Shutting Down**: Service has received a shutdown signal and stops accepting requests
9. **Deregistering**: Service is removing itself from the registry
10. **Stopped**: Service has stopped

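The lifecycle can be read as a small state machine. The following is a minimal sketch, not the platform's implementation: the `State` type, the constant names, and `CanTransition` are hypothetical, and the transition table simply copies the edges drawn in the state diagram.

```go
package main

import "fmt"

// State is a hypothetical name for one lifecycle state from the diagram.
type State string

const (
	Starting       State = "Starting"
	Registering    State = "Registering"
	StartingServer State = "StartingServer"
	Running        State = "Running"
	Healthy        State = "Healthy"
	Unhealthy      State = "Unhealthy"
	Degrading      State = "Degrading"
	ShuttingDown   State = "ShuttingDown"
	Deregistering  State = "Deregistering"
	Stopped        State = "Stopped"
)

// transitions mirrors the edges of the state diagram, nothing more.
var transitions = map[State][]State{
	Starting:       {Registering},
	Registering:    {StartingServer},
	StartingServer: {Running},
	Running:        {Healthy, Unhealthy, ShuttingDown},
	Healthy:        {Degrading},
	Unhealthy:      {Running},
	Degrading:      {Healthy, Unhealthy},
	ShuttingDown:   {Deregistering},
	Deregistering:  {Stopped},
}

// CanTransition reports whether the diagram allows moving from one state to another.
func CanTransition(from, to State) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Starting, Registering)) // true
	fmt.Println(CanTransition(Stopped, Running))      // false
}
```

Keeping the allowed transitions in one table makes it easy to reject an invalid state change (for example, serving traffic before registration) in a single place.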
## Service Discovery and Registration

Services automatically register themselves with the service registry on startup and deregister on shutdown.

```mermaid
sequenceDiagram
    participant Service
    participant ServiceRegistry
    participant Registry as Consul<br/>Service Registry
    participant Client

    Service->>ServiceRegistry: Register(serviceInfo)
    ServiceRegistry->>Registry: Register service
    Registry->>Registry: Store service info
    Registry-->>ServiceRegistry: Registration confirmed
    ServiceRegistry-->>Service: Service registered

    Note over Service: Service starts health checks

    loop Every health check interval
        Service->>ServiceRegistry: Update health status
        ServiceRegistry->>Registry: Update health
    end

    Client->>ServiceRegistry: Discover(serviceName)
    ServiceRegistry->>Registry: Query services
    Registry-->>ServiceRegistry: Service list
    ServiceRegistry->>ServiceRegistry: Filter healthy services
    ServiceRegistry->>ServiceRegistry: Load balance
    ServiceRegistry-->>Client: Service endpoint

    Client->>Service: Connect via gRPC/HTTP

    Service->>ServiceRegistry: Deregister()
    ServiceRegistry->>Registry: Remove service
    Registry-->>ServiceRegistry: Service removed
```

### Service Registration Process

1. **Service Startup**: Service initializes and loads configuration
2. **Service Info Creation**: Create service info with name, version, address, protocol
3. **Registry Registration**: Register service with Consul (primary) or Kubernetes service discovery (alternative)
4. **Health Check Setup**: Start health check endpoint
5. **Health Status Updates**: Periodically update health status in registry
6. **Service Discovery**: Clients query registry for service endpoints
7. **Load Balancing**: Registry returns healthy service instances
8. **Service Deregistration**: On shutdown, remove service from registry

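The registration, discovery, and deregistration steps can be sketched with an in-memory registry. This is an illustration under stated assumptions, not the Consul-backed implementation: `ServiceInfo`, `MemoryRegistry`, and all method names are hypothetical, the addresses are made up, and round-robin is only one possible load-balancing choice.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ServiceInfo is a hypothetical record of what a service registers.
type ServiceInfo struct {
	ID       string
	Name     string
	Version  string
	Address  string
	Protocol string
	Healthy  bool
}

// ErrNoHealthyInstance is returned when discovery finds nothing routable.
var ErrNoHealthyInstance = errors.New("no healthy instance")

// MemoryRegistry is an in-memory stand-in for Consul, for illustration only.
type MemoryRegistry struct {
	mu        sync.Mutex
	instances map[string][]ServiceInfo // keyed by service name
	next      map[string]int           // round-robin cursor per service name
}

func NewMemoryRegistry() *MemoryRegistry {
	return &MemoryRegistry{
		instances: make(map[string][]ServiceInfo),
		next:      make(map[string]int),
	}
}

// Register stores a new instance under its service name.
func (r *MemoryRegistry) Register(info ServiceInfo) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.instances[info.Name] = append(r.instances[info.Name], info)
}

// SetHealth records the latest health status reported by an instance.
func (r *MemoryRegistry) SetHealth(name, id string, healthy bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := range r.instances[name] {
		if r.instances[name][i].ID == id {
			r.instances[name][i].Healthy = healthy
		}
	}
}

// Deregister removes an instance, as a service would do on shutdown.
func (r *MemoryRegistry) Deregister(name, id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	kept := r.instances[name][:0]
	for _, inst := range r.instances[name] {
		if inst.ID != id {
			kept = append(kept, inst)
		}
	}
	r.instances[name] = kept
}

// Discover filters out unhealthy instances and picks one round-robin.
func (r *MemoryRegistry) Discover(name string) (ServiceInfo, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	var healthy []ServiceInfo
	for _, inst := range r.instances[name] {
		if inst.Healthy {
			healthy = append(healthy, inst)
		}
	}
	if len(healthy) == 0 {
		return ServiceInfo{}, ErrNoHealthyInstance
	}
	picked := healthy[r.next[name]%len(healthy)]
	r.next[name]++
	return picked, nil
}

func main() {
	reg := NewMemoryRegistry()
	reg.Register(ServiceInfo{ID: "auth-1", Name: "auth", Address: "10.0.0.1:9090", Protocol: "grpc", Healthy: true})
	reg.Register(ServiceInfo{ID: "auth-2", Name: "auth", Address: "10.0.0.2:9090", Protocol: "grpc", Healthy: true})
	reg.SetHealth("auth", "auth-2", false) // auth-2 starts failing its checks

	inst, err := reg.Discover("auth")
	fmt.Println(inst.Address, err) // only auth-1 is eligible
	reg.Deregister("auth", "auth-1")
}
```

A real registry client would also need timeouts and a way to expire instances that stop reporting health; those concerns are left out to keep the sketch short.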
## Service Communication Patterns

Services communicate through well-defined patterns using service clients.

```mermaid
graph TB
    subgraph "Service A"
        ServiceA[Service A Handler]
        ClientA[Service Client]
    end

    subgraph "Service Registry"
        Registry[Service Registry]
    end

    subgraph "Service B"
        ServiceB[Service B Handler]
        ServerB[gRPC Server]
    end

    subgraph "Service C"
        ServiceC[Service C Handler]
    end

    subgraph "Event Bus"
        EventBus[Event Bus<br/>Kafka]
    end

    ServiceA -->|Discover| Registry
    Registry -->|Service B endpoint| ClientA
    ClientA -->|gRPC Call| ServerB
    ServerB --> ServiceB
    ServiceB -->|Response| ClientA

    ServiceA -->|Publish Event| EventBus
    EventBus -->|Subscribe| ServiceC
    ServiceC -->|Process Event| ServiceC

    style ClientA fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style ServerB fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style EventBus fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff
```

### Communication Patterns

#### Synchronous Communication (gRPC/HTTP)

```mermaid
sequenceDiagram
    participant Client
    participant ServiceClient
    participant Registry
    participant Service

    Client->>ServiceClient: Call service method
    ServiceClient->>Registry: Discover service
    Registry-->>ServiceClient: Service endpoint
    ServiceClient->>Service: gRPC/HTTP call
    Service->>Service: Process request
    Service-->>ServiceClient: Response
    ServiceClient-->>Client: Return result

    alt Service unavailable
        ServiceClient->>Registry: Retry discovery
        Registry-->>ServiceClient: Alternative endpoint
        ServiceClient->>Service: Retry call
    end
```

#### Asynchronous Communication (Event Bus)

```mermaid
sequenceDiagram
    participant Publisher
    participant EventBus
    participant Kafka
    participant Subscriber1
    participant Subscriber2

    Publisher->>EventBus: Publish event
    EventBus->>Kafka: Send to topic
    Kafka-->>EventBus: Acknowledged

    Kafka->>Subscriber1: Deliver event
    Kafka->>Subscriber2: Deliver event

    Subscriber1->>Subscriber1: Process event
    Subscriber2->>Subscriber2: Process event

    Note over Subscriber1,Subscriber2: Events processed independently
```

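The publish/subscribe flow can be illustrated with a tiny in-process bus. This sketch only shows the fan-out shape of the pattern: `Event`, `Handler`, `MemoryBus`, and the topic name are hypothetical, and delivery here is synchronous and in-memory, whereas the platform's event bus is backed by Kafka and delivers asynchronously.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical envelope for a published message.
type Event struct {
	Topic   string
	Payload string
}

// Handler processes one event on behalf of a subscriber.
type Handler func(Event)

// MemoryBus is an in-process stand-in for the Kafka-backed event bus.
type MemoryBus struct {
	mu   sync.RWMutex
	subs map[string][]Handler
}

func NewMemoryBus() *MemoryBus {
	return &MemoryBus{subs: make(map[string][]Handler)}
}

// Subscribe registers a handler for a topic.
func (b *MemoryBus) Subscribe(topic string, h Handler) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[topic] = append(b.subs[topic], h)
}

// Publish hands the event to every subscriber of its topic and
// returns how many handlers received it.
func (b *MemoryBus) Publish(e Event) int {
	b.mu.RLock()
	handlers := append([]Handler(nil), b.subs[e.Topic]...)
	b.mu.RUnlock()
	for _, h := range handlers {
		h(e) // each subscriber processes the event independently
	}
	return len(handlers)
}

func main() {
	bus := NewMemoryBus()
	bus.Subscribe("blog.post.created", func(e Event) { fmt.Println("analytics got:", e.Payload) })
	bus.Subscribe("blog.post.created", func(e Event) { fmt.Println("audit got:", e.Payload) })
	bus.Publish(Event{Topic: "blog.post.created", Payload: "post-42"})
}
```

The publisher never references its subscribers directly, which is what lets services such as Analytics consume events without the publishing service depending on them.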
## Service Dependency Graph

Services have dependencies that determine startup ordering and communication patterns.

```mermaid
graph TD
    subgraph "Core Services"
        Identity[Identity Service]
        Auth[Auth Service]
        Authz[Authz Service]
        Audit[Audit Service]
    end

    subgraph "Feature Services"
        Blog[Blog Service]
        Billing[Billing Service]
        Analytics[Analytics Service]
    end

    subgraph "Infrastructure Services"
        Registry[Service Registry]
        EventBus[Event Bus]
        Cache[Cache Service]
    end

    Auth --> Identity
    Auth --> Registry
    Authz --> Identity
    Authz --> Cache
    Authz --> Audit
    Audit --> Registry

    Blog --> Authz
    Blog --> Identity
    Blog --> Audit
    Blog --> Registry
    Blog --> EventBus
    Blog --> Cache

    Billing --> Authz
    Billing --> Identity
    Billing --> Registry
    Billing --> EventBus

    Analytics --> EventBus
    Analytics --> Registry

    style Identity fill:#4a90e2,stroke:#2e5c8a,stroke-width:3px,color:#fff
    style Auth fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Blog fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff
```

### Dependency Types

1. **Hard Dependencies**: Service cannot start without dependency (e.g., Auth depends on Identity)
2. **Soft Dependencies**: Service can start but with degraded functionality
3. **Runtime Dependencies**: Dependencies discovered at runtime via service registry

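When hard dependencies determine startup ordering, one way to derive that order is a topological sort of the dependency graph. The sketch below is an assumption about how this could be done, not a description of the platform's tooling: `StartupOrder` is a hypothetical helper, and the example map includes only a subset of the service-to-service edges from the diagram.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// StartupOrder returns the services so that every service appears after
// the services it depends on. It uses a depth-first topological sort and
// reports an error if the dependencies contain a cycle.
func StartupOrder(deps map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		visiting
		done
	)
	state := make(map[string]int)
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case done:
			return nil
		case visiting:
			return errors.New("dependency cycle at " + name)
		}
		state[name] = visiting
		sorted := append([]string(nil), deps[name]...)
		sort.Strings(sorted) // sorted only to make the result deterministic
		for _, dep := range sorted {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[name] = done
		order = append(order, name)
		return nil
	}

	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if err := visit(name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	// A subset of the edges in the dependency graph above.
	deps := map[string][]string{
		"Identity": {},
		"Audit":    {},
		"Auth":     {"Identity"},
		"Authz":    {"Identity", "Audit"},
		"Blog":     {"Authz", "Identity", "Audit"},
	}
	order, err := StartupOrder(deps)
	fmt.Println(order, err) // [Audit Identity Auth Authz Blog] <nil>
}
```

Soft and runtime dependencies would typically be left out of this map, since a service is expected to start without them.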
## Service Health and Failure Handling

Services continuously report their health status, enabling automatic failure detection and recovery.

```mermaid
graph TD
    Service[Service] --> HealthCheck[Health Check Endpoint]
    HealthCheck --> CheckDB[Check Database]
    HealthCheck --> CheckCache[Check Cache]
    HealthCheck --> CheckDeps[Check Dependencies]

    CheckDB -->|Healthy| Aggregate[Aggregate Health]
    CheckCache -->|Healthy| Aggregate
    CheckDeps -->|Healthy| Aggregate

    Aggregate -->|All Healthy| Healthy[Healthy Status]
    Aggregate -->|Degraded| Degraded[Degraded Status]
    Aggregate -->|Unhealthy| Unhealthy[Unhealthy Status]

    Healthy --> Registry[Update Registry]
    Degraded --> Registry
    Unhealthy --> Registry

    Registry --> LoadBalancer[Load Balancer]
    LoadBalancer -->|Healthy Only| RouteTraffic[Route Traffic]
    LoadBalancer -->|Unhealthy| NoTraffic[No Traffic]

    style Healthy fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Degraded fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Unhealthy fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
```

### Health Check Types

1. **Liveness Check**: Service process is running
2. **Readiness Check**: Service is ready to accept requests
3. **Dependency Checks**: Database, cache, and other dependencies are accessible
4. **Business Health**: Service-specific health indicators

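Aggregating individual checks into a single status can be sketched as follows. The mapping used here is an assumption, since the document does not define it: a failed critical check yields unhealthy, a failed non-critical check yields degraded, and otherwise the service is healthy. `Status`, `Check`, and `Aggregate` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is the aggregated health of a service.
type Status int

const (
	StatusHealthy Status = iota
	StatusDegraded
	StatusUnhealthy
)

func (s Status) String() string {
	return [...]string{"healthy", "degraded", "unhealthy"}[s]
}

// errCheckFailed is a placeholder failure used by the example checks.
var errCheckFailed = errors.New("check failed")

// Check is one health probe. Critical marks dependencies the service
// cannot work without (an assumption made for this sketch).
type Check struct {
	Name     string
	Critical bool
	Run      func() error
}

// Aggregate runs every check and folds the results into one status.
func Aggregate(checks []Check) Status {
	status := StatusHealthy
	for _, c := range checks {
		if err := c.Run(); err != nil {
			if c.Critical {
				return StatusUnhealthy
			}
			status = StatusDegraded
		}
	}
	return status
}

func main() {
	checks := []Check{
		{Name: "database", Critical: true, Run: func() error { return nil }},
		{Name: "cache", Critical: false, Run: func() error { return errCheckFailed }},
	}
	fmt.Println(Aggregate(checks)) // degraded
}
```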
### Failure Handling Strategies

#### Circuit Breaker Pattern

```mermaid
stateDiagram-v2
    [*] --> Closed: Service healthy
    Closed --> Open: Failure threshold exceeded
    Open --> HalfOpen: Timeout period
    HalfOpen --> Closed: Success
    HalfOpen --> Open: Failure
```

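The three breaker states translate into a small amount of code. The following is a simplified sketch under stated assumptions, not the platform's implementation: `Breaker`, `NewBreaker`, and `ErrOpen` are hypothetical names, the breaker opens once the failure count reaches the threshold, and concurrent probes in the half-open state are not limited.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// BreakerState is one of the three states in the diagram.
type BreakerState int

const (
	Closed BreakerState = iota
	Open
	HalfOpen
)

// ErrOpen is returned while the breaker rejects calls.
var ErrOpen = errors.New("circuit breaker is open")

// errUpstream is a placeholder failure used by the example.
var errUpstream = errors.New("upstream timeout")

// Breaker is a minimal circuit breaker for illustration.
type Breaker struct {
	mu        sync.Mutex
	state     BreakerState
	failures  int
	threshold int
	timeout   time.Duration
	openedAt  time.Time
}

func NewBreaker(threshold int, timeout time.Duration) *Breaker {
	return &Breaker{threshold: threshold, timeout: timeout}
}

// State returns the current breaker state.
func (b *Breaker) State() BreakerState {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

// Call runs fn unless the breaker is open and the timeout has not elapsed.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.state == Open {
		if time.Since(b.openedAt) < b.timeout {
			b.mu.Unlock()
			return ErrOpen // fail fast without calling the service
		}
		b.state = HalfOpen // timeout elapsed: allow a trial call
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.state == HalfOpen || b.failures >= b.threshold {
			b.state = Open
			b.openedAt = time.Now()
		}
		return err
	}
	b.state = Closed
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(2, time.Minute)
	fail := func() error { return errUpstream }
	for i := 0; i < 3; i++ {
		fmt.Println(b.Call(fail)) // the third call is rejected with ErrOpen
	}
}
```

Failing fast while open gives the struggling dependency time to recover instead of adding more load to it.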
#### Retry Strategy

```mermaid
sequenceDiagram
    participant Client
    participant Service

    Client->>Service: Request
    Service-->>Client: Failure

    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 1
    Service-->>Client: Failure

    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 2
    Service-->>Client: Success
```

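A retry loop with exponential backoff might look like the sketch below. `Retry` and `BackoffDelay` are hypothetical helpers, the delay doubles on each attempt starting from an assumed base, and jitter and a maximum delay are left out for brevity even though production clients usually want both.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a placeholder failure used by the example.
var errUnavailable = errors.New("service unavailable")

// BackoffDelay returns base * 2^attempt, where attempt starts at 0.
func BackoffDelay(base time.Duration, attempt int) time.Duration {
	return base << attempt
}

// Retry calls fn up to maxAttempts times, sleeping with exponential
// backoff between failed attempts. It returns nil on the first success.
func Retry(maxAttempts int, base time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			time.Sleep(BackoffDelay(base, attempt))
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Fails twice, then succeeds, matching the sequence diagram above.
	err := Retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errUnavailable
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```

Retries are only safe for operations that are idempotent or that the service can deduplicate.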
#### Service Degradation

When a service dependency fails, the service may continue operating with degraded functionality:

- **Cache Unavailable**: Service continues but without caching
- **Event Bus Unavailable**: Service continues but events are queued
- **Non-Critical Dependency Fails**: Service continues with reduced features

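The cache-unavailable case can be sketched as a read path that falls back to the source of truth. Everything here is hypothetical (`Cache`, `Lookup`, `ErrCacheMiss`, and the two example cache types); the point is only that a cache outage is reported as degradation rather than returned to the caller as a failure.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCacheMiss signals a normal miss, as opposed to a cache outage.
var ErrCacheMiss = errors.New("cache miss")

// Cache is a hypothetical read-only view of the cache dependency.
type Cache interface {
	Get(key string) (string, error)
}

// mapCache is a working in-memory cache used for the example.
type mapCache map[string]string

func (m mapCache) Get(key string) (string, error) {
	if v, ok := m[key]; ok {
		return v, nil
	}
	return "", ErrCacheMiss
}

// downCache simulates an unreachable cache.
type downCache struct{}

func (downCache) Get(string) (string, error) {
	return "", errors.New("cache: connection refused")
}

// Lookup tries the cache first and falls back to load. The degraded flag
// is true when the cache itself failed, so the caller can report a
// degraded health status while still serving the request.
func Lookup(c Cache, key string, load func(string) (string, error)) (value string, degraded bool, err error) {
	value, err = c.Get(key)
	if err == nil {
		return value, false, nil
	}
	degraded = !errors.Is(err, ErrCacheMiss)
	value, err = load(key)
	return value, degraded, err
}

func main() {
	load := func(key string) (string, error) { return "row-for-" + key, nil }
	v, degraded, err := Lookup(downCache{}, "user:7", load)
	fmt.Println(v, degraded, err) // row-for-user:7 true <nil>
}
```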
## Service Scaling Scenarios

Services can be scaled independently based on load and requirements.

```mermaid
graph TB
    subgraph "Load Balancer"
        LB[Load Balancer]
    end

    subgraph "Service Instances"
        Instance1[Service Instance 1<br/>Healthy]
        Instance2[Service Instance 2<br/>Healthy]
        Instance3[Service Instance 3<br/>Starting]
        Instance4[Service Instance 4<br/>Unhealthy]
    end

    subgraph "Service Registry"
        Registry[Service Registry]
    end

    subgraph "Infrastructure"
        DB[(Database)]
        Cache[(Cache)]
    end

    LB -->|Discover| Registry
    Registry -->|Healthy Instances| LB
    LB --> Instance1
    LB --> Instance2
    LB -.->|No Traffic| Instance3
    LB -.->|No Traffic| Instance4

    Instance1 --> DB
    Instance2 --> DB
    Instance3 --> DB
    Instance4 --> DB

    Instance1 --> Cache
    Instance2 --> Cache

    style Instance1 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance2 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance3 fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Instance4 fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
```

### Scaling Patterns

1. **Horizontal Scaling**: Add more service instances
2. **Vertical Scaling**: Increase resources for existing instances
3. **Auto-Scaling**: Automatically scale based on metrics
4. **Load-Based Routing**: Route traffic to healthy instances only

## Integration Points

This service orchestration integrates with:

- **[System Behavior Overview](system-behavior.md)**: How services behave during startup and operation
- **[Module Integration Patterns](module-integration-patterns.md)**: How modules are loaded as services
- **[Operational Scenarios](operational-scenarios.md)**: Service interaction in specific scenarios
- **[Architecture Overview](architecture.md)**: Overall system architecture

## Related Documentation

- [System Behavior Overview](system-behavior.md) - System-level behavior
- [Module Integration Patterns](module-integration-patterns.md) - Module service integration
- [Operational Scenarios](operational-scenarios.md) - Service interaction scenarios
- [Architecture Overview](architecture.md) - System architecture
- [ADR-0029: Microservices Architecture](../adr/0029-microservices-architecture.md) - Architecture decision
- [ADR-0030: Service Communication Strategy](../adr/0030-service-communication-strategy.md) - Communication patterns