# Service Orchestration

## Purpose

This document explains how services work together in the Go Platform's microservices architecture, focusing on service lifecycle management, discovery, communication patterns, and failure handling.

## Overview

The Go Platform consists of multiple independent services that communicate via service clients (gRPC/HTTP) and share infrastructure components. Services register with, and are discovered through, a service registry (Consul), which enables dynamic service location and health monitoring.

## Key Concepts

- **Service**: Independent process providing specific functionality
- **Service Registry**: Central registry for service discovery (Consul as the primary backend, Kubernetes as an alternative)
- **Service Client**: Abstraction for inter-service communication
- **Service Discovery**: Process of locating services by name
- **Service Health**: Health status of a service (healthy, unhealthy, degraded)

## Service Lifecycle Management

Services follow a well-defined lifecycle from startup to shutdown.

```mermaid
stateDiagram-v2
    [*] --> Starting: Service starts
    Starting --> Registering: Initialize services
    Registering --> StartingServer: Register with service registry
    StartingServer --> Running: Start HTTP/gRPC servers
    Running --> Healthy: Health checks pass
    Running --> Unhealthy: Health checks fail
    Unhealthy --> Running: Health checks recover
    Healthy --> Degrading: Dependency issues
    Degrading --> Healthy: Dependencies recover
    Degrading --> Unhealthy: Critical failure
    Running --> ShuttingDown: Receive shutdown signal
    ShuttingDown --> Deregistering: Stop accepting requests
    Deregistering --> Stopped: Deregister from registry
    Stopped --> [*]
```

### Lifecycle States

1. **Starting**: Service initializes and loads configuration
2. **Registering**: Service registers with the service registry
3. **Starting Server**: HTTP and gRPC servers start
4. **Running**: Service is running and processing requests
5. **Healthy**: All health checks pass
6. **Unhealthy**: Health checks fail
7. **Degrading**: Service is operational but with degraded functionality because of dependency issues
8. **Shutting Down**: Service has received a shutdown signal and stops accepting requests
9. **Deregistering**: Service removes itself from the registry
10. **Stopped**: Service has stopped
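
The lifecycle can be modelled as a small state machine. The following is a minimal sketch, not the platform's implementation: the `Lifecycle` type and its methods are hypothetical names, and the transition table allows only the edges drawn in the state diagram above.

```go
package main

import "fmt"

// State is one of the lifecycle states listed above.
type State string

const (
	Starting       State = "Starting"
	Registering    State = "Registering"
	StartingServer State = "StartingServer"
	Running        State = "Running"
	Healthy        State = "Healthy"
	Unhealthy      State = "Unhealthy"
	Degrading      State = "Degrading"
	ShuttingDown   State = "ShuttingDown"
	Deregistering  State = "Deregistering"
	Stopped        State = "Stopped"
)

// transitions mirrors the edges of the state diagram, nothing more.
var transitions = map[State][]State{
	Starting:       {Registering},
	Registering:    {StartingServer},
	StartingServer: {Running},
	Running:        {Healthy, Unhealthy, ShuttingDown},
	Healthy:        {Degrading},
	Unhealthy:      {Running},
	Degrading:      {Healthy, Unhealthy},
	ShuttingDown:   {Deregistering},
	Deregistering:  {Stopped},
}

// Lifecycle (hypothetical type) tracks the current state and rejects
// any transition that is not present in the table.
type Lifecycle struct {
	current State
}

// NewLifecycle returns a lifecycle in the Starting state.
func NewLifecycle() *Lifecycle { return &Lifecycle{current: Starting} }

// Current returns the state the service is in right now.
func (l *Lifecycle) Current() State { return l.current }

// Transition moves to next if the diagram allows it, otherwise it
// returns an error and leaves the current state unchanged.
func (l *Lifecycle) Transition(next State) error {
	for _, allowed := range transitions[l.current] {
		if allowed == next {
			l.current = next
			return nil
		}
	}
	return fmt.Errorf("invalid transition %s -> %s", l.current, next)
}
```

Keeping the transitions in a table makes it easy to compare the code against the diagram edge by edge.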

## Service Discovery and Registration

Services automatically register themselves with the service registry on startup and deregister on shutdown.

```mermaid
sequenceDiagram
    participant Service
    participant ServiceRegistry
    participant Registry as Consul<br/>Service Registry
    participant Client
    
    Service->>ServiceRegistry: Register(serviceInfo)
    ServiceRegistry->>Registry: Register service
    Registry->>Registry: Store service info
    Registry-->>ServiceRegistry: Registration confirmed
    ServiceRegistry-->>Service: Service registered
    
    Note over Service: Service starts health checks
    
    loop Every health check interval
        Service->>ServiceRegistry: Update health status
        ServiceRegistry->>Registry: Update health
    end
    
    Client->>ServiceRegistry: Discover(serviceName)
    ServiceRegistry->>Registry: Query services
    Registry-->>ServiceRegistry: Service list
    ServiceRegistry->>ServiceRegistry: Filter healthy services
    ServiceRegistry->>ServiceRegistry: Load balance
    ServiceRegistry-->>Client: Service endpoint
    
    Client->>Service: Connect via gRPC/HTTP
    
    Service->>ServiceRegistry: Deregister()
    ServiceRegistry->>Registry: Remove service
    Registry-->>ServiceRegistry: Service removed
```

### Service Registration Process

1. **Service Startup**: Service initializes and loads configuration
2. **Service Info Creation**: Create service info with name, version, address, protocol
3. **Registry Registration**: Register service with Consul (primary) or Kubernetes service discovery (alternative)
4. **Health Check Setup**: Start health check endpoint
5. **Health Status Updates**: Periodically update health status in registry
6. **Service Discovery**: Clients query registry for service endpoints
7. **Load Balancing**: Registry filters out unhealthy instances and load balances across the healthy ones
8. **Service Deregistration**: On shutdown, remove service from registry
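
The steps above can be sketched with an in-memory registry. This is an illustration only: `MemoryRegistry`, `ServiceInfo`, and their fields are hypothetical names standing in for the Consul-backed registry, and round-robin is assumed as the load-balancing policy because the document does not name one.

```go
package main

import (
	"errors"
	"sync"
)

// ServiceInfo carries the fields named in step 2. Field names are
// illustrative, not the platform's actual type.
type ServiceInfo struct {
	ID       string
	Name     string
	Version  string
	Address  string
	Protocol string
	Healthy  bool
}

// ErrNoHealthyInstance is returned when discovery finds nothing usable.
var ErrNoHealthyInstance = errors.New("no healthy instance")

// MemoryRegistry is an in-memory stand-in for Consul.
type MemoryRegistry struct {
	mu        sync.Mutex
	instances []ServiceInfo
	next      int // round-robin cursor (assumed policy)
}

// Register stores a service instance (step 3).
func (r *MemoryRegistry) Register(info ServiceInfo) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.instances = append(r.instances, info)
}

// UpdateHealth records the latest health status of an instance (step 5).
func (r *MemoryRegistry) UpdateHealth(id string, healthy bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := range r.instances {
		if r.instances[i].ID == id {
			r.instances[i].Healthy = healthy
		}
	}
}

// Discover filters healthy instances of name and picks one
// round-robin (steps 6 and 7).
func (r *MemoryRegistry) Discover(name string) (ServiceInfo, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	var healthy []ServiceInfo
	for _, inst := range r.instances {
		if inst.Name == name && inst.Healthy {
			healthy = append(healthy, inst)
		}
	}
	if len(healthy) == 0 {
		return ServiceInfo{}, ErrNoHealthyInstance
	}
	picked := healthy[r.next%len(healthy)]
	r.next++
	return picked, nil
}

// Deregister removes an instance on shutdown (step 8).
func (r *MemoryRegistry) Deregister(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	kept := r.instances[:0]
	for _, inst := range r.instances {
		if inst.ID != id {
			kept = append(kept, inst)
		}
	}
	r.instances = kept
}
```

A service would call `Register` during startup, `UpdateHealth` on every health check interval, and `Deregister` from its shutdown handler.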

## Service Communication Patterns

Services communicate through well-defined patterns using service clients.

```mermaid
graph TB
    subgraph "Service A"
        ServiceA[Service A Handler]
        ClientA[Service Client]
    end
    
    subgraph "Service Registry"
        Registry[Service Registry]
    end
    
    subgraph "Service B"
        ServiceB[Service B Handler]
        ServerB[gRPC Server]
    end
    
    subgraph "Service C"
        ServiceC[Service C Handler]
    end
    
    subgraph "Event Bus"
        EventBus[Event Bus<br/>Kafka]
    end
    
    ServiceA -->|Discover| Registry
    Registry -->|Service B endpoint| ClientA
    ClientA -->|gRPC Call| ServerB
    ServerB --> ServiceB
    ServiceB -->|Response| ClientA
    
    ServiceA -->|Publish Event| EventBus
    EventBus -->|Subscribe| ServiceC
    ServiceC -->|Process Event| ServiceC
    
    style ClientA fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style ServerB fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style EventBus fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff
```

### Communication Patterns

#### Synchronous Communication (gRPC/HTTP)

```mermaid
sequenceDiagram
    participant Client
    participant ServiceClient
    participant Registry
    participant Service
    
    Client->>ServiceClient: Call service method
    ServiceClient->>Registry: Discover service
    Registry-->>ServiceClient: Service endpoint
    ServiceClient->>Service: gRPC/HTTP call
    Service->>Service: Process request
    Service-->>ServiceClient: Response
    ServiceClient-->>Client: Return result
    
    alt Service unavailable
        ServiceClient->>Registry: Retry discovery
        Registry-->>ServiceClient: Alternative endpoint
        ServiceClient->>Service: Retry call
    end
```
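
The "service unavailable" branch of the diagram can be sketched as a loop that re-runs discovery before each retry. This is a sketch under assumptions: `CallWithRediscovery` and `ErrUnavailable` are hypothetical names, and discovery and the remote call are passed in as plain functions so the example does not depend on a real gRPC or HTTP client.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnavailable marks a failure worth retrying against another endpoint.
var ErrUnavailable = errors.New("service unavailable")

// CallWithRediscovery resolves an endpoint, calls it, and on
// ErrUnavailable discovers again and retries, up to maxAttempts
// calls in total. Any other error is returned immediately.
func CallWithRediscovery(
	discover func() (string, error),
	call func(endpoint string) (string, error),
	maxAttempts int,
) (string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		endpoint, err := discover()
		if err != nil {
			return "", fmt.Errorf("discover: %w", err)
		}
		resp, err := call(endpoint)
		if err == nil {
			return resp, nil
		}
		if !errors.Is(err, ErrUnavailable) {
			return "", err // not retryable
		}
		lastErr = err
	}
	return "", fmt.Errorf("all %d attempts failed: %w", maxAttempts, lastErr)
}
```

Because discovery runs on every attempt, a registry that load balances across healthy instances naturally hands back an alternative endpoint.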

#### Asynchronous Communication (Event Bus)

```mermaid
sequenceDiagram
    participant Publisher
    participant EventBus
    participant Kafka
    participant Subscriber1
    participant Subscriber2
    
    Publisher->>EventBus: Publish event
    EventBus->>Kafka: Send to topic
    Kafka-->>EventBus: Acknowledged
    
    Kafka->>Subscriber1: Deliver event
    Kafka->>Subscriber2: Deliver event
    
    Subscriber1->>Subscriber1: Process event
    Subscriber2->>Subscriber2: Process event
    
    Note over Subscriber1,Subscriber2: Events processed independently
```
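
The fan-out shown above can be illustrated with an in-process bus. This is a minimal sketch, not the Kafka-backed implementation: `MemoryBus`, `Event`, and `Handler` are hypothetical names, and `Publish` waits for all handlers only to keep the example deterministic, whereas a real publisher would return once the broker acknowledges the event.

```go
package main

import "sync"

// Event is a message published to a topic.
type Event struct {
	Topic   string
	Payload string
}

// Handler processes one event.
type Handler func(Event)

// MemoryBus is an in-process stand-in for the event bus.
type MemoryBus struct {
	mu       sync.RWMutex
	handlers map[string][]Handler
}

// NewMemoryBus returns an empty bus.
func NewMemoryBus() *MemoryBus {
	return &MemoryBus{handlers: make(map[string][]Handler)}
}

// Subscribe registers a handler for a topic.
func (b *MemoryBus) Subscribe(topic string, h Handler) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers[topic] = append(b.handlers[topic], h)
}

// Publish delivers the event to every subscriber of its topic.
// Each handler runs in its own goroutine, so subscribers process
// the event independently of one another.
func (b *MemoryBus) Publish(e Event) {
	b.mu.RLock()
	hs := append([]Handler(nil), b.handlers[e.Topic]...)
	b.mu.RUnlock()

	var wg sync.WaitGroup
	for _, h := range hs {
		wg.Add(1)
		go func(h Handler) {
			defer wg.Done()
			h(e)
		}(h)
	}
	wg.Wait() // sketch only: block until all handlers are done
}
```

The publisher never references its subscribers directly, which is what lets services such as Analytics consume events without the publishing service knowing about them.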

## Service Dependency Graph

Services have dependencies that determine startup ordering and communication patterns.

```mermaid
graph TD
    subgraph "Core Services"
        Identity[Identity Service]
        Auth[Auth Service]
        Authz[Authz Service]
        Audit[Audit Service]
    end
    
    subgraph "Feature Services"
        Blog[Blog Service]
        Billing[Billing Service]
        Analytics[Analytics Service]
    end
    
    subgraph "Infrastructure Services"
        Registry[Service Registry]
        EventBus[Event Bus]
        Cache[Cache Service]
    end
    
    Auth --> Identity
    Auth --> Registry
    Authz --> Identity
    Authz --> Cache
    Authz --> Audit
    Audit --> Registry
    
    Blog --> Authz
    Blog --> Identity
    Blog --> Audit
    Blog --> Registry
    Blog --> EventBus
    Blog --> Cache
    
    Billing --> Authz
    Billing --> Identity
    Billing --> Registry
    Billing --> EventBus
    
    Analytics --> EventBus
    Analytics --> Registry
    
    style Identity fill:#4a90e2,stroke:#2e5c8a,stroke-width:3px,color:#fff
    style Auth fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Blog fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff
```

### Dependency Types

1. **Hard Dependencies**: Service cannot start without dependency (e.g., Auth depends on Identity)
2. **Soft Dependencies**: Service can start but with degraded functionality
3. **Runtime Dependencies**: Dependencies discovered at runtime via service registry
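
One way to derive a startup order from hard dependencies is a topological sort of the dependency graph. The sketch below is an assumption about how ordering could be computed, not a description of the platform's startup code; `StartupOrder` is a hypothetical helper that takes a map from service name to the names it depends on.

```go
package main

import (
	"fmt"
	"sort"
)

// StartupOrder returns the services ordered so that each one appears
// after all of its dependencies. It reports an error on a cycle.
func StartupOrder(deps map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		visiting
		done
	)
	state := make(map[string]int)
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle at %q", name)
		}
		state[name] = visiting
		// Sort a copy so the result is deterministic.
		children := append([]string(nil), deps[name]...)
		sort.Strings(children)
		for _, d := range children {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[name] = done
		order = append(order, name) // all dependencies already emitted
		return nil
	}

	names := make([]string, 0, len(deps))
	for n := range deps {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		if err := visit(n); err != nil {
			return nil, err
		}
	}
	return order, nil
}
```

Called with the edges from the graph above (for example `"Auth": {"Identity", "Registry"}`), the result lists Identity and Registry before Auth. Soft and runtime dependencies would be left out of the map, since they do not block startup.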

## Service Health and Failure Handling

Services continuously report their health status, enabling automatic failure detection and recovery.

```mermaid
graph TD
    Service[Service] --> HealthCheck[Health Check Endpoint]
    HealthCheck --> CheckDB[Check Database]
    HealthCheck --> CheckCache[Check Cache]
    HealthCheck --> CheckDeps[Check Dependencies]
    
    CheckDB -->|Healthy| Aggregate[Aggregate Health]
    CheckCache -->|Healthy| Aggregate
    CheckDeps -->|Healthy| Aggregate
    
    Aggregate -->|All Healthy| Healthy[Healthy Status]
    Aggregate -->|Degraded| Degraded[Degraded Status]
    Aggregate -->|Unhealthy| Unhealthy[Unhealthy Status]
    
    Healthy --> Registry[Update Registry]
    Degraded --> Registry
    Unhealthy --> Registry
    
    Registry --> LoadBalancer[Load Balancer]
    LoadBalancer -->|Healthy Only| RouteTraffic[Route Traffic]
    LoadBalancer -->|Unhealthy| NoTraffic[No Traffic]
    
    style Healthy fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Degraded fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Unhealthy fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
```

### Health Check Types

1. **Liveness Check**: Service process is running
2. **Readiness Check**: Service is ready to accept requests
3. **Dependency Checks**: Database, cache, and other dependencies are accessible
4. **Business Health**: Service-specific health indicators
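
Aggregating individual checks into one status could look like the following sketch. The rule used here is an assumption, since the document does not define it: a failed critical check makes the service unhealthy, a failed non-critical check makes it degraded. `Check`, `Status`, and `Aggregate` are hypothetical names.

```go
package main

// Status is the aggregated health of a service.
type Status string

const (
	StatusHealthy   Status = "healthy"
	StatusDegraded  Status = "degraded"
	StatusUnhealthy Status = "unhealthy"
)

// Check is one named probe. Critical is an assumption of this sketch:
// it marks dependencies the service cannot work without.
type Check struct {
	Name     string
	Critical bool
	Probe    func() bool // reports whether the dependency is reachable
}

// Aggregate runs every check and returns the overall status together
// with the names of the checks that failed.
func Aggregate(checks []Check) (Status, []string) {
	status := StatusHealthy
	var failed []string
	for _, c := range checks {
		if c.Probe() {
			continue
		}
		failed = append(failed, c.Name)
		if c.Critical {
			status = StatusUnhealthy
		} else if status == StatusHealthy {
			status = StatusDegraded
		}
	}
	return status, failed
}
```

Returning the failed check names alongside the status gives the health endpoint something concrete to report when a service is degraded.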

### Failure Handling Strategies

#### Circuit Breaker Pattern

```mermaid
stateDiagram-v2
    [*] --> Closed: Service healthy
    Closed --> Open: Failure threshold exceeded
    Open --> HalfOpen: Timeout period
    HalfOpen --> Closed: Success
    HalfOpen --> Open: Failure
```
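
The three states above map onto a small amount of code. The following is a simplified sketch, not the platform's circuit breaker: `CircuitBreaker` and its fields are hypothetical names, the failure count is a plain consecutive counter, and concurrent callers are all let through while half-open, which a production breaker would usually restrict to a single trial call.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrCircuitOpen is returned while the breaker rejects calls.
var ErrCircuitOpen = errors.New("circuit open")

type breakerState int

const (
	closed breakerState = iota
	open
	halfOpen
)

// CircuitBreaker (hypothetical type) opens after threshold
// consecutive failures and allows a trial call after openTimeout.
type CircuitBreaker struct {
	mu          sync.Mutex
	state       breakerState
	failures    int
	threshold   int
	openTimeout time.Duration
	openedAt    time.Time
}

// NewCircuitBreaker returns a breaker in the closed state.
func NewCircuitBreaker(threshold int, openTimeout time.Duration) *CircuitBreaker {
	return &CircuitBreaker{threshold: threshold, openTimeout: openTimeout}
}

// Call runs fn unless the breaker is open and the timeout has not elapsed.
func (cb *CircuitBreaker) Call(fn func() error) error {
	cb.mu.Lock()
	if cb.state == open {
		if time.Since(cb.openedAt) < cb.openTimeout {
			cb.mu.Unlock()
			return ErrCircuitOpen
		}
		cb.state = halfOpen // timeout elapsed: allow a trial call
	}
	cb.mu.Unlock()

	err := fn()

	cb.mu.Lock()
	defer cb.mu.Unlock()
	if err == nil {
		cb.state = closed // HalfOpen --> Closed, or stays closed
		cb.failures = 0
		return nil
	}
	cb.failures++
	if cb.state == halfOpen || cb.failures >= cb.threshold {
		cb.state = open // Closed --> Open or HalfOpen --> Open
		cb.openedAt = time.Now()
	}
	return err
}
```

While the breaker is open, callers fail fast with `ErrCircuitOpen` instead of waiting on a dependency that is known to be failing.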

#### Retry Strategy

```mermaid
sequenceDiagram
    participant Client
    participant Service
    
    Client->>Service: Request
    Service-->>Client: Failure
    
    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 1
    Service-->>Client: Failure
    
    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 2
    Service-->>Client: Success
```
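
The retry sequence above can be sketched as follows. The doubling factor and the delay cap are assumptions, since the document only says "exponential backoff"; `Retrier`, `Backoff`, and `ErrExhausted` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrExhausted wraps the outcome once all retries have been used.
var ErrExhausted = errors.New("retries exhausted")

// Backoff returns the wait before retry n (1-based):
// base * 2^(n-1), capped at max.
func Backoff(base, max time.Duration, retry int) time.Duration {
	d := base
	for i := 1; i < retry; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// Retrier holds the retry policy (assumed parameters).
type Retrier struct {
	MaxRetries int
	Base, Max  time.Duration
}

// Do calls fn once, then retries with exponential backoff until fn
// succeeds or MaxRetries retries have been made.
func (r Retrier) Do(fn func() error) error {
	err := fn()
	for retry := 1; err != nil && retry <= r.MaxRetries; retry++ {
		time.Sleep(Backoff(r.Base, r.Max, retry))
		err = fn()
	}
	if err != nil {
		return fmt.Errorf("%w after %d retries: %v", ErrExhausted, r.MaxRetries, err)
	}
	return nil
}
```

With `Retrier{MaxRetries: 2, Base: 100 * time.Millisecond, Max: time.Second}`, the waits before the two retries are 100 ms and 200 ms, matching the two backoff steps in the diagram. Production code often adds random jitter to the delay so that many clients do not retry in lockstep.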

#### Service Degradation

When a service dependency fails, the service may continue operating with degraded functionality:

- **Cache Unavailable**: Service continues but without caching
- **Event Bus Unavailable**: Service continues but events are queued
- **Non-Critical Dependency Fails**: Service continues with reduced features
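
The "cache unavailable" case can be illustrated with a read path that falls back to the backing store. This is a sketch under assumptions: `Reader`, `ErrCacheMiss`, and `ErrCacheUnavailable` are hypothetical names, and the cache and store are modelled as plain functions rather than real clients. The type is not safe for concurrent use as written.

```go
package main

import "errors"

var (
	// ErrCacheMiss means the cache works but does not hold the key.
	ErrCacheMiss = errors.New("cache miss")
	// ErrCacheUnavailable means the cache itself cannot be reached.
	ErrCacheUnavailable = errors.New("cache unavailable")
)

// Reader (hypothetical type) reads through a cache and falls back to
// the backing store when the cache misses or fails.
type Reader struct {
	CacheGet func(key string) (string, error)
	Load     func(key string) (string, error)
	Degraded bool // true while the cache is failing
}

// Get returns the cached value if possible, otherwise loads from the
// store. A cache failure degrades the service but does not fail the read.
func (r *Reader) Get(key string) (string, error) {
	v, err := r.CacheGet(key)
	if err == nil {
		r.Degraded = false
		return v, nil
	}
	// A miss is normal operation; any other cache error counts as degradation.
	r.Degraded = !errors.Is(err, ErrCacheMiss)
	return r.Load(key)
}
```

The `Degraded` flag is the kind of signal a health check could surface so that the registry sees the service as degraded rather than unhealthy.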

## Service Scaling Scenarios

Services can be scaled independently based on load and requirements.

```mermaid
graph TB
    subgraph "Load Balancer"
        LB[Load Balancer]
    end
    
    subgraph "Service Instances"
        Instance1[Service Instance 1<br/>Healthy]
        Instance2[Service Instance 2<br/>Healthy]
        Instance3[Service Instance 3<br/>Starting]
        Instance4[Service Instance 4<br/>Unhealthy]
    end
    
    subgraph "Service Registry"
        Registry[Service Registry]
    end
    
    subgraph "Infrastructure"
        DB[(Database)]
        Cache[(Cache)]
    end
    
    LB -->|Discover| Registry
    Registry -->|Healthy Instances| LB
    LB --> Instance1
    LB --> Instance2
    LB -.->|No Traffic| Instance3
    LB -.->|No Traffic| Instance4
    
    Instance1 --> DB
    Instance2 --> DB
    Instance3 --> DB
    Instance4 --> DB
    
    Instance1 --> Cache
    Instance2 --> Cache
    
    style Instance1 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance2 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance3 fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Instance4 fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
```

### Scaling Patterns

1. **Horizontal Scaling**: Add more service instances
2. **Vertical Scaling**: Increase resources for existing instances
3. **Auto-Scaling**: Automatically scale based on metrics
4. **Health-Based Routing**: Route traffic to healthy instances only

## Integration Points

Service orchestration ties into the following documents:

- **[System Behavior Overview](system-behavior.md)**: How services behave during startup and operation
- **[Module Integration Patterns](module-integration-patterns.md)**: How modules are loaded as services
- **[Operational Scenarios](operational-scenarios.md)**: Service interaction in specific scenarios
- **[Architecture Overview](architecture.md)**: Overall system architecture

## Related Documentation

- [System Behavior Overview](system-behavior.md) - System-level behavior
- [Module Integration Patterns](module-integration-patterns.md) - Module service integration
- [Operational Scenarios](operational-scenarios.md) - Service interaction scenarios
- [Architecture Overview](architecture.md) - System architecture
- [ADR-0029: Microservices Architecture](../adr/0029-microservices-architecture.md) - Architecture decision
- [ADR-0030: Service Communication Strategy](../adr/0030-service-communication-strategy.md) - Communication patterns