goplt/docs/content/architecture/service-orchestration.md
2025-11-05 11:00:36 +01:00

Service Orchestration

Purpose

This document explains how services work together in the Go Platform's microservices architecture, focusing on service lifecycle management, discovery, communication patterns, and failure handling.

Overview

The Go Platform consists of multiple independent services that communicate via service clients (gRPC/HTTP) and share infrastructure components. Each service registers with a service registry, and clients discover it there, which enables dynamic service location and health monitoring.

Key Concepts

  • Service: Independent process providing specific functionality
  • Service Registry: Central registry for service discovery (Consul, Kubernetes, etcd)
  • Service Client: Abstraction for inter-service communication
  • Service Discovery: Process of locating services by name
  • Service Health: Health status of a service (healthy, unhealthy, degraded)

Service Lifecycle Management

Services follow a well-defined lifecycle from startup to shutdown.

stateDiagram-v2
    [*] --> Starting: Service starts
    Starting --> Registering: Initialize services
    Registering --> StartingServer: Register with service registry
    StartingServer --> Running: Start HTTP/gRPC servers
    Running --> Healthy: Health checks pass
    Running --> Unhealthy: Health checks fail
    Unhealthy --> Running: Health checks recover
    Healthy --> Degrading: Dependency issues
    Degrading --> Healthy: Dependencies recover
    Degrading --> Unhealthy: Critical failure
    Running --> ShuttingDown: Receive shutdown signal
    ShuttingDown --> Deregistering: Stop accepting requests
    Deregistering --> Stopped: Deregister from registry
    Stopped --> [*]

Lifecycle States

  1. Starting: Service is initializing and loading configuration
  2. Registering: Service registers with the service registry
  3. Starting Server: HTTP and gRPC servers are starting
  4. Running: Service is running and processing requests
  5. Healthy: All health checks passing
  6. Unhealthy: Health checks failing
  7. Degrading: Service operational but with degraded functionality
  8. Shutting Down: Service received shutdown signal
  9. Deregistering: Service removing itself from registry
  10. Stopped: Service has stopped
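
The transitions in the state diagram can be encoded as a lookup table, which makes illegal state changes easy to reject. This is a minimal sketch: the `State` type, the constants, and `CanTransition` are hypothetical names chosen for illustration, and the table contains only the edges drawn in the diagram.

```go
package main

import "fmt"

// State is a lifecycle state from the diagram. The type, constants, and
// helper below are hypothetical names used for illustration only.
type State string

const (
	Starting       State = "Starting"
	Registering    State = "Registering"
	StartingServer State = "StartingServer"
	Running        State = "Running"
	Healthy        State = "Healthy"
	Unhealthy      State = "Unhealthy"
	Degrading      State = "Degrading"
	ShuttingDown   State = "ShuttingDown"
	Deregistering  State = "Deregistering"
	Stopped        State = "Stopped"
)

// transitions lists exactly the edges drawn in the state diagram.
var transitions = map[State][]State{
	Starting:       {Registering},
	Registering:    {StartingServer},
	StartingServer: {Running},
	Running:        {Healthy, Unhealthy, ShuttingDown},
	Unhealthy:      {Running},
	Healthy:        {Degrading},
	Degrading:      {Healthy, Unhealthy},
	ShuttingDown:   {Deregistering},
	Deregistering:  {Stopped},
}

// CanTransition reports whether the diagram has an edge from -> to.
func CanTransition(from, to State) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Running, ShuttingDown)) // true
	fmt.Println(CanTransition(Stopped, Running))      // false
}
```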

Service Discovery and Registration

Services automatically register themselves with the service registry on startup and deregister on shutdown.

sequenceDiagram
    participant Service
    participant ServiceRegistry
    participant Registry as Consul/K8s
    participant Client
    
    Service->>ServiceRegistry: Register(serviceInfo)
    ServiceRegistry->>Registry: Register service
    Registry->>Registry: Store service info
    Registry-->>ServiceRegistry: Registration confirmed
    ServiceRegistry-->>Service: Service registered
    
    Note over Service: Service starts health checks
    
    loop Every health check interval
        Service->>ServiceRegistry: Update health status
        ServiceRegistry->>Registry: Update health
    end
    
    Client->>ServiceRegistry: Discover(serviceName)
    ServiceRegistry->>Registry: Query services
    Registry-->>ServiceRegistry: Service list
    ServiceRegistry->>ServiceRegistry: Filter healthy services
    ServiceRegistry->>ServiceRegistry: Load balance
    ServiceRegistry-->>Client: Service endpoint
    
    Client->>Service: Connect via gRPC/HTTP
    
    Service->>ServiceRegistry: Deregister()
    ServiceRegistry->>Registry: Remove service
    Registry-->>ServiceRegistry: Service removed

Service Registration Process

  1. Service Startup: Service initializes and loads configuration
  2. Service Info Creation: Create service info with name, version, address, protocol
  3. Registry Registration: Register service with the registry backend (Consul, Kubernetes, or etcd)
  4. Health Check Setup: Start health check endpoint
  5. Health Status Updates: Periodically update health status in registry
  6. Service Discovery: Clients query registry for service endpoints
  7. Load Balancing: Registry filters out unhealthy instances and balances load across the rest
  8. Service Deregistration: On shutdown, remove service from registry
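
The registration steps above can be sketched with an in-memory registry. This is not the platform's actual registry API: `ServiceInfo`, `MemoryRegistry`, and all method names are assumptions made for illustration, and the map stands in for a real backend such as Consul, Kubernetes, or etcd.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// ServiceInfo carries the fields named in step 2. The struct is hypothetical.
type ServiceInfo struct {
	ID, Name, Version, Address, Protocol string
	Healthy                              bool
}

// ErrNoHealthyInstance is returned when discovery finds nothing routable.
var ErrNoHealthyInstance = errors.New("no healthy instance")

// MemoryRegistry is an in-memory stand-in for a real registry backend.
type MemoryRegistry struct {
	mu        sync.Mutex
	instances map[string]ServiceInfo // keyed by instance ID
}

func NewMemoryRegistry() *MemoryRegistry {
	return &MemoryRegistry{instances: make(map[string]ServiceInfo)}
}

// Register stores or overwrites an instance (step 3).
func (r *MemoryRegistry) Register(info ServiceInfo) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.instances[info.ID] = info
}

// UpdateHealth records the latest health status of an instance (step 5).
func (r *MemoryRegistry) UpdateHealth(id string, healthy bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if info, ok := r.instances[id]; ok {
		info.Healthy = healthy
		r.instances[id] = info
	}
}

// Deregister removes an instance on shutdown (step 8).
func (r *MemoryRegistry) Deregister(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.instances, id)
}

// Discover returns the healthy instances of a service, sorted by ID (steps 6-7).
func (r *MemoryRegistry) Discover(name string) ([]ServiceInfo, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	var out []ServiceInfo
	for _, info := range r.instances {
		if info.Name == name && info.Healthy {
			out = append(out, info)
		}
	}
	if len(out) == 0 {
		return nil, ErrNoHealthyInstance
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out, nil
}

func main() {
	reg := NewMemoryRegistry()
	reg.Register(ServiceInfo{ID: "auth-1", Name: "auth", Version: "1.0.0",
		Address: "10.0.0.1:9090", Protocol: "grpc", Healthy: true})
	found, err := reg.Discover("auth")
	fmt.Println(len(found), err)
	reg.Deregister("auth-1")
	_, err = reg.Discover("auth")
	fmt.Println(err)
}
```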

Service Communication Patterns

Services communicate synchronously through service clients (gRPC/HTTP) and asynchronously through the event bus.

graph TB
    subgraph "Service A"
        ServiceA[Service A Handler]
        ClientA[Service Client]
    end
    
    subgraph "Service Registry"
        Registry[Service Registry]
    end
    
    subgraph "Service B"
        ServiceB[Service B Handler]
        ServerB[gRPC Server]
    end
    
    subgraph "Service C"
        ServiceC[Service C Handler]
    end
    
    subgraph "Event Bus"
        EventBus[Event Bus<br/>Kafka]
    end
    
    ServiceA -->|Discover| Registry
    Registry -->|Service B endpoint| ClientA
    ClientA -->|gRPC Call| ServerB
    ServerB --> ServiceB
    ServiceB -->|Response| ClientA
    
    ServiceA -->|Publish Event| EventBus
    EventBus -->|Subscribe| ServiceC
    ServiceC -->|Process Event| ServiceC
    
    style ClientA fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style ServerB fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style EventBus fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff

Communication Patterns

Synchronous Communication (gRPC/HTTP)

sequenceDiagram
    participant Client
    participant ServiceClient
    participant Registry
    participant Service
    
    Client->>ServiceClient: Call service method
    ServiceClient->>Registry: Discover service
    Registry-->>ServiceClient: Service endpoint
    ServiceClient->>Service: gRPC/HTTP call
    Service->>Service: Process request
    Service-->>ServiceClient: Response
    ServiceClient-->>Client: Return result
    
    alt Service unavailable
        ServiceClient->>Registry: Retry discovery
        Registry-->>ServiceClient: Alternative endpoint
        ServiceClient->>Service: Retry call
    end
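
The "service unavailable" branch of the diagram amounts to trying an alternative endpoint when a call fails. The sketch below assumes the registry can return several candidate endpoints at once. `Discoverer`, `Invoke`, and `CallWithFailover` are hypothetical names, and the actual gRPC/HTTP call is abstracted behind a function value.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnavailable marks an endpoint that could not be reached (hypothetical).
var ErrUnavailable = errors.New("service unavailable")

// Discoverer is a hypothetical view of the registry: it returns candidate endpoints.
type Discoverer func(service string) ([]string, error)

// Invoke performs one call against a single endpoint (gRPC or HTTP in practice).
type Invoke func(endpoint string) (string, error)

// CallWithFailover discovers endpoints for service and tries each in order
// until one call succeeds.
func CallWithFailover(service string, discover Discoverer, invoke Invoke) (string, error) {
	endpoints, err := discover(service)
	if err != nil {
		return "", fmt.Errorf("discover %s: %w", service, err)
	}
	lastErr := errors.New("no endpoints returned")
	for _, ep := range endpoints {
		resp, err := invoke(ep)
		if err == nil {
			return resp, nil
		}
		lastErr = err // remember the failure and move on to the next endpoint
	}
	return "", fmt.Errorf("all endpoints failed for %s: %w", service, lastErr)
}

func main() {
	discover := func(string) ([]string, error) {
		return []string{"10.0.0.1:9090", "10.0.0.2:9090"}, nil
	}
	invoke := func(ep string) (string, error) {
		if ep == "10.0.0.1:9090" {
			return "", ErrUnavailable // first instance is down
		}
		return "ok from " + ep, nil
	}
	resp, err := CallWithFailover("billing", discover, invoke)
	fmt.Println(resp, err) // ok from 10.0.0.2:9090 <nil>
}
```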

Asynchronous Communication (Event Bus)

sequenceDiagram
    participant Publisher
    participant EventBus
    participant Kafka
    participant Subscriber1
    participant Subscriber2
    
    Publisher->>EventBus: Publish event
    EventBus->>Kafka: Send to topic
    Kafka-->>EventBus: Acknowledged
    
    Kafka->>Subscriber1: Deliver event
    Kafka->>Subscriber2: Deliver event
    
    Subscriber1->>Subscriber1: Process event
    Subscriber2->>Subscriber2: Process event
    
    Note over Subscriber1,Subscriber2: Events processed independently
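
The fan-out shown above can be illustrated with an in-process bus. This sketch is not a Kafka client: `Bus`, `Event`, and `Handler` are hypothetical types, and `Publish` waits for all handlers to finish only to keep the example deterministic, whereas a real broker delivers without blocking the publisher.

```go
package main

import (
	"fmt"
	"sync"
)

// Event and Handler are hypothetical types for this sketch.
type Event struct {
	Topic   string
	Payload string
}

type Handler func(Event)

// Bus is an in-process stand-in for the event bus.
type Bus struct {
	mu   sync.RWMutex
	subs map[string][]Handler
}

func NewBus() *Bus {
	return &Bus{subs: make(map[string][]Handler)}
}

// Subscribe registers a handler for a topic.
func (b *Bus) Subscribe(topic string, h Handler) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[topic] = append(b.subs[topic], h)
}

// Publish runs every handler subscribed to the event's topic in its own
// goroutine, so subscribers process the event independently, then waits
// for all of them to return.
func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	handlers := append([]Handler(nil), b.subs[e.Topic]...)
	b.mu.RUnlock()

	var wg sync.WaitGroup
	for _, h := range handlers {
		wg.Add(1)
		go func(h Handler) {
			defer wg.Done()
			h(e)
		}(h)
	}
	wg.Wait()
}

func main() {
	bus := NewBus()
	bus.Subscribe("user.created", func(e Event) { fmt.Println("subscriber 1:", e.Payload) })
	bus.Subscribe("user.created", func(e Event) { fmt.Println("subscriber 2:", e.Payload) })
	bus.Publish(Event{Topic: "user.created", Payload: "id=42"})
}
```

The order in which the two subscribers print is not guaranteed, which mirrors the independent processing noted in the diagram.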

Service Dependency Graph

Services have dependencies that determine startup ordering and communication patterns.

graph TD
    subgraph "Core Services"
        Identity[Identity Service]
        Auth[Auth Service]
        Authz[Authz Service]
        Audit[Audit Service]
    end
    
    subgraph "Feature Services"
        Blog[Blog Service]
        Billing[Billing Service]
        Analytics[Analytics Service]
    end
    
    subgraph "Infrastructure Services"
        Registry[Service Registry]
        EventBus[Event Bus]
        Cache[Cache Service]
    end
    
    Auth --> Identity
    Auth --> Registry
    Authz --> Identity
    Authz --> Cache
    Authz --> Audit
    Audit --> Registry
    
    Blog --> Authz
    Blog --> Identity
    Blog --> Audit
    Blog --> Registry
    Blog --> EventBus
    Blog --> Cache
    
    Billing --> Authz
    Billing --> Identity
    Billing --> Registry
    Billing --> EventBus
    
    Analytics --> EventBus
    Analytics --> Registry
    
    style Identity fill:#4a90e2,stroke:#2e5c8a,stroke-width:3px,color:#fff
    style Auth fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Blog fill:#7b68ee,stroke:#5a4fcf,stroke-width:2px,color:#fff

Dependency Types

  1. Hard Dependencies: Service cannot start without dependency (e.g., Auth depends on Identity)
  2. Soft Dependencies: Service can start but with degraded functionality
  3. Runtime Dependencies: Dependencies discovered at runtime via service registry
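
Because hard dependencies constrain startup ordering, one way to derive a valid order is a topological sort of the dependency graph. The sketch below is an assumption about how that could be done, not the platform's actual startup logic. `StartupOrder` is a hypothetical helper, and the sample edges are copied from the core services in the graph above.

```go
package main

import (
	"fmt"
	"sort"
)

// StartupOrder returns the services ordered so that every dependency comes
// before the services that depend on it. deps maps a service to the services
// it depends on. A dependency cycle is reported as an error.
func StartupOrder(deps map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		visiting
		done
	)
	state := map[string]int{}
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle involving %s", name)
		}
		state[name] = visiting
		ds := append([]string(nil), deps[name]...) // copy so the input is not reordered
		sort.Strings(ds)
		for _, d := range ds {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[name] = done
		order = append(order, name) // all dependencies are already in order
		return nil
	}

	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic traversal
	for _, name := range names {
		if err := visit(name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	order, err := StartupOrder(map[string][]string{
		"Auth":  {"Identity", "Registry"},
		"Authz": {"Identity", "Cache", "Audit"},
		"Audit": {"Registry"},
	})
	fmt.Println(order, err)
}
```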

Service Health and Failure Handling

Services continuously report their health status, enabling automatic failure detection and recovery.

graph TD
    Service[Service] --> HealthCheck[Health Check Endpoint]
    HealthCheck --> CheckDB[Check Database]
    HealthCheck --> CheckCache[Check Cache]
    HealthCheck --> CheckDeps[Check Dependencies]
    
    CheckDB -->|Healthy| Aggregate[Aggregate Health]
    CheckCache -->|Healthy| Aggregate
    CheckDeps -->|Healthy| Aggregate
    
    Aggregate -->|All Healthy| Healthy[Healthy Status]
    Aggregate -->|Degraded| Degraded[Degraded Status]
    Aggregate -->|Unhealthy| Unhealthy[Unhealthy Status]
    
    Healthy --> Registry[Update Registry]
    Degraded --> Registry
    Unhealthy --> Registry
    
    Registry --> LoadBalancer[Load Balancer]
    LoadBalancer -->|Healthy Only| RouteTraffic[Route Traffic]
    LoadBalancer -->|Unhealthy| NoTraffic[No Traffic]
    
    style Healthy fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Degraded fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Unhealthy fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff

Health Check Types

  1. Liveness Check: Service process is running
  2. Readiness Check: Service is ready to accept requests
  3. Dependency Checks: Database, cache, and other dependencies are accessible
  4. Business Health: Service-specific health indicators
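
The document does not specify how individual check results are combined, so the following sketch picks one plausible rule and labels it as an assumption: a failed critical check makes the service unhealthy, a failed non-critical check makes it degraded, and otherwise it is healthy. `Check`, `Status`, and `Aggregate` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Status mirrors the three health states used in this document.
type Status int

const (
	Healthy Status = iota
	Degraded
	Unhealthy
)

func (s Status) String() string {
	switch s {
	case Healthy:
		return "healthy"
	case Degraded:
		return "degraded"
	default:
		return "unhealthy"
	}
}

// ErrProbeFailed is a placeholder error for a failing probe.
var ErrProbeFailed = errors.New("probe failed")

// Check is one named probe. Critical is an assumption of this sketch: it
// marks dependencies the service cannot operate without.
type Check struct {
	Name     string
	Critical bool
	Probe    func() error
}

// Aggregate runs the checks in order and combines the results:
// any failed critical check -> Unhealthy (remaining checks are skipped),
// any failed non-critical check -> Degraded, otherwise Healthy.
func Aggregate(checks []Check) Status {
	status := Healthy
	for _, c := range checks {
		if err := c.Probe(); err != nil {
			if c.Critical {
				return Unhealthy
			}
			status = Degraded
		}
	}
	return status
}

func main() {
	checks := []Check{
		{Name: "database", Critical: true, Probe: func() error { return nil }},
		{Name: "cache", Critical: false, Probe: func() error { return ErrProbeFailed }},
	}
	fmt.Println(Aggregate(checks)) // degraded
}
```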

Failure Handling Strategies

Circuit Breaker Pattern

stateDiagram-v2
    [*] --> Closed: Service healthy
    Closed --> Open: Failure threshold exceeded
    Open --> HalfOpen: Timeout period
    HalfOpen --> Closed: Success
    HalfOpen --> Open: Failure
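
A minimal sketch of the three-state breaker above follows. `Breaker` and its methods are hypothetical names, and the breaker opens once consecutive failures reach the threshold. It does not limit the number of concurrent trial calls in the half-open state, which a production implementation would likely want to do.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type BreakerState int

const (
	Closed BreakerState = iota
	Open
	HalfOpen
)

var (
	ErrOpen       = errors.New("circuit breaker is open")
	ErrDownstream = errors.New("downstream failure") // placeholder for demos
)

// Breaker is a hypothetical circuit breaker for this sketch.
type Breaker struct {
	mu        sync.Mutex
	state     BreakerState
	failures  int
	threshold int           // consecutive failures that open the breaker
	timeout   time.Duration // how long to stay open before a trial call
	openedAt  time.Time
}

func NewBreaker(threshold int, timeout time.Duration) *Breaker {
	return &Breaker{threshold: threshold, timeout: timeout}
}

func (b *Breaker) State() BreakerState {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

// Call runs fn unless the breaker is open and the timeout has not elapsed.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.state == Open {
		if time.Since(b.openedAt) < b.timeout {
			b.mu.Unlock()
			return ErrOpen // fail fast without calling the service
		}
		b.state = HalfOpen // timeout elapsed: allow a trial call
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.state == HalfOpen || b.failures >= b.threshold {
			b.state = Open
			b.openedAt = time.Now()
		}
		return err
	}
	b.state = Closed // success closes the breaker and resets the count
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(2, time.Minute)
	fail := func() error { return ErrDownstream }
	for i := 0; i < 3; i++ {
		fmt.Println(b.Call(fail))
	}
}
```

In the demo, the first two calls return the downstream error and the third is rejected with `ErrOpen` because the breaker opened after the second failure.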

Retry Strategy

sequenceDiagram
    participant Client
    participant Service
    
    Client->>Service: Request
    Service-->>Client: Failure
    
    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 1
    Service-->>Client: Failure
    
    Client->>Client: Wait (exponential backoff)
    Client->>Service: Retry 2
    Service-->>Client: Success
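
The retry sequence above can be sketched as a small helper that doubles its wait after each failure. `Retry` and `ErrTransient` are hypothetical names. The sketch omits jitter and context cancellation, both of which are commonly added in practice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrTransient is a placeholder for a retryable failure.
var ErrTransient = errors.New("transient failure")

// Retry calls fn up to attempts times. After each failure except the last,
// it sleeps for the current delay and then doubles it (base, 2*base, 4*base, ...).
func Retry(attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := base
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := Retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return ErrTransient // first two attempts fail, as in the diagram
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```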

Service Degradation

When a service dependency fails, the service may continue operating with degraded functionality:

  • Cache Unavailable: Service continues but without caching
  • Event Bus Unavailable: Service continues but events are queued
  • Non-Critical Dependency Fails: Service continues with reduced features
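
The "Cache Unavailable" case can be sketched as a read path that falls back to the source of truth when the cache errors out. The `Cache` interface, both implementations, and `Lookup` are hypothetical and exist only to show the fallback. The returned `degraded` flag is one way a caller could feed the condition into its health status.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrCacheMiss = errors.New("cache miss")
	ErrCacheDown = errors.New("cache unavailable")
)

// Cache is a hypothetical cache abstraction.
type Cache interface {
	Get(key string) (string, error)
	Set(key, value string) error
}

// mapCache is a working in-memory cache.
type mapCache map[string]string

func (m mapCache) Get(key string) (string, error) {
	if v, ok := m[key]; ok {
		return v, nil
	}
	return "", ErrCacheMiss
}

func (m mapCache) Set(key, value string) error {
	m[key] = value
	return nil
}

// downCache simulates an unreachable cache.
type downCache struct{}

func (downCache) Get(string) (string, error) { return "", ErrCacheDown }
func (downCache) Set(string, string) error   { return ErrCacheDown }

// Lookup reads through the cache. On a miss it loads the value and caches it.
// If the cache itself fails, it still loads the value but reports degraded=true
// and skips the cache write.
func Lookup(c Cache, load func(key string) (string, error), key string) (value string, degraded bool, err error) {
	value, err = c.Get(key)
	if err == nil {
		return value, false, nil
	}
	degraded = !errors.Is(err, ErrCacheMiss)
	value, err = load(key)
	if err != nil {
		return "", degraded, err
	}
	if !degraded {
		_ = c.Set(key, value) // best effort: a failed write does not fail the read
	}
	return value, degraded, nil
}

func main() {
	load := func(key string) (string, error) { return "value-for-" + key, nil }
	v, degraded, err := Lookup(downCache{}, load, "user:1")
	fmt.Println(v, degraded, err) // value-for-user:1 true <nil>
}
```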

Service Scaling Scenarios

Services can be scaled independently based on load and requirements.

graph TB
    subgraph "Load Balancer"
        LB[Load Balancer]
    end
    
    subgraph "Service Instances"
        Instance1[Service Instance 1<br/>Healthy]
        Instance2[Service Instance 2<br/>Healthy]
        Instance3[Service Instance 3<br/>Starting]
        Instance4[Service Instance 4<br/>Unhealthy]
    end
    
    subgraph "Service Registry"
        Registry[Service Registry]
    end
    
    subgraph "Infrastructure"
        DB[(Database)]
        Cache[(Cache)]
    end
    
    LB -->|Discover| Registry
    Registry -->|Healthy Instances| LB
    LB --> Instance1
    LB --> Instance2
    LB -.->|No Traffic| Instance3
    LB -.->|No Traffic| Instance4
    
    Instance1 --> DB
    Instance2 --> DB
    Instance3 --> DB
    Instance4 --> DB
    
    Instance1 --> Cache
    Instance2 --> Cache
    
    style Instance1 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance2 fill:#50c878,stroke:#2e7d4e,stroke-width:2px,color:#fff
    style Instance3 fill:#ffa500,stroke:#ff8c00,stroke-width:2px,color:#fff
    style Instance4 fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff

Scaling Patterns

  1. Horizontal Scaling: Add more service instances
  2. Vertical Scaling: Increase resources for existing instances
  3. Auto-Scaling: Automatically scale based on metrics
  4. Load-Based Routing: Route traffic only to healthy instances, as reported by the service registry
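
Routing to healthy instances only can be sketched as a round-robin picker that skips everything not marked healthy. The document does not name a balancing algorithm, so round-robin is an assumption here, and `Instance` and `RoundRobin` are hypothetical types. The sample data mirrors the four instances in the diagram above.

```go
package main

import (
	"fmt"
	"sync"
)

// Instance is a hypothetical view of one service instance.
type Instance struct {
	Name    string
	Healthy bool
}

// RoundRobin cycles through healthy instances. The zero value is ready to use.
type RoundRobin struct {
	mu   sync.Mutex
	next int
}

// Pick returns the next healthy instance, or false if there is none.
func (rr *RoundRobin) Pick(instances []Instance) (Instance, bool) {
	var healthy []Instance
	for _, in := range instances {
		if in.Healthy {
			healthy = append(healthy, in)
		}
	}
	if len(healthy) == 0 {
		return Instance{}, false
	}
	rr.mu.Lock()
	defer rr.mu.Unlock()
	picked := healthy[rr.next%len(healthy)]
	rr.next++
	return picked, true
}

func main() {
	instances := []Instance{
		{Name: "instance-1", Healthy: true},
		{Name: "instance-2", Healthy: true},
		{Name: "instance-3", Healthy: false}, // still starting
		{Name: "instance-4", Healthy: false}, // unhealthy
	}
	rr := &RoundRobin{}
	for i := 0; i < 3; i++ {
		in, _ := rr.Pick(instances)
		fmt.Println(in.Name) // instance-1, instance-2, instance-1
	}
}
```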

Integration Points

This service orchestration integrates with: