spore-gateway/docs/Rollout.md
2025-10-22 19:57:48 +02:00

Rollout

The rollout feature provides orchestrated firmware updates across multiple SPORE nodes. It integrates with the spore-registry to manage firmware binaries and uses WebSocket communication for real-time progress updates.

Architecture

Components

  • spore-gateway: Orchestrates rollouts, proxies registry calls, manages WebSocket communication
  • spore-registry: Stores firmware binaries and metadata
  • spore-ui: Provides rollout interface and real-time status updates
  • SPORE Nodes: Target devices for firmware updates

Data Flow

  1. UI Discovery: Frontend queries /api/cluster/node/versions to find matching nodes
  2. Rollout Initiation: Frontend sends firmware info and node list to /api/rollout
  3. Parallel Processing: Gateway processes multiple nodes concurrently using goroutines
  4. Real-time Updates: Progress and status updates sent via WebSocket
  5. Status Display: UI shows updating status directly on cluster view nodes

API Endpoints

/api/cluster/node/versions (GET)

Returns the cluster members together with their current firmware version, which is read from each node's version label.

Response:

{
  "members": [
    {
      "ip": "10.0.1.134",
      "version": "1.1.0",
      "labels": {"app": "base", "role": "debug"}
    }
  ]
}
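
A minimal Go sketch of decoding this response on the client side. The type names and the `matchLabels` helper are illustrative, not taken from the gateway source, and the helper assumes that "matching" means a node carries every label of the firmware.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeVersion mirrors one entry of the "members" array shown above.
type NodeVersion struct {
	IP      string            `json:"ip"`
	Version string            `json:"version"`
	Labels  map[string]string `json:"labels"`
}

// VersionsResponse mirrors the body of GET /api/cluster/node/versions.
type VersionsResponse struct {
	Members []NodeVersion `json:"members"`
}

// parseVersions decodes a raw response body.
func parseVersions(body []byte) (VersionsResponse, error) {
	var resp VersionsResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

// matchLabels reports whether the node carries every label in want.
// Assumption: this subset rule is how the UI selects matching nodes.
func matchLabels(node NodeVersion, want map[string]string) bool {
	for k, v := range want {
		if node.Labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	body := []byte(`{"members":[{"ip":"10.0.1.134","version":"1.1.0","labels":{"app":"base","role":"debug"}}]}`)
	resp, err := parseVersions(body)
	if err != nil {
		panic(err)
	}
	for _, m := range resp.Members {
		fmt.Println(m.IP, m.Version, matchLabels(m, map[string]string{"app": "base"}))
	}
}
```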

/api/rollout (POST)

Initiates a firmware rollout for specified nodes.

Request Body:

{
  "firmware": {
    "name": "my-firmware",
    "version": "1.0.0",
    "labels": {"app": "base"}
  },
  "nodes": [
    {
      "ip": "10.0.1.134",
      "version": "1.1.0",
      "labels": {"app": "base", "role": "debug"}
    }
  ]
}

Response:

{
  "success": true,
  "message": "Rollout started for 3 nodes",
  "rolloutId": "rollout_1761076653",
  "totalNodes": 3,
  "firmwareUrl": "http://localhost:3002/firmware/my-firmware/1.0.0"
}
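
The request and response bodies above map onto Go structs as in the sketch below. The struct names, `startRollout`, and the `fakeGateway` test server are hypothetical helpers for local experiments; only the JSON field names and the `/api/rollout` path come from this document. The success status code is not documented here, so the client accepts any 2xx.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type Firmware struct {
	Name    string            `json:"name"`
	Version string            `json:"version"`
	Labels  map[string]string `json:"labels"`
}

type Node struct {
	IP      string            `json:"ip"`
	Version string            `json:"version"`
	Labels  map[string]string `json:"labels"`
}

type RolloutRequest struct {
	Firmware Firmware `json:"firmware"`
	Nodes    []Node   `json:"nodes"`
}

type RolloutResponse struct {
	Success     bool   `json:"success"`
	Message     string `json:"message"`
	RolloutID   string `json:"rolloutId"`
	TotalNodes  int    `json:"totalNodes"`
	FirmwareURL string `json:"firmwareUrl"`
}

// startRollout posts the request to <baseURL>/api/rollout and decodes the reply.
func startRollout(baseURL string, req RolloutRequest) (RolloutResponse, error) {
	var out RolloutResponse
	body, err := json.Marshal(req)
	if err != nil {
		return out, err
	}
	resp, err := http.Post(baseURL+"/api/rollout", "application/json", bytes.NewReader(body))
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return out, fmt.Errorf("rollout request failed: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// fakeGateway is a stand-in server (not the real gateway) that echoes the node count.
func fakeGateway() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/rollout", func(w http.ResponseWriter, r *http.Request) {
		var req RolloutRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(RolloutResponse{
			Success:    true,
			Message:    fmt.Sprintf("Rollout started for %d nodes", len(req.Nodes)),
			RolloutID:  "rollout_1761076653",
			TotalNodes: len(req.Nodes),
		})
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := fakeGateway()
	defer srv.Close()
	resp, err := startRollout(srv.URL, RolloutRequest{
		Firmware: Firmware{Name: "my-firmware", Version: "1.0.0", Labels: map[string]string{"app": "base"}},
		Nodes:    []Node{{IP: "10.0.1.134", Version: "1.1.0", Labels: map[string]string{"app": "base", "role": "debug"}}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.RolloutID, resp.Message)
}
```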

Rollout Process

1. Firmware Lookup

  • Gateway looks up firmware in registry by name and version
  • Validates firmware exists and is accessible

2. Parallel Node Processing

  • Each node is processed in a separate goroutine
  • Uses sync.WaitGroup for coordination
  • All N nodes of the rollout run concurrently, one goroutine per node
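
The goroutine-per-node pattern with `sync.WaitGroup` can be sketched as follows. `updateNode` is a hypothetical placeholder for the per-node work, and collecting errors in a mutex-guarded map is one possible choice rather than the gateway's confirmed behaviour.

```go
package main

import (
	"fmt"
	"sync"
)

// updateNode is a hypothetical stand-in for the real per-node update.
func updateNode(ip string) error {
	if ip == "" {
		return fmt.Errorf("empty node address")
	}
	return nil
}

// rolloutAll starts one goroutine per node, waits for all of them,
// and returns the errors of failed nodes keyed by IP.
func rolloutAll(ips []string, update func(string) error) map[string]error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex // guards errs, which is written from many goroutines
		errs = make(map[string]error)
	)
	for _, ip := range ips {
		wg.Add(1)
		go func(ip string) {
			defer wg.Done()
			if err := update(ip); err != nil {
				mu.Lock()
				errs[ip] = err
				mu.Unlock()
			}
		}(ip)
	}
	wg.Wait() // returns once every goroutine has called Done
	return errs
}

func main() {
	errs := rolloutAll([]string{"10.0.1.134", "10.0.1.135", ""}, updateNode)
	fmt.Println("failed nodes:", len(errs))
}
```

A failure in one goroutine does not stop the others, which matches the "rollout continues for remaining nodes" behaviour described in this document.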

3. Node Update Sequence

For each node:

  1. Status Update: Broadcast "updating" status via WebSocket
  2. Label Update: Update node's version label to new firmware version
  3. Firmware Upload: Upload firmware binary to node
  4. Status Completion: Broadcast "online" status via WebSocket

4. Error Handling

  • If a node fails, the gateway broadcasts "online" for it so the UI returns the node to its normal state
  • The rollout continues for the remaining nodes
  • Errors are logged in detail for debugging
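
The per-node sequence and its failure path can be sketched as below. `Broadcaster` and `NodeClient` are hypothetical interfaces introduced for illustration; the real gateway types are not shown in this document. The deferred broadcast makes the node report "online" again whether the update succeeds or fails.

```go
package main

import (
	"errors"
	"fmt"
)

// Broadcaster is a hypothetical abstraction over the WebSocket hub.
type Broadcaster interface {
	NodeStatus(ip, status string)
}

// NodeClient is a hypothetical abstraction over calls made to a node.
type NodeClient interface {
	SetVersionLabel(ip, version string) error
	UploadFirmware(ip string, firmware []byte) error
}

// updateNode runs the four steps for one node: status, label, upload, status.
func updateNode(b Broadcaster, c NodeClient, ip, version string, firmware []byte) error {
	b.NodeStatus(ip, "updating")
	defer b.NodeStatus(ip, "online") // sent on success and on failure

	if err := c.SetVersionLabel(ip, version); err != nil {
		return fmt.Errorf("label update on %s: %w", ip, err)
	}
	if err := c.UploadFirmware(ip, firmware); err != nil {
		return fmt.Errorf("firmware upload to %s: %w", ip, err)
	}
	return nil
}

// recorder and fakeNode are test doubles for running the sketch locally.
type recorder struct{ events []string }

func (r *recorder) NodeStatus(ip, status string) {
	r.events = append(r.events, ip+":"+status)
}

type fakeNode struct{ failUpload bool }

func (fakeNode) SetVersionLabel(ip, version string) error { return nil }

func (f fakeNode) UploadFirmware(ip string, firmware []byte) error {
	if f.failUpload {
		return errors.New("upload refused")
	}
	return nil
}

func main() {
	rec := &recorder{}
	err := updateNode(rec, fakeNode{failUpload: true}, "10.0.1.134", "1.0.0", []byte("fw"))
	fmt.Println(err)
	fmt.Println(rec.events)
}
```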

WebSocket Communication

Message Types

rollout_progress

{
  "type": "rollout_progress",
  "rolloutId": "rollout_1761076653",
  "nodeIp": "10.0.1.134",
  "status": "uploading",
  "current": 2,
  "total": 3,
  "progress": 67,
  "timestamp": "2025-01-21T20:05:00Z"
}

Status Values:

  • updating_labels: Node labels being updated
  • uploading: Firmware being uploaded to node
  • completed: Node update completed successfully
  • failed: Node update failed
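
A sketch of building a `rollout_progress` message in Go. The struct and helper names are illustrative. The percentage formula (completed nodes over total, rounded to the nearest integer) is an assumption inferred from the sample above, where 2 of 3 nodes yields 67.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"time"
)

// RolloutProgress mirrors the rollout_progress message shown above.
type RolloutProgress struct {
	Type      string `json:"type"`
	RolloutID string `json:"rolloutId"`
	NodeIP    string `json:"nodeIp"`
	Status    string `json:"status"`
	Current   int    `json:"current"`
	Total     int    `json:"total"`
	Progress  int    `json:"progress"`
	Timestamp string `json:"timestamp"`
}

// progressStatuses lists the documented status values.
var progressStatuses = map[string]bool{
	"updating_labels": true,
	"uploading":       true,
	"completed":       true,
	"failed":          true,
}

// percent returns current/total as a rounded percentage (assumed formula).
func percent(current, total int) int {
	if total <= 0 {
		return 0
	}
	return int(math.Round(float64(current) * 100 / float64(total)))
}

// newProgress builds a message and rejects undocumented status values.
func newProgress(rolloutID, ip, status string, current, total int, now time.Time) (RolloutProgress, error) {
	if !progressStatuses[status] {
		return RolloutProgress{}, fmt.Errorf("unknown status %q", status)
	}
	return RolloutProgress{
		Type:      "rollout_progress",
		RolloutID: rolloutID,
		NodeIP:    ip,
		Status:    status,
		Current:   current,
		Total:     total,
		Progress:  percent(current, total),
		Timestamp: now.UTC().Format(time.RFC3339),
	}, nil
}

func main() {
	msg, err := newProgress("rollout_1761076653", "10.0.1.134", "uploading", 2, 3, time.Now())
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(msg)
	fmt.Println(string(out))
}
```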

node_status_update

{
  "type": "node_status_update",
  "nodeIp": "10.0.1.134",
  "status": "updating",
  "timestamp": "2025-01-21T20:05:00Z"
}

Status Values:

  • updating: Node is being updated (blue indicator)
  • online: Node is online and operational (green indicator)
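
Because both message types share a `type` field, a consumer can decode that field first and then pick the concrete struct. The sketch below shows this two-step decode in Go; the struct names and the `describe` helper are hypothetical, and the actual UI is not necessarily written this way.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope reads only the discriminator field.
type envelope struct {
	Type string `json:"type"`
}

type RolloutProgress struct {
	RolloutID string `json:"rolloutId"`
	NodeIP    string `json:"nodeIp"`
	Status    string `json:"status"`
	Progress  int    `json:"progress"`
}

type NodeStatusUpdate struct {
	NodeIP string `json:"nodeIp"`
	Status string `json:"status"`
}

// describe decodes a raw WebSocket payload and returns a one-line summary.
func describe(raw []byte) (string, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return "", err
	}
	switch env.Type {
	case "rollout_progress":
		var m RolloutProgress
		if err := json.Unmarshal(raw, &m); err != nil {
			return "", err
		}
		return fmt.Sprintf("rollout %s: node %s is %s (%d%%)", m.RolloutID, m.NodeIP, m.Status, m.Progress), nil
	case "node_status_update":
		var m NodeStatusUpdate
		if err := json.Unmarshal(raw, &m); err != nil {
			return "", err
		}
		return fmt.Sprintf("node %s is %s", m.NodeIP, m.Status), nil
	default:
		return "", fmt.Errorf("unknown message type %q", env.Type)
	}
}

func main() {
	msg := []byte(`{"type":"node_status_update","nodeIp":"10.0.1.134","status":"updating","timestamp":"2025-01-21T20:05:00Z"}`)
	summary, err := describe(msg)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```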

UI Behavior

Rollout Panel

  • Shows firmware details and matching nodes
  • Displays node IP, current version, and labels
  • Provides "Rollout" button to initiate process

Real-time Updates

  • Node Status: Cluster view shows blue "updating" indicator during rollout
  • Progress Tracking: Rollout panel shows individual node status
  • Completion Detection: Automatically detects when all nodes complete

Status Indicators

  • Ready: Node ready for rollout (gray)
  • Updating: Node being updated (blue, accent-secondary color)
  • Completed: Node update completed (green)
  • Failed: Node update failed (red)

Registry Integration

Firmware Lookup

  • Gateway uses FindFirmwareByNameAndVersion() for direct lookup
  • No label-based matching required
  • Ensures exact firmware version is deployed

Proxy Endpoints

All registry operations are proxied through the gateway:

  • GET /api/registry/health - Registry health check
  • GET /api/registry/firmware - List firmware
  • POST /api/registry/firmware - Upload firmware
  • GET /api/registry/firmware/{name}/{version} - Download firmware
  • PUT /api/registry/firmware/{name}/{version} - Update firmware metadata
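
One way to implement such proxying in Go is the standard library's reverse proxy combined with `http.StripPrefix`. The sketch below assumes that the registry serves the same paths without the `/api/registry` prefix (for example `/firmware/{name}/{version}`, as the `firmwareUrl` in the rollout response suggests). The function names and the echo server are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newRegistryProxy forwards /api/registry/* to the registry with the prefix removed.
func newRegistryProxy(registryURL string) (http.Handler, error) {
	target, err := url.Parse(registryURL)
	if err != nil {
		return nil, err
	}
	proxy := httputil.NewSingleHostReverseProxy(target)
	return http.StripPrefix("/api/registry", proxy), nil
}

// fetchThroughProxy wires a fake registry behind the proxy and returns the
// path the registry saw. It exists only to exercise the sketch.
func fetchThroughProxy(path string) (string, error) {
	registry := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.URL.Path) // echo the upstream path
	}))
	defer registry.Close()

	handler, err := newRegistryProxy(registry.URL)
	if err != nil {
		return "", err
	}
	mux := http.NewServeMux()
	mux.Handle("/api/registry/", handler)
	gateway := httptest.NewServer(mux)
	defer gateway.Close()

	resp, err := http.Get(gateway.URL + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	upstream, err := fetchThroughProxy("/api/registry/firmware/my-firmware/1.0.0")
	if err != nil {
		panic(err)
	}
	fmt.Println("registry saw:", upstream)
}
```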

Error Handling

Common Error Scenarios

  1. Firmware Not Found: Returns 404 with specific error message
  2. Node Communication Failure: Logs error, continues with other nodes
  3. Registry Unavailable: Returns 503 service unavailable
  4. Invalid Request: Returns 400 with validation details
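
The mapping from error scenario to HTTP status can be kept in one place, as in this sketch. The sentinel error names and `statusFor` are hypothetical; only the status codes come from the list above. The fallback to 500 for unclassified errors is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors for the documented scenarios.
var (
	ErrFirmwareNotFound    = errors.New("firmware not found")
	ErrRegistryUnavailable = errors.New("registry unavailable")
	ErrInvalidRequest      = errors.New("invalid request")
)

// statusFor translates an error into the HTTP status returned to the caller.
// errors.Is also matches errors wrapped with %w.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrInvalidRequest):
		return http.StatusBadRequest // 400
	case errors.Is(err, ErrFirmwareNotFound):
		return http.StatusNotFound // 404
	case errors.Is(err, ErrRegistryUnavailable):
		return http.StatusServiceUnavailable // 503
	default:
		return http.StatusInternalServerError // assumed fallback
	}
}

func main() {
	err := fmt.Errorf("lookup my-firmware 1.0.0: %w", ErrFirmwareNotFound)
	fmt.Println(statusFor(err), err)
}
```

Node communication failures are absent from the mapping on purpose: according to the list above they are logged and the rollout continues, so they do not become an HTTP error.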

Logging

  • Detailed logs for each rollout step
  • Node-specific error tracking
  • Performance metrics (upload times, success rates)

Performance Considerations

Parallel Processing

  • Multiple nodes updated simultaneously
  • Configurable concurrency limits
  • Efficient resource utilization
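
A concurrency limit is commonly implemented in Go with a buffered channel used as a semaphore. The sketch below illustrates that technique; it is not the gateway's confirmed implementation, and `rolloutBounded`, `peakConcurrency`, and the `limit` parameter are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// rolloutBounded runs update for every node with at most limit calls in flight.
func rolloutBounded(ips []string, limit int, update func(string)) {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // one slot per allowed concurrent update
	var wg sync.WaitGroup
	for _, ip := range ips {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit updates are already running
		go func(ip string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot
			update(ip)
		}(ip)
	}
	wg.Wait()
}

// peakConcurrency measures the highest number of simultaneous updates
// observed for n nodes. It exists only to exercise the sketch.
func peakConcurrency(n, limit int) int {
	var inFlight, peak int32
	rolloutBounded(make([]string, n), limit, func(string) {
		cur := atomic.AddInt32(&inFlight, 1)
		for {
			old := atomic.LoadInt32(&peak)
			if cur <= old || atomic.CompareAndSwapInt32(&peak, old, cur) {
				break
			}
		}
		time.Sleep(time.Millisecond) // simulate work
		atomic.AddInt32(&inFlight, -1)
	})
	return int(atomic.LoadInt32(&peak))
}

func main() {
	fmt.Println("peak within limit:", peakConcurrency(8, 2) <= 2)
}
```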

WebSocket Optimization

  • Batched status updates
  • Efficient message serialization
  • Connection pooling for registry calls

Memory Management

  • Streaming firmware downloads
  • Bounded goroutine pools
  • Proper resource cleanup
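
Streaming can be sketched by passing the registry's response body directly as the request body of the upload, so the firmware image is never held in memory as a whole. In the sketch below, `relayFirmware`, `relayDemo`, and the `/upload` path are hypothetical: the node's upload endpoint is not specified in this document.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// relayFirmware streams the binary from firmwareURL to uploadURL.
func relayFirmware(client *http.Client, firmwareURL, uploadURL string) error {
	src, err := client.Get(firmwareURL)
	if err != nil {
		return fmt.Errorf("download: %w", err)
	}
	defer src.Body.Close()
	if src.StatusCode != http.StatusOK {
		return fmt.Errorf("download: unexpected status %s", src.Status)
	}

	// The download body is consumed chunk by chunk while the upload is sent.
	req, err := http.NewRequest(http.MethodPost, uploadURL, src.Body)
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/octet-stream")
	req.ContentLength = src.ContentLength // -1 (unknown) results in chunked encoding

	dst, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("upload: %w", err)
	}
	defer dst.Body.Close()
	io.Copy(io.Discard, dst.Body) // drain so the connection can be reused
	if dst.StatusCode >= 300 {
		return fmt.Errorf("upload: unexpected status %s", dst.Status)
	}
	return nil
}

// relayDemo runs a fake registry and a fake node and returns what the node received.
func relayDemo(payload string) (string, error) {
	registry := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, payload)
	}))
	defer registry.Close()

	received := make(chan string, 1)
	node := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		received <- string(body)
	}))
	defer node.Close()

	err := relayFirmware(http.DefaultClient, registry.URL+"/firmware/my-firmware/1.0.0", node.URL+"/upload")
	if err != nil {
		return "", err
	}
	return <-received, nil
}

func main() {
	got, err := relayDemo("firmware-bytes")
	if err != nil {
		panic(err)
	}
	fmt.Println("node received", len(got), "bytes")
}
```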