feature/node-info-sync #8
@@ -54,7 +54,52 @@ The cluster uses a UDP-based discovery protocol for automatic node detection:
 - **Node Info Message**: `CLUSTER_NODE_INFO:<hostname>:<json>`
 - **Broadcast Address**: 255.255.255.255
 - **Discovery Interval**: `Config.discovery_interval_ms` (default 1000 ms)
-- **Listen Interval**: `Config.discovery_interval_ms / 10` (default 100 ms)
+- **Listen Interval**: `Config.cluster_listen_interval_ms` (default 10 ms)
+- **Heartbeat Interval**: `Config.heartbeat_interval_ms` (default 5000 ms)
+
+### Message Formats
+
+- **Discovery**: `CLUSTER_DISCOVERY`
+  - Sender: any node, broadcast to 255.255.255.255:`udp_port`
+  - Purpose: announce presence and solicit peer identification
+- **Response**: `CLUSTER_RESPONSE:<hostname>`
+  - Sender: node receiving a discovery; unicast to requester IP
+  - Purpose: provide hostname so requester can register/update member
+- **Heartbeat**: `CLUSTER_HEARTBEAT:<hostname>`
+  - Sender: each node, broadcast to 255.255.255.255:`udp_port` on interval
+  - Purpose: prompt peers to reply with their node info and keep liveness
+- **Node Info**: `CLUSTER_NODE_INFO:<hostname>:<json>`
+  - Sender: node receiving a heartbeat; unicast to heartbeat sender IP
+  - JSON fields: freeHeap, chipId, sdkVersion, cpuFreqMHz, flashChipSize, optional labels
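
The four message formats above can be told apart by their prefix and split on the first one or two colons. The following is a minimal Python sketch for illustration only; the project's own implementation is not shown in this diff. The function name `parse_message` is hypothetical, and the sketch assumes hostnames never contain a colon and that `labels`, when present, is a JSON object.

```python
import json

def parse_message(payload: str):
    """Split a payload into (kind, hostname, info).

    hostname and info are None for message kinds that do not carry them.
    Assumes hostnames contain no ':' (an assumption, not stated in the docs).
    """
    if payload == "CLUSTER_DISCOVERY":
        return ("CLUSTER_DISCOVERY", None, None)

    kind, sep, rest = payload.partition(":")
    if not sep:
        raise ValueError(f"unrecognized message: {payload!r}")

    if kind in ("CLUSTER_RESPONSE", "CLUSTER_HEARTBEAT"):
        return (kind, rest, None)

    if kind == "CLUSTER_NODE_INFO":
        # The JSON body itself contains colons, so split only once more.
        hostname, sep, body = rest.partition(":")
        if not sep:
            raise ValueError("node info message is missing its JSON body")
        return (kind, hostname, json.loads(body))

    raise ValueError(f"unrecognized message: {payload!r}")
```

Splitting only on the leading colons matters for `CLUSTER_NODE_INFO`, because a naive `payload.split(":")` would also cut the JSON body apart.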
+
+### Discovery Flow
+
+1. **Sender broadcasts** `CLUSTER_DISCOVERY`
+2. **Each receiver responds** with `CLUSTER_RESPONSE:<hostname>` to the sender IP
+3. **Sender registers/updates** the node using hostname and source IP
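
The three discovery steps above can be sketched with plain UDP sockets. This is a hedged Python illustration, not the project's code: the helper names are hypothetical, the member registry is reduced to a `dict` of hostname to IP, and replying to the shared `udp_port` on the requester's IP is an assumption (the docs only say the reply is unicast to the requester IP).

```python
import socket

BROADCAST_ADDR = "255.255.255.255"

def make_broadcast_socket() -> socket.socket:
    """Create a UDP socket that is allowed to send to the broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock

def send_discovery(sock, udp_port: int) -> None:
    # Step 1: broadcast the discovery message to every node on the segment.
    sock.sendto(b"CLUSTER_DISCOVERY", (BROADCAST_ADDR, udp_port))

def respond_to_discovery(sock, hostname: str, requester_ip: str, udp_port: int) -> None:
    # Step 2: unicast our hostname back to whoever sent the discovery.
    # Assumption: the reply goes to the same udp_port on the requester.
    sock.sendto(f"CLUSTER_RESPONSE:{hostname}".encode(), (requester_ip, udp_port))

def register_member(members: dict, payload: bytes, source_ip: str) -> None:
    # Step 3: add or update the member from the hostname and the source IP.
    prefix = b"CLUSTER_RESPONSE:"
    if payload.startswith(prefix):
        members[payload[len(prefix):].decode()] = source_ip
```

Without `SO_BROADCAST`, most operating systems reject a `sendto` to 255.255.255.255, which is why the socket helper sets it explicitly.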
+
+### Heartbeat Flow
+
+1. **A node broadcasts** `CLUSTER_HEARTBEAT:<hostname>`
+2. **Each receiver replies** with `CLUSTER_NODE_INFO:<hostname>:<json>` to the heartbeat sender IP
+3. **The sender**:
+   - Ensures the node exists or creates it with `hostname` and sender IP
+   - Parses JSON and updates resources, labels, `status = ACTIVE`, `lastSeen = now`
+   - Sets `latency = now - lastHeartbeatSentAt` (per-node, measured at heartbeat origin)
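
Step 3 of the heartbeat flow, as seen from the node that sent the heartbeat, might look like the Python sketch below. The `Node` record and `apply_node_info` are hypothetical names for illustration; timestamps are assumed to be integer milliseconds, and `labels` is assumed to be a JSON object when present.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Node:
    hostname: str
    ip: str
    status: str = "ACTIVE"
    last_seen_ms: int = 0
    latency_ms: int = 0
    resources: dict = field(default_factory=dict)
    labels: dict = field(default_factory=dict)

def apply_node_info(members: dict, hostname: str, sender_ip: str,
                    json_text: str, now_ms: int, last_heartbeat_sent_ms: int) -> Node:
    """Apply one CLUSTER_NODE_INFO reply to the member table."""
    # Ensure the node exists, creating it from the hostname and sender IP.
    node = members.get(hostname)
    if node is None:
        node = members[hostname] = Node(hostname, sender_ip)

    # Parse the JSON body; labels are optional, everything else is a resource.
    info = json.loads(json_text)
    node.labels = info.pop("labels", {})
    node.resources = info

    node.status = "ACTIVE"
    node.last_seen_ms = now_ms
    # Latency is measured at the heartbeat origin: reply time minus send time.
    node.latency_ms = now_ms - last_heartbeat_sent_ms
    return node
```

For example, a reply received at `now_ms = 5040` for a heartbeat sent at `5000` records a latency of 40 ms. Because the measurement covers the full broadcast-and-reply round trip, it includes the peer's listener delay as well as network time.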
+
+### Listener Behavior
+
+The `cluster_listen` task parses one UDP packet per run and dispatches by prefix to:
+- **Discovery** → send `CLUSTER_RESPONSE`
+- **Heartbeat** → send `CLUSTER_NODE_INFO` JSON
+- **Response** → add/update node using provided hostname and source IP
+- **Node Info** → update resources/status/labels and record latency
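
The one-packet-per-run dispatch described above can be modelled without any networking. In this Python sketch, `listen_once` is a hypothetical name, the socket is replaced by an in-memory queue of `(payload, source_ip)` pairs, and the handler returns a description of the action instead of performing it. A malformed node info payload raises `ValueError` here; how the real task treats malformed packets is not stated.

```python
import json
from collections import deque

def listen_once(queue: deque, local_hostname: str, local_info: dict):
    """Handle at most one queued packet and describe the resulting action."""
    if not queue:
        return None  # nothing pending on this run
    payload, source_ip = queue.popleft()

    if payload.startswith("CLUSTER_DISCOVERY"):
        return ("send", source_ip, f"CLUSTER_RESPONSE:{local_hostname}")
    if payload.startswith("CLUSTER_HEARTBEAT:"):
        body = json.dumps(local_info)
        return ("send", source_ip, f"CLUSTER_NODE_INFO:{local_hostname}:{body}")
    if payload.startswith("CLUSTER_RESPONSE:"):
        # Add or update the member from the provided hostname and source IP.
        return ("upsert", source_ip, payload.split(":", 1)[1])
    if payload.startswith("CLUSTER_NODE_INFO:"):
        # Split twice only, so colons inside the JSON body are preserved.
        _, hostname, body = payload.split(":", 2)
        return ("update", source_ip, (hostname, json.loads(body)))
    return ("ignore", source_ip, payload)
```

Because only one packet is consumed per call, a burst of replies drains at the listen interval rather than all at once, which is one reason the interval is kept short.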
+
+### Timing and Intervals
+
+- **UDP Port**: `Config.udp_port` (default 4210)
+- **Discovery Interval**: `Config.discovery_interval_ms` (default 1000 ms)
+- **Listen Interval**: `Config.cluster_listen_interval_ms` (default 10 ms)
 - **Heartbeat Interval**: `Config.heartbeat_interval_ms` (default 5000 ms)
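
To make the defaults concrete, the settings listed here can be mirrored in a small Python structure. The field names and default values come from the documentation; the `Config` class shape and the helper `max_packets_per_second` are illustrative assumptions. The helper assumes the listener runs exactly on its interval and parses one packet per run, as described under Listener Behavior.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    udp_port: int = 4210
    discovery_interval_ms: int = 1000
    cluster_listen_interval_ms: int = 10
    heartbeat_interval_ms: int = 5000

def max_packets_per_second(cfg: Config) -> int:
    """Upper bound on packets handled per second at one packet per listen run."""
    return 1000 // cfg.cluster_listen_interval_ms
```

Under those assumptions, the new 10 ms default allows up to 100 packets per second, compared with 10 per second at the previous 100 ms listen interval.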

 ### Node Status Categories
@@ -73,11 +118,11 @@ The system runs several background tasks at different intervals:

 | Task | Interval (default) | Purpose |
 |------|--------------------|---------|
-| `discovery_send` | 1000 ms | Send UDP discovery packets |
-| `cluster_listen` | 100 ms | Listen for discovery/heartbeat/node-info |
+| `cluster_discovery` | 1000 ms | Send UDP discovery packets |
+| `cluster_listen` | 10 ms | Listen for discovery/heartbeat/node-info |
 | `status_update` | 1000 ms | Update node status categories, purge dead |
 | `heartbeat` | 5000 ms | Broadcast heartbeat and update local resources |
-| `update_members_info` | 10000 ms | Reserved; no-op (info via UDP) |
+| `cluster_update_members_info` | 10000 ms | Reserved; no-op (info via UDP) |
 | `print_members` | 5000 ms | Log current member list |
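
The interval table can be read as a simple cooperative schedule. The Python sketch below uses the renamed task names and default intervals from the new side of the table; the `due_tasks` helper and the idea of polling with a millisecond clock are assumptions for illustration, not a description of the project's actual task manager.

```python
# Default intervals in milliseconds, taken from the task table.
TASK_INTERVALS_MS = {
    "cluster_discovery": 1000,
    "cluster_listen": 10,
    "status_update": 1000,
    "heartbeat": 5000,
    "cluster_update_members_info": 10000,
    "print_members": 5000,
}

def due_tasks(now_ms: int, last_run_ms: dict, intervals: dict = TASK_INTERVALS_MS) -> list:
    """Return the tasks whose interval has elapsed and stamp them as run now.

    A task that has never run is treated as due immediately.
    """
    due = []
    for name, interval in intervals.items():
        last = last_run_ms.get(name)
        if last is None or now_ms - last >= interval:
            due.append(name)
            last_run_ms[name] = now_ms
    return due
```

Polling this helper every 10 ms runs `cluster_listen` on every tick, while `heartbeat` and `print_members` fire once per 5000 ms.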

 ### Task Management Features