diff --git a/docs/Architecture.md b/docs/Architecture.md
index 349353c..dfbb4ba 100644
--- a/docs/Architecture.md
+++ b/docs/Architecture.md
@@ -54,7 +54,52 @@ The cluster uses a UDP-based discovery protocol for automatic node detection:
 - **Node Info Message**: `CLUSTER_NODE_INFO::`
 - **Broadcast Address**: 255.255.255.255
 - **Discovery Interval**: `Config.discovery_interval_ms` (default 1000 ms)
-- **Listen Interval**: `Config.discovery_interval_ms / 10` (default 100 ms)
+- **Listen Interval**: `Config.cluster_listen_interval_ms` (default 10 ms)
+- **Heartbeat Interval**: `Config.heartbeat_interval_ms` (default 5000 ms)
+
+### Message Formats
+
+- **Discovery**: `CLUSTER_DISCOVERY`
+  - Sender: any node, broadcast to 255.255.255.255:`udp_port`
+  - Purpose: announce presence and solicit peer identification
+- **Response**: `CLUSTER_RESPONSE:`
+  - Sender: node receiving a discovery; unicast to requester IP
+  - Purpose: provide hostname so requester can register/update member
+- **Heartbeat**: `CLUSTER_HEARTBEAT:`
+  - Sender: each node, broadcast to 255.255.255.255:`udp_port` on interval
+  - Purpose: prompt peers to reply with their node info and keep liveness
+- **Node Info**: `CLUSTER_NODE_INFO::`
+  - Sender: node receiving a heartbeat; unicast to heartbeat sender IP
+  - JSON fields: freeHeap, chipId, sdkVersion, cpuFreqMHz, flashChipSize, optional labels
+
+### Discovery Flow
+
+1. **Sender broadcasts** `CLUSTER_DISCOVERY`
+2. **Each receiver responds** with `CLUSTER_RESPONSE:` to the sender IP
+3. **Sender registers/updates** the node using hostname and source IP
+
+### Heartbeat Flow
+
+1. **A node broadcasts** `CLUSTER_HEARTBEAT:`
+2. **Each receiver replies** with `CLUSTER_NODE_INFO::` to the heartbeat sender IP
+3. **The sender**:
+   - Ensures the node exists or creates it with `hostname` and sender IP
+   - Parses JSON and updates resources, labels, `status = ACTIVE`, `lastSeen = now`
+   - Sets `latency = now - lastHeartbeatSentAt` (per-node, measured at heartbeat origin)
+
+### Listener Behavior
+
+The `cluster_listen` task parses one UDP packet per run and dispatches by prefix to:
+- **Discovery** → send `CLUSTER_RESPONSE`
+- **Heartbeat** → send `CLUSTER_NODE_INFO` JSON
+- **Response** → add/update node using provided hostname and source IP
+- **Node Info** → update resources/status/labels and record latency
+
+### Timing and Intervals
+
+- **UDP Port**: `Config.udp_port` (default 4210)
+- **Discovery Interval**: `Config.discovery_interval_ms` (default 1000 ms)
+- **Listen Interval**: `Config.cluster_listen_interval_ms` (default 10 ms)
 - **Heartbeat Interval**: `Config.heartbeat_interval_ms` (default 5000 ms)
 
 ### Node Status Categories
@@ -73,11 +118,11 @@ The system runs several background tasks at different intervals:
 
 | Task | Interval (default) | Purpose |
 |------|--------------------|---------|
-| `discovery_send` | 1000 ms | Send UDP discovery packets |
-| `cluster_listen` | 100 ms | Listen for discovery/heartbeat/node-info |
+| `cluster_discovery` | 1000 ms | Send UDP discovery packets |
+| `cluster_listen` | 10 ms | Listen for discovery/heartbeat/node-info |
 | `status_update` | 1000 ms | Update node status categories, purge dead |
 | `heartbeat` | 5000 ms | Broadcast heartbeat and update local resources |
-| `update_members_info` | 10000 ms | Reserved; no-op (info via UDP) |
+| `cluster_update_members_info` | 10000 ms | Reserved; no-op (info via UDP) |
 | `print_members` | 5000 ms | Log current member list |
 
 ### Task Management Features
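
Note on the added "Listener Behavior" section: the prefix dispatch it describes could be sketched roughly as below. This is an illustration in Python, not the project's code. The `dispatch` function, its parameters, and the `members` dictionary are hypothetical, and the payload layout after each prefix (a hostname for `CLUSTER_RESPONSE:`, then hostname, a colon, and a JSON object for `CLUSTER_NODE_INFO:`) is an assumption, since the diff shows only the prefixes.

```python
import json

# Prefixes as they appear in the documented message formats.
DISCOVERY = "CLUSTER_DISCOVERY"
RESPONSE = "CLUSTER_RESPONSE:"
HEARTBEAT = "CLUSTER_HEARTBEAT:"
NODE_INFO = "CLUSTER_NODE_INFO:"


def dispatch(packet, sender_ip, members, my_hostname, my_info,
             now_ms, heartbeat_sent_ms):
    """Handle one packet; return the reply to unicast to sender_ip, or None.

    Hypothetical helper: payload layouts are assumed, not taken from the diff.
    """
    if packet.startswith(NODE_INFO):
        # Assumed layout: <hostname>:<json>. Split on the first colon only,
        # because the JSON body itself contains colons.
        hostname, _, body = packet[len(NODE_INFO):].partition(":")
        node = members.setdefault(hostname, {})
        node.update(json.loads(body))
        node.update(ip=sender_ip, status="ACTIVE", lastSeen=now_ms,
                    latency=now_ms - heartbeat_sent_ms)
        return None
    if packet.startswith(HEARTBEAT):
        # Reply with this node's own resource info as JSON.
        return NODE_INFO + my_hostname + ":" + json.dumps(my_info)
    if packet.startswith(RESPONSE):
        # Assumed layout: the rest of the packet is the peer's hostname.
        hostname = packet[len(RESPONSE):]
        members.setdefault(hostname, {})["ip"] = sender_ip
        return None
    if packet.startswith(DISCOVERY):
        return RESPONSE + my_hostname
    return None  # Unknown prefix: ignore the packet.
```

Returning the reply instead of sending it keeps the sketch free of socket code; a real listener would unicast the returned string to the packet's source IP on `Config.udp_port`.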