docs: update
@@ -25,9 +25,9 @@ The system architecture consists of several key components working together:
- **Service Registry**: Track available services across the cluster

### Task Scheduler

-- **Cooperative Multitasking**: Background task management system
-- **Task Lifecycle Management**: Automatic task execution and monitoring
-- **Resource Optimization**: Efficient task scheduling and execution
+- **Cooperative Multitasking**: Background task management system (`TaskManager`)
+- **Task Lifecycle Management**: Enable/disable tasks and set intervals at runtime
+- **Execution Model**: Tasks run in `Spore::loop()` when their interval elapses
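A minimal sketch of how a task might be registered with this scheduler. The `Spore`/`TaskManager` entry points used here (`tasks()`, `addTask`, `setInterval`, `enable`) are assumed names for illustration, not the project's confirmed API:

```cpp
#include <Arduino.h>
#include "Spore.h"   // assumed header name

Spore spore;

void setup() {
  Serial.begin(115200);

  // Register a cooperative task: name, interval in ms, callback (names assumed).
  spore.tasks().addTask("print_members", 5000, []() {
    Serial.println("[task] printing cluster members");
  });

  // Intervals and enablement can be changed at runtime.
  spore.tasks().setInterval("print_members", 10000);
  spore.tasks().enable("print_members");
}

void loop() {
  // Tasks whose interval has elapsed run cooperatively inside Spore::loop().
  spore.loop();
}
```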

### Node Context

- **Central Context**: Shared resources and configuration
@@ -40,27 +40,30 @@ The cluster uses a UDP-based discovery protocol for automatic node detection:

### Discovery Process

-1. **Discovery Broadcast**: Nodes periodically send UDP packets on port 4210
-2. **Response Handling**: Nodes respond with their hostname and IP address
-3. **Member Management**: Discovered nodes are automatically added to the cluster
-4. **Health Monitoring**: Continuous status checking via HTTP API calls
+1. **Discovery Broadcast**: Nodes periodically send UDP packets on port `udp_port` (default 4210)
+2. **Response Handling**: Nodes respond with `CLUSTER_RESPONSE:<hostname>`
+3. **Member Management**: Discovered nodes are added/updated in the cluster
+4. **Node Info via UDP**: Heartbeat triggers peers to send `CLUSTER_NODE_INFO:<hostname>:<json>`
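Roughly what the broadcast side of step 1 looks like with the stock Arduino `WiFiUDP` API; the constants mirror the defaults documented below and would normally come from `Config`:

```cpp
#include <WiFi.h>       // ESP32 core; use <ESP8266WiFi.h> on ESP8266
#include <WiFiUdp.h>

WiFiUDP udp;
const uint16_t kUdpPort = 4210;                    // Config.udp_port default
const unsigned long kDiscoveryIntervalMs = 1000;   // Config.discovery_interval_ms default
unsigned long lastDiscovery = 0;

void setup() {
  // WiFi connection omitted; see the WiFi fallback section.
  udp.begin(kUdpPort);                             // listen on the cluster UDP port
}

void loop() {
  // Periodically broadcast the discovery message to the local subnet.
  if (millis() - lastDiscovery >= kDiscoveryIntervalMs) {
    lastDiscovery = millis();
    udp.beginPacket(IPAddress(255, 255, 255, 255), kUdpPort);
    udp.print("CLUSTER_DISCOVERY");
    udp.endPacket();
  }
}
```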

### Protocol Details

-- **UDP Port**: 4210 (configurable)
+- **UDP Port**: 4210 (configurable via `Config.udp_port`)
- **Discovery Message**: `CLUSTER_DISCOVERY`
- **Response Message**: `CLUSTER_RESPONSE`
- **Heartbeat Message**: `CLUSTER_HEARTBEAT`
- **Node Info Message**: `CLUSTER_NODE_INFO:<hostname>:<json>`
- **Broadcast Address**: 255.255.255.255
-- **Discovery Interval**: 1 second (configurable)
-- **Listen Interval**: 100ms (configurable)
+- **Discovery Interval**: `Config.discovery_interval_ms` (default 1000 ms)
+- **Listen Interval**: `Config.discovery_interval_ms / 10` (default 100 ms)
+- **Heartbeat Interval**: `Config.heartbeat_interval_ms` (default 5000 ms)
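A sketch of the receive path that dispatches on these message prefixes (the `discovery_listen` side). The member-management helpers are placeholders for the project's actual cluster code:

```cpp
#include <WiFiUdp.h>

// Placeholder declarations for the cluster bookkeeping this sketch assumes.
void addOrUpdateMember(const String& hostname, IPAddress ip);
void touchMember(IPAddress ip);
void updateMemberInfo(const String& hostnameAndJson);

void handleUdpPacket(WiFiUDP& udp, const String& myHostname) {
  int size = udp.parsePacket();
  if (size <= 0) return;

  char buf[256];
  int len = udp.read(buf, sizeof(buf) - 1);
  if (len <= 0) return;
  buf[len] = '\0';
  String msg(buf);

  if (msg == "CLUSTER_DISCOVERY") {
    // Answer with our hostname so the sender can register this node.
    udp.beginPacket(udp.remoteIP(), udp.remotePort());
    udp.print("CLUSTER_RESPONSE:");
    udp.print(myHostname);
    udp.endPacket();
  } else if (msg.startsWith("CLUSTER_RESPONSE:")) {
    addOrUpdateMember(msg.substring(17), udp.remoteIP());
  } else if (msg.startsWith("CLUSTER_HEARTBEAT")) {
    touchMember(udp.remoteIP());                 // refresh lastSeen
  } else if (msg.startsWith("CLUSTER_NODE_INFO:")) {
    updateMemberInfo(msg.substring(18));         // "<hostname>:<json>"
  }
}
```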

### Node Status Categories

Nodes are automatically categorized by their activity:

-- **ACTIVE**: Responding within 10 seconds
-- **INACTIVE**: No response for 10-60 seconds
-- **DEAD**: No response for over 60 seconds
+- **ACTIVE**: `lastSeen` age < `node_inactive_threshold_ms` (default 10 s)
+- **INACTIVE**: `lastSeen` age ≥ `node_inactive_threshold_ms` and < `node_dead_threshold_ms` (default 120 s)
+- **DEAD**: `lastSeen` age ≥ `node_dead_threshold_ms`
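The corresponding check in the `status_update` task might look like the following; the function and its default values are a sketch built from the thresholds named above:

```cpp
// Bucket a cluster member by how long ago it was last heard from.
enum class NodeStatus { ACTIVE, INACTIVE, DEAD };

NodeStatus classify(unsigned long lastSeenMs, unsigned long nowMs,
                    unsigned long inactiveThresholdMs = 10000,   // node_inactive_threshold_ms
                    unsigned long deadThresholdMs = 120000) {    // node_dead_threshold_ms
  unsigned long age = nowMs - lastSeenMs;     // ms since the last packet from this node
  if (age < inactiveThresholdMs) return NodeStatus::ACTIVE;
  if (age < deadThresholdMs)     return NodeStatus::INACTIVE;
  return NodeStatus::DEAD;                    // candidate for purging
}
```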

## Task Scheduling System
@@ -68,14 +71,14 @@ The system runs several background tasks at different intervals:

### Core System Tasks

-| Task | Interval | Purpose |
-|------|----------|---------|
-| **Discovery Send** | 1 second | Send UDP discovery packets |
-| **Discovery Listen** | 100ms | Listen for discovery responses |
-| **Status Updates** | 1 second | Monitor cluster member health |
-| **Heartbeat** | 2 seconds | Maintain cluster connectivity |
-| **Member Info** | 10 seconds | Update detailed node information |
-| **Debug Output** | 5 seconds | Print cluster status |
+| Task | Interval (default) | Purpose |
+|------|--------------------|---------|
+| `discovery_send` | 1000 ms | Send UDP discovery packets |
+| `discovery_listen` | 100 ms | Listen for discovery/heartbeat/node-info messages |
+| `status_update` | 1000 ms | Update node status categories, purge dead nodes |
+| `heartbeat` | 5000 ms | Broadcast heartbeat and update local resources |
+| `update_members_info` | 10000 ms | Reserved; no-op (node info arrives via UDP) |
+| `print_members` | 5000 ms | Log the current member list |

### Task Management Features
@@ -112,10 +115,7 @@ ctx.fire("cluster_updated", &clusterData);

### Available Events

-- **`node_discovered`**: New node added to cluster
-- **`cluster_updated`**: Cluster membership changed
-- **`resource_update`**: Node resources updated
-- **`health_check`**: Node health status changed
+- **`node_discovered`**: New node added or local node refreshed
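Firing an event matches the `ctx.fire(...)` call shown in the hunk header above; the subscription side sketched here (`ctx.on`) and the payload type are assumptions, not a documented API:

```cpp
// Hypothetical subscription: react when a new node joins the cluster.
ctx.on("node_discovered", [](void* payload) {
  auto* node = static_cast<NodeInfo*>(payload);   // assumed payload type
  Serial.printf("discovered node: %s\n", node->hostname.c_str());
});

// Firing follows the pattern from the code above.
ctx.fire("node_discovered", &newNode);
```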

## Resource Monitoring
@@ -155,10 +155,8 @@ The system includes automatic WiFi fallback for robust operation:

### Configuration

-- **SSID Format**: `SPORE_<MAC_LAST_4>`
-- **Password**: Configurable fallback password
-- **IP Range**: 192.168.4.x subnet
-- **Gateway**: 192.168.4.1
+- **Hostname**: Derived from MAC (`esp-<mac>`) and assigned to `ctx.hostname`
+- **AP Mode**: If STA connection fails, device switches to AP mode with configured SSID/password
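A sketch of the STA-then-AP fallback this section describes, using the standard ESP Arduino WiFi API; the timeout and the `SPORE_<MAC_LAST_4>` SSID pattern (taken from the earlier revision of this list) are illustrative:

```cpp
#include <WiFi.h>   // ESP32 core; use <ESP8266WiFi.h> on ESP8266

void connectWithFallback(const char* ssid, const char* pass, const char* apPass) {
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, pass);

  // Wait up to ~15 s for the station connection (timeout is an assumption).
  unsigned long start = millis();
  while (WiFi.status() != WL_CONNECTED && millis() - start < 15000) {
    delay(250);
  }

  if (WiFi.status() != WL_CONNECTED) {
    // Fall back to AP mode so the node stays reachable on its own subnet.
    String mac = WiFi.macAddress();                  // e.g. "AA:BB:CC:DD:EE:FF"
    mac.replace(":", "");
    String apSsid = String("SPORE_") + mac.substring(mac.length() - 4);
    WiFi.mode(WIFI_AP);
    WiFi.softAP(apSsid.c_str(), apPass);             // clients get 192.168.4.x, gateway 192.168.4.1
  }
}
```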

## Cluster Topology
@@ -170,32 +168,30 @@ The system includes automatic WiFi fallback for robust operation:

### Network Architecture

-- **Mesh-like Structure**: Nodes can communicate with each other
-- **Dynamic Routing**: Automatic path discovery between nodes
-- **Load Distribution**: Tasks distributed across available nodes
-- **Fault Tolerance**: Automatic failover and recovery
+- UDP broadcast-based discovery and heartbeats on local subnet
+- Optional HTTP polling (disabled by default; node info exchanged via UDP)

## Data Flow

### Node Discovery
1. **UDP Broadcast**: Nodes broadcast discovery packets on port 4210
-2. **UDP Response**: Receiving nodes responds with hostname
+2. **UDP Response**: Receiving nodes respond with hostname
3. **Registration**: Discovered nodes are added to local cluster member list

### Health Monitoring
-1. **Periodic Checks**: Cluster manager polls member nodes every 1 second
-2. **Status Collection**: Each node returns resource usage and health metrics
+1. **Periodic Checks**: Cluster manager updates node status categories
+2. **Status Collection**: Each node updates resources via UDP node-info messages
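A sketch of what a peer might send once a heartbeat prompts it to report its resources. Only the `CLUSTER_NODE_INFO:<hostname>:<json>` framing comes from the protocol description; the JSON field names and the use of ArduinoJson are assumptions:

```cpp
#include <ArduinoJson.h>
#include <WiFiUdp.h>

// Reply to the heartbeat sender with this node's current resource snapshot.
void sendNodeInfo(WiFiUDP& udp, const String& hostname,
                  IPAddress to, uint16_t port) {
  StaticJsonDocument<192> doc;
  doc["free_heap"] = ESP.getFreeHeap();   // example resource metrics (field names assumed)
  doc["uptime_ms"] = millis();

  String json;
  serializeJson(doc, json);

  udp.beginPacket(to, port);
  udp.print("CLUSTER_NODE_INFO:");
  udp.print(hostname);
  udp.print(":");
  udp.print(json);
  udp.endPacket();
}
```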

### Task Management
-1. **Scheduling**: TaskScheduler executes registered tasks at configured intervals
-2. **Execution**: Tasks run cooperatively, yielding control to other tasks
-3. **Monitoring**: Task status and results are exposed via REST API endpoints
+1. **Scheduling**: `TaskManager` executes registered tasks at configured intervals
+2. **Execution**: Tasks run cooperatively in the main loop without preemption
+3. **Monitoring**: Task status is exposed via REST (`/api/tasks/status`)

## Performance Characteristics

### Memory Usage

-- **Base System**: ~15-20KB RAM
+- **Base System**: ~15-20KB RAM (device dependent)
- **Per Task**: ~100-200 bytes per task
- **Cluster Members**: ~50-100 bytes per member
- **API Endpoints**: ~20-30 bytes per endpoint
@@ -219,7 +215,7 @@ The system includes automatic WiFi fallback for robust operation:

### Current Implementation

- **Network Access**: Local network only (no internet exposure)
-- **Authentication**: None currently implemented
+- **Authentication**: None currently implemented; LAN-only access assumed
- **Data Validation**: Basic input validation
- **Resource Limits**: Memory and processing constraints