From 175ed8cae8d119684082c3c25d95592865865f0f Mon Sep 17 00:00:00 2001
From: Patrick Balsiger
Date: Thu, 21 Aug 2025 20:19:16 +0200
Subject: [PATCH] docs: update

---
 README.md | 252 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 213 insertions(+), 39 deletions(-)

diff --git a/README.md b/README.md
index 13f6d96..3f5fb4e 100644
--- a/README.md
+++ b/README.md
@@ -1,81 +1,255 @@
 # SPORE
 
-> SProcket ORchestration Engine
+> **S**Procket **OR**chestration **E**ngine
 
-SPORE is a simple cluster engine for ESP8266 microcontrollers.
+SPORE is a basic cluster orchestration engine for ESP8266 microcontrollers that provides automatic node discovery, health monitoring, and over-the-air updates in a distributed network environment.
 
 ## Features
 
-- WiFi STA / AP
-- auto discovery over UDP
-- service registry
-- pub/sub event system
-- Over-The-Air updates
+- **WiFi Management**: Automatic WiFi STA/AP configuration with hostname generation
+- **Auto Discovery**: UDP-based node discovery with automatic cluster membership
+- **Service Registry**: Dynamic API endpoint discovery and registration
+- **Health Monitoring**: Real-time node status tracking with resource monitoring
+- **Event System**: Local and cluster-wide event publishing/subscription
+- **Over-The-Air Updates**: Seamless firmware updates across the cluster
+- **RESTful API**: HTTP-based cluster management and monitoring
 
 ## Supported Hardware
 
-- ESP-01
+- **ESP-01** (1MB Flash)
+- **ESP-01S** (1MB Flash)
+- Other ESP8266 boards with 1MB+ flash
 
 ## Architecture
 
-### Components
+### Core Components
 
-The core architecture consists of following components:
+The system architecture consists of several key components working together:
 
-- Network Manager: WiFi connection handling
-- Cluster Manager: node discovery and memberlist management
-- API Server: HTTP API for interacting with node and cluster
-- Task Scheduler: internal scheduler used for system and user defined tasks
+- **Network Manager**: WiFi connection handling and hostname configuration
+- **Cluster Manager**: Node discovery, member list management, and health monitoring
+- **API Server**: HTTP API server with dynamic endpoint registration
+- **Task Scheduler**: Cooperative multitasking system for background operations
+- **Node Context**: Central context providing event system and shared resources
 
-### Auto Discovery
+### Auto Discovery Protocol
 
-A node periodically executes 2 tasks responsible for auto discovery:
+The cluster uses a UDP-based discovery protocol for automatic node detection:
 
-- send discovery: send UDP packet on broadcast address to discover nodes
-- listen for discovery: receive UDP packets and send response back to the node who initiated discovery
+1. **Discovery Broadcast**: Nodes periodically send UDP packets on port 4210
+2. **Response Handling**: Nodes respond with their hostname and IP address
+3. **Member Management**: Discovered nodes are automatically added to the cluster
+4. **Health Monitoring**: Continuous status checking via HTTP API calls
 
-Discovered nodes are added to the so clusters memberlist.
-Another periodic task will then call the `/api/node/status` endpoint over HTTP on each node in the memberlist to get system resources and available API endpoints.
+### Task Scheduling
 
-### Event System
+The system runs several background tasks at different intervals:
 
-The `NodeContext` implements an event system for publishing and subscribing to local and cluster wide events (TODO).
-It is used internally for communication between different components and tasks.
+- **Discovery Tasks**: Send/listen for discovery packets (1s/100ms)
+- **Status Updates**: Monitor cluster member health (1s)
+- **Heartbeat**: Maintain cluster connectivity (2s)
+- **Member Info**: Update detailed node information (10s)
+- **Debug Output**: Print cluster status (5s)
 
-## Develop
+## API Endpoints
 
-### Configuration
+### Node Management
 
-Choose one of your nodes as the API node to interact with the cluster and configure it in `.env`:
-```sh
-export API_NODE=10.0.1.x
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/api/node/status` | GET | Get system resources and API endpoints |
+| `/api/node/update` | POST | Upload and install firmware update |
+| `/api/node/restart` | POST | Restart the node |
+
+### Cluster Management
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/api/cluster/members` | GET | Get cluster membership and status |
+
+### Node Status Response
+
+```json
+{
+  "freeHeap": 12345,
+  "chipId": 12345678,
+  "sdkVersion": "2.2.2-dev(38a443e)",
+  "cpuFreqMHz": 80,
+  "flashChipSize": 1048576,
+  "api": [
+    {
+      "uri": "/api/node/status",
+      "method": "GET"
+    }
+  ]
+}
 ```
 
-### Build
+### Cluster Members Response
+
+```json
+{
+  "members": [
+    {
+      "hostname": "esp_123456",
+      "ip": "192.168.1.100",
+      "lastSeen": 1234567890,
+      "latency": 5,
+      "status": "ACTIVE",
+      "resources": {
+        "freeHeap": 12345,
+        "chipId": 12345678,
+        "sdkVersion": "2.2.2-dev(38a443e)",
+        "cpuFreqMHz": 80,
+        "flashChipSize": 1048576
+      },
+      "api": [
+        {
+          "uri": "/api/node/status",
+          "method": "GET"
+        }
+      ]
+    }
+  ]
+}
+```
+
+## Configuration
+
+### Environment Setup
+
+Create a `.env` file in your project root:
+
+```bash
+# API node IP for cluster management
+export API_NODE=192.168.1.100
+
+# WiFi credentials (optional, can be configured in code)
+export WIFI_SSID=your_network
+export WIFI_PASSWORD=your_password
+```
+
+### PlatformIO Configuration
+
+The project uses PlatformIO with the following configuration:
+
+- **Framework**: Arduino
+- **Board**: ESP-01 with 1MB flash
+- **Upload Speed**: 115200 baud
+- **Flash Mode**: DOUT (required for ESP-01S)
+
+## Development
+
+### Prerequisites
+
+- PlatformIO Core or PlatformIO IDE
+- ESP8266 development tools
+- `jq` for JSON processing in scripts
+
+### Building
 
 Build the firmware:
 
-```sh
+```bash
 ./ctl.sh build
 ```
 
-### Flash
+### Flashing
 
 Flash firmware to a connected device:
 
-```sh
+```bash
 ./ctl.sh flash
 ```
 
-### OTA
-Update one nodes:
+### Over-The-Air Updates
 
-```sh
-./ctl.sh ota update 10.0.1.x
+Update a specific node:
+
+```bash
+./ctl.sh ota update 192.168.1.100
 ```
 
-Update all nodes:
+Update all nodes in the cluster:
 
-```sh
+```bash
 ./ctl.sh ota all
-```
\ No newline at end of file
+```
+
+### Cluster Management
+
+View cluster members:
+
+```bash
+./ctl.sh cluster members
+```
+
+## Implementation Details
+
+### Event System
+
+The `NodeContext` provides an event-driven architecture:
+
+```cpp
+// Subscribe to events
+ctx.on("node_discovered", [](void* data) {
+    NodeInfo* node = static_cast<NodeInfo*>(data);
+    // Handle new node discovery
+});
+
+// Publish events
+ctx.fire("node_discovered", &newNode);
+```
+
+### Node Status Tracking
+
+Nodes are automatically categorized by their activity:
+
+- **ACTIVE**: Responding within 10 seconds
+- **INACTIVE**: No response for 10-60 seconds
+- **DEAD**: No response for over 60 seconds
+
+### Resource Monitoring
+
+Each node tracks:
+- Free heap memory
+- Chip ID and SDK version
+- CPU frequency
+- Flash chip size
+- API endpoint registry
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Discovery Failures**: Check UDP port 4210 is not blocked
+2. **WiFi Connection**: Verify SSID/password in NetworkManager
+3. **OTA Updates**: Ensure sufficient flash space (1MB minimum)
+4. **Cluster Split**: Check network connectivity between nodes
+
+### Debug Output
+
+Enable serial monitoring to see cluster activity:
+
+```bash
+pio device monitor
+```
+
+## Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Test thoroughly on ESP8266 hardware
+5. Submit a pull request
+
+## License
+
+[Add your license information here]
+
+## Acknowledgments
+
+- Built with [PlatformIO](https://platformio.org/)
+- Uses [TaskScheduler](https://github.com/arkhipenko/TaskScheduler) for cooperative multitasking
+- [ESPAsyncWebServer](https://github.com/me-no-dev/ESPAsyncWebServer) for HTTP API
+- [ArduinoJson](https://arduinojson.org/) for JSON processing
\ No newline at end of file