docs: update
252
README.md
@@ -1,81 +1,255 @@
# SPORE

> SProcket ORchestration Engine
> **S**Procket **OR**chestration **E**ngine

SPORE is a simple cluster engine for ESP8266 microcontrollers.
SPORE is a basic cluster orchestration engine for ESP8266 microcontrollers that provides automatic node discovery, health monitoring, and over-the-air updates in a distributed network environment.

## Features

- WiFi STA / AP
- auto discovery over UDP
- service registry
- pub/sub event system
- Over-The-Air updates
- **WiFi Management**: Automatic WiFi STA/AP configuration with hostname generation
- **Auto Discovery**: UDP-based node discovery with automatic cluster membership
- **Service Registry**: Dynamic API endpoint discovery and registration
- **Health Monitoring**: Real-time node status tracking with resource monitoring
- **Event System**: Local and cluster-wide event publishing/subscription
- **Over-The-Air Updates**: Seamless firmware updates across the cluster
- **RESTful API**: HTTP-based cluster management and monitoring

## Supported Hardware

- ESP-01
- **ESP-01** (1MB Flash)
- **ESP-01S** (1MB Flash)
- Other ESP8266 boards with 1MB+ flash

## Architecture

### Components
### Core Components

The core architecture consists of the following components:
The system architecture consists of several key components working together:

- Network Manager: WiFi connection handling
- Cluster Manager: node discovery and memberlist management
- API Server: HTTP API for interacting with node and cluster
- Task Scheduler: internal scheduler used for system and user-defined tasks
- **Network Manager**: WiFi connection handling and hostname configuration
- **Cluster Manager**: Node discovery, member list management, and health monitoring
- **API Server**: HTTP API server with dynamic endpoint registration
- **Task Scheduler**: Cooperative multitasking system for background operations
- **Node Context**: Central context providing event system and shared resources

### Auto Discovery
### Auto Discovery Protocol

A node periodically executes two tasks responsible for auto discovery:
The cluster uses a UDP-based discovery protocol for automatic node detection:

- send discovery: send a UDP packet to the broadcast address to discover nodes
- listen for discovery: receive UDP packets and send a response back to the node that initiated discovery
1. **Discovery Broadcast**: Nodes periodically send UDP packets on port 4210
2. **Response Handling**: Nodes respond with their hostname and IP address
3. **Member Management**: Discovered nodes are automatically added to the cluster
4. **Health Monitoring**: Continuous status checking via HTTP API calls

Discovered nodes are added to the cluster's memberlist.
Another periodic task will then call the `/api/node/status` endpoint over HTTP on each node in the memberlist to get system resources and available API endpoints.
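
The member management step described above can be sketched in plain C++. This is a minimal illustration, not the project's implementation: `Member`, `MemberList`, and `upsert` are hypothetical names, and keying the list by hostname is an assumption. It only shows the idea that a discovery response either adds a new node or refreshes an existing entry.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical member record; the fields loosely mirror the
// `/api/cluster/members` response and are assumptions.
struct Member {
    std::string hostname;
    std::string ip;
    uint64_t lastSeenMs;
};

// Hypothetical member list keyed by hostname (an assumption).
class MemberList {
public:
    // Adds a node on its first discovery response, or refreshes the IP
    // address and timestamp of a node that is already known.
    // Returns true only when the node is new to the list.
    bool upsert(const std::string& hostname, const std::string& ip, uint64_t nowMs) {
        auto it = members_.find(hostname);
        if (it == members_.end()) {
            members_.emplace(hostname, Member{hostname, ip, nowMs});
            return true;
        }
        it->second.ip = ip;
        it->second.lastSeenMs = nowMs;
        return false;
    }

    // Returns the stored record, or nullptr for an unknown hostname.
    const Member* find(const std::string& hostname) const {
        auto it = members_.find(hostname);
        return it == members_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return members_.size(); }

private:
    std::map<std::string, Member> members_;
};
```

A repeated response from the same hostname leaves the list size unchanged and only updates the stored address and timestamp, which is what a later health check would read.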

### Event System

The `NodeContext` implements an event system for publishing and subscribing to local and cluster-wide events (TODO).
It is used internally for communication between different components and tasks.

### Task Scheduling

The system runs several background tasks at different intervals:

- **Discovery Tasks**: Send/listen for discovery packets (1s/100ms)
- **Status Updates**: Monitor cluster member health (1s)
- **Heartbeat**: Maintain cluster connectivity (2s)
- **Member Info**: Update detailed node information (10s)
- **Debug Output**: Print cluster status (5s)
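
To illustrate how tasks with different intervals can share one loop, here is a small host-side sketch of interval-based cooperative scheduling. It is not the TaskScheduler library's API: `PeriodicTask`, `IntervalScheduler`, `add`, and `tick` are hypothetical names, and the caller is assumed to pass in the current uptime in milliseconds.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical periodic task: a callback plus its interval in milliseconds.
struct PeriodicTask {
    uint32_t intervalMs;
    uint32_t lastRunMs;
    std::function<void()> callback;
};

// Hypothetical cooperative scheduler; tick() is meant to be called
// repeatedly from the main loop.
class IntervalScheduler {
public:
    void add(uint32_t intervalMs, std::function<void()> callback) {
        tasks_.push_back(PeriodicTask{intervalMs, 0, std::move(callback)});
    }

    // Runs every task whose interval has elapsed since its last run.
    // A task first runs one full interval after time 0.
    void tick(uint32_t nowMs) {
        for (auto& task : tasks_) {
            // Unsigned subtraction stays correct if the counter wraps around.
            if (nowMs - task.lastRunMs >= task.intervalMs) {
                task.lastRunMs = nowMs;
                task.callback();
            }
        }
    }

private:
    std::vector<PeriodicTask> tasks_;
};
```

Registering a 2000 ms heartbeat and a 5000 ms debug task and calling `tick` once per second would run each callback only when its own interval has passed, without blocking the other tasks.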

## Develop

### Configuration

Choose one of your nodes as the API node to interact with the cluster and configure it in `.env`:

```sh
export API_NODE=10.0.1.x
```

## API Endpoints

### Node Management

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/node/status` | GET | Get system resources and API endpoints |
| `/api/node/update` | POST | Upload and install firmware update |
| `/api/node/restart` | POST | Restart the node |

### Cluster Management

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/cluster/members` | GET | Get cluster membership and status |

### Node Status Response

```json
{
  "freeHeap": 12345,
  "chipId": 12345678,
  "sdkVersion": "2.2.2-dev(38a443e)",
  "cpuFreqMHz": 80,
  "flashChipSize": 1048576,
  "api": [
    {
      "uri": "/api/node/status",
      "method": "GET"
    }
  ]
}
```
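
As a rough illustration of the resource fields in this response, the sketch below builds the same key/value pairs from a plain struct. It is an assumption-laden example, not the firmware's code (which lists ArduinoJson as its JSON library): `NodeResources` and `toJson` are hypothetical names, the `api` array is left out, and no string escaping is done, so `sdkVersion` is assumed to contain no quotes or backslashes.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical container for the resource fields of the status response.
struct NodeResources {
    uint32_t freeHeap;
    uint32_t chipId;
    std::string sdkVersion;
    uint32_t cpuFreqMHz;
    uint32_t flashChipSize;
};

// Serializes the fields in the same order as the sample response.
// The "api" array is omitted and strings are not escaped.
std::string toJson(const NodeResources& r) {
    std::ostringstream out;
    out << "{\"freeHeap\":" << r.freeHeap
        << ",\"chipId\":" << r.chipId
        << ",\"sdkVersion\":\"" << r.sdkVersion << "\""
        << ",\"cpuFreqMHz\":" << r.cpuFreqMHz
        << ",\"flashChipSize\":" << r.flashChipSize
        << "}";
    return out.str();
}
```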

### Build

### Cluster Members Response

```json
{
  "members": [
    {
      "hostname": "esp_123456",
      "ip": "192.168.1.100",
      "lastSeen": 1234567890,
      "latency": 5,
      "status": "ACTIVE",
      "resources": {
        "freeHeap": 12345,
        "chipId": 12345678,
        "sdkVersion": "2.2.2-dev(38a443e)",
        "cpuFreqMHz": 80,
        "flashChipSize": 1048576
      },
      "api": [
        {
          "uri": "/api/node/status",
          "method": "GET"
        }
      ]
    }
  ]
}
```

## Configuration

### Environment Setup

Create a `.env` file in your project root:

```bash
# API node IP for cluster management
export API_NODE=192.168.1.100

# WiFi credentials (optional, can be configured in code)
export WIFI_SSID=your_network
export WIFI_PASSWORD=your_password
```

### PlatformIO Configuration

The project uses PlatformIO with the following configuration:

- **Framework**: Arduino
- **Board**: ESP-01 with 1MB flash
- **Upload Speed**: 115200 baud
- **Flash Mode**: DOUT (required for ESP-01S)

## Development

### Prerequisites

- PlatformIO Core or PlatformIO IDE
- ESP8266 development tools
- `jq` for JSON processing in scripts

### Building

Build the firmware:

```bash
./ctl.sh build
```

### Flash
### Flashing

Flash firmware to a connected device:

```bash
./ctl.sh flash
```

### OTA

Update one node:

```sh
./ctl.sh ota update 10.0.1.x
```

### Over-The-Air Updates

Update a specific node:

```bash
./ctl.sh ota update 192.168.1.100
```

Update all nodes:
Update all nodes in the cluster:

```bash
./ctl.sh ota all
```

### Cluster Management

View cluster members:

```bash
./ctl.sh cluster members
```

## Implementation Details

### Event System

The `NodeContext` provides an event-driven architecture:

```cpp
// Subscribe to events
ctx.on("node_discovered", [](void* data) {
    NodeInfo* node = static_cast<NodeInfo*>(data);
    // Handle new node discovery
});

// Publish events
ctx.fire("node_discovered", &newNode);
```
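
For readers who want to see how an `on`/`fire` pair like the one above could work internally, here is a minimal host-side sketch. It is not the actual `NodeContext` code: `EventBus` and `Handler` are hypothetical names, and storing handlers in a map keyed by event name is an assumption.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event bus with the same on()/fire() shape as the snippet above.
class EventBus {
public:
    using Handler = std::function<void(void*)>;

    // Registers a handler for the named event.
    void on(const std::string& event, Handler handler) {
        handlers_[event].push_back(std::move(handler));
    }

    // Calls every handler registered for the event, in registration order.
    // Events without subscribers are silently ignored.
    void fire(const std::string& event, void* data) {
        auto it = handlers_.find(event);
        if (it == handlers_.end()) {
            return;
        }
        for (auto& handler : it->second) {
            handler(data);
        }
    }

private:
    std::map<std::string, std::vector<Handler>> handlers_;
};
```

Passing the payload as `void*` keeps the bus independent of any event type, at the cost that each subscriber must cast to the type the publisher actually sent.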

### Node Status Tracking

Nodes are automatically categorized by their activity:

- **ACTIVE**: Responding within 10 seconds
- **INACTIVE**: No response for 10-60 seconds
- **DEAD**: No response for over 60 seconds
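
The thresholds above can be expressed as a small classification function. This is a sketch under assumptions: `NodeStatus`, `classify`, and the two constants are hypothetical names, timestamps are assumed to be in milliseconds, and the boundary handling (exactly 10 s still counts as ACTIVE, exactly 60 s still counts as INACTIVE) is one reading of the list, not something the README specifies.

```cpp
#include <cstdint>

// Hypothetical status enum matching the three categories above.
enum class NodeStatus { ACTIVE, INACTIVE, DEAD };

// Thresholds taken from the list above, in milliseconds.
constexpr uint32_t kInactiveAfterMs = 10000;
constexpr uint32_t kDeadAfterMs = 60000;

// Classifies a node by how long it has been silent.
// Boundary choice (an assumption): the limits themselves belong to the
// "healthier" category.
inline NodeStatus classify(uint32_t nowMs, uint32_t lastSeenMs) {
    const uint32_t silentMs = nowMs - lastSeenMs;
    if (silentMs > kDeadAfterMs) {
        return NodeStatus::DEAD;
    }
    if (silentMs > kInactiveAfterMs) {
        return NodeStatus::INACTIVE;
    }
    return NodeStatus::ACTIVE;
}
```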

### Resource Monitoring

Each node tracks:

- Free heap memory
- Chip ID and SDK version
- CPU frequency
- Flash chip size
- API endpoint registry

## Troubleshooting

### Common Issues

1. **Discovery Failures**: Check UDP port 4210 is not blocked
2. **WiFi Connection**: Verify SSID/password in NetworkManager
3. **OTA Updates**: Ensure sufficient flash space (1MB minimum)
4. **Cluster Split**: Check network connectivity between nodes

### Debug Output

Enable serial monitoring to see cluster activity:

```bash
pio device monitor
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly on ESP8266 hardware
5. Submit a pull request

## License

[Add your license information here]

## Acknowledgments

- Built with [PlatformIO](https://platformio.org/)
- Uses [TaskScheduler](https://github.com/arkhipenko/TaskScheduler) for cooperative multitasking
- [ESPAsyncWebServer](https://github.com/me-no-dev/ESPAsyncWebServer) for HTTP API
- [ArduinoJson](https://arduinojson.org/) for JSON processing