feat: task manager endpoint, updated documentation

2025-08-22 15:47:08 +02:00
parent d7d307e3ce
commit 30a5f8b8cb
14 changed files with 2550 additions and 551 deletions

279
docs/API.md Normal file

@@ -0,0 +1,279 @@
# SPORE API Documentation
The SPORE system provides a comprehensive RESTful API for monitoring and controlling the embedded device. All endpoints return JSON responses and support standard HTTP status codes.
## Quick Reference
### Task Management API
| Endpoint | Method | Description | Parameters | Response |
|----------|--------|-------------|------------|----------|
| `/api/tasks/status` | GET | Get comprehensive status of all tasks and system information | None | Task status overview with system metrics |
| `/api/tasks/control` | POST | Control individual task operations | `task`, `action` | Operation result with task details |
### System Status API
| Endpoint | Method | Description | Response |
|----------|--------|-------------|----------|
| `/api/node/status` | GET | System resource information and API endpoint registry | System metrics and API catalog |
| `/api/cluster/members` | GET | Cluster membership and node health information | Cluster topology and health status |
| `/api/node/update` | POST | Handle firmware updates via OTA | Update progress and status |
| `/api/node/restart` | POST | Trigger system restart | Restart confirmation |
## Detailed API Reference
### Task Management
#### GET /api/tasks/status
Returns comprehensive status information for all registered tasks, including system resource metrics and task execution details.
**Response Fields:**
| Field | Type | Description |
|-------|------|-------------|
| `summary.totalTasks` | integer | Total number of registered tasks |
| `summary.activeTasks` | integer | Number of currently enabled tasks |
| `tasks[].name` | string | Unique task identifier |
| `tasks[].interval` | integer | Execution frequency in milliseconds |
| `tasks[].enabled` | boolean | Whether task is currently enabled |
| `tasks[].running` | boolean | Whether task is actively executing |
| `tasks[].autoStart` | boolean | Whether task starts automatically |
| `system.freeHeap` | integer | Available RAM in bytes |
| `system.uptime` | integer | System uptime in milliseconds |
**Example Response:**
```json
{
  "summary": {
    "totalTasks": 6,
    "activeTasks": 5
  },
  "tasks": [
    {
      "name": "discovery_send",
      "interval": 1000,
      "enabled": true,
      "running": true,
      "autoStart": true
    }
  ],
  "system": {
    "freeHeap": 48748,
    "uptime": 12345
  }
}
```
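As a minimal sketch of consuming this response, the Python snippet below parses the example payload and derives a one-line summary plus a list of tasks that are enabled but not executing. The helper names `summarize` and `idle_tasks` are illustrative and not part of SPORE; only the field names come from the table above.

```python
import json

# Example payload copied from the documented response.
SAMPLE = """
{
  "summary": {"totalTasks": 6, "activeTasks": 5},
  "tasks": [
    {"name": "discovery_send", "interval": 1000,
     "enabled": true, "running": true, "autoStart": true}
  ],
  "system": {"freeHeap": 48748, "uptime": 12345}
}
"""


def summarize(status):
    """Build a one-line summary from the documented fields."""
    summary = status["summary"]
    heap_kib = status["system"]["freeHeap"] / 1024
    return (f"{summary['activeTasks']}/{summary['totalTasks']} tasks active, "
            f"{heap_kib:.1f} KiB free")


def idle_tasks(status):
    """Return names of tasks that are enabled but not executing right now."""
    return [task["name"] for task in status["tasks"]
            if task["enabled"] and not task["running"]]


status = json.loads(SAMPLE)
print(summarize(status))
print(idle_tasks(status))
```

In a real client the `SAMPLE` string would be replaced by the body returned from `GET /api/tasks/status`.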
#### POST /api/tasks/control
Controls the execution state of individual tasks. Supports enabling, disabling, starting, stopping, and getting detailed status for specific tasks.
**Parameters:**
- `task` (required): Name of the task to control
- `action` (required): Action to perform
**Available Actions:**
| Action | Description | Use Case |
|--------|-------------|----------|
| `enable` | Enable a disabled task | Resume background operations |
| `disable` | Disable a running task | Pause resource-intensive tasks |
| `start` | Start a stopped task | Begin task execution |
| `stop` | Stop a running task | Halt task execution |
| `status` | Get detailed status for a specific task | Monitor individual task health |
**Example Response:**
```json
{
  "success": true,
  "message": "Task enabled",
  "task": "heartbeat",
  "action": "enable"
}
```
**Task Status Response:**
```json
{
  "success": true,
  "message": "Task status retrieved",
  "task": "discovery_send",
  "action": "status",
  "taskDetails": {
    "name": "discovery_send",
    "enabled": true,
    "running": true,
    "interval": 1000,
    "system": {
      "freeHeap": 48748,
      "uptime": 12345
    }
  }
}
```
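A client may want to reject unsupported actions before sending anything to the device. The sketch below builds the form-encoded request body for `POST /api/tasks/control` and validates the action against the table above. The function name `control_body` is hypothetical, and form encoding is assumed because that is what the `curl -d` examples in this document send.

```python
from urllib.parse import urlencode

# Actions listed in the "Available Actions" table.
VALID_ACTIONS = {"enable", "disable", "start", "stop", "status"}


def control_body(task, action):
    """Return the form-encoded body for POST /api/tasks/control."""
    if not task:
        raise ValueError("task name is required")
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return urlencode({"task": task, "action": action})


print(control_body("heartbeat", "disable"))
```

Validating on the client side avoids a round trip that would presumably end in a `400 Bad Request`.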
### System Status
#### GET /api/node/status
Returns comprehensive system resource information including memory usage, chip details, and a registry of all available API endpoints.
**Response Fields:**
- `freeHeap`: Available RAM in bytes
- `chipId`: ESP8266 chip ID
- `sdkVersion`: ESP8266 SDK version
- `cpuFreqMHz`: CPU frequency in MHz
- `flashChipSize`: Flash chip size in bytes
- `api`: Array of registered API endpoints
#### GET /api/cluster/members
Returns information about all nodes in the cluster, including their health status, resources, and API endpoints.
**Response Fields:**
- `members[]`: Array of cluster node information
  - `hostname`: Node hostname
  - `ip`: Node IP address
  - `lastSeen`: Timestamp of last communication
  - `latency`: Network latency in milliseconds
  - `status`: Node health status (ACTIVE, INACTIVE, DEAD)
  - `resources`: System resource information
  - `api`: Available API endpoints
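To illustrate how these fields might be used, the following sketch counts members per status and lists the nodes that are not `ACTIVE`. The payload is made up for the example (hostnames, IPs, and latencies are hypothetical), and it assumes the per-node fields sit inside each `members[]` entry.

```python
from collections import Counter

# Hypothetical payload; only the field names come from the list above.
SAMPLE = {
    "members": [
        {"hostname": "node-a", "ip": "10.0.1.60", "status": "ACTIVE", "latency": 12},
        {"hostname": "node-b", "ip": "10.0.1.61", "status": "ACTIVE", "latency": 40},
        {"hostname": "node-c", "ip": "10.0.1.62", "status": "DEAD", "latency": 0},
    ]
}


def status_counts(payload):
    """Count cluster members per health status."""
    return Counter(member["status"] for member in payload["members"])


def unhealthy(payload):
    """Return hostnames of members that are not ACTIVE."""
    return [member["hostname"] for member in payload["members"]
            if member["status"] != "ACTIVE"]


print(dict(status_counts(SAMPLE)))
print(unhealthy(SAMPLE))
```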
### System Management
#### POST /api/node/update
Initiates an over-the-air firmware update. The firmware file should be uploaded as multipart/form-data.
**Parameters:**
- `firmware`: Firmware binary file (.bin)
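The upload format can be sketched with the Python standard library alone. The example below encodes a single file as `multipart/form-data` under the `firmware` field and posts it to the endpoint. Both function names are hypothetical, the `application/octet-stream` part type and the 60-second timeout are assumptions, and the device's actual response handling is not covered here.

```python
import urllib.request
import uuid


def build_multipart(field, filename, payload, boundary=None):
    """Encode one file as multipart/form-data; returns (content_type, body)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + payload + tail


def upload_firmware(host, firmware_path):
    """POST a firmware image to /api/node/update and return the HTTP status."""
    with open(firmware_path, "rb") as handle:
        content_type, body = build_multipart("firmware", "firmware.bin", handle.read())
    request = urllib.request.Request(
        f"http://{host}/api/node/update",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status


# Encoding only; no network access happens here.
content_type, body = build_multipart("firmware", "firmware.bin", b"\x00\x01", boundary="demo")
```

A call such as `upload_firmware("10.0.1.60", ".pio/build/esp01_1m/firmware.bin")` would then perform the actual update.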
#### POST /api/node/restart
Triggers a system restart. The response will be sent before the restart occurs.
## HTTP Status Codes
| Code | Description | Use Case |
|------|-------------|----------|
| 200 | Success | Operation completed successfully |
| 400 | Bad Request | Invalid parameters or action |
| 404 | Not Found | Task or endpoint not found |
| 500 | Internal Server Error | System error occurred |
## OpenAPI Specification
A complete OpenAPI 3.0 specification is available in the [`api/`](../api/) folder. This specification can be used to:
- Generate client libraries in multiple programming languages
- Create interactive API documentation
- Validate API requests and responses
- Generate mock servers for testing
- Integrate with API management platforms
See [`api/README.md`](../api/README.md) for detailed usage instructions.
## Usage Examples
### Basic Task Status Check
```bash
curl -s http://10.0.1.60/api/tasks/status | jq '.'
```
### Task Control
```bash
# Disable a task
curl -X POST http://10.0.1.60/api/tasks/control \
-d "task=heartbeat&action=disable"
# Get detailed status
curl -X POST http://10.0.1.60/api/tasks/control \
-d "task=discovery_send&action=status"
```
### System Monitoring
```bash
# Check system resources
curl -s http://10.0.1.60/api/node/status | jq '.freeHeap'
# Monitor cluster health
curl -s http://10.0.1.60/api/cluster/members | jq '.members[].status'
```
## Integration Examples
### Python Client
```python
import requests
# Get task status
response = requests.get('http://10.0.1.60/api/tasks/status')
tasks = response.json()
# Check active tasks
active_count = tasks['summary']['activeTasks']
print(f"Active tasks: {active_count}")
# Control a task
control_data = {'task': 'heartbeat', 'action': 'disable'}
response = requests.post('http://10.0.1.60/api/tasks/control', data=control_data)
```
### JavaScript Client
```javascript
// Get task status
fetch('http://10.0.1.60/api/tasks/status')
.then(response => response.json())
.then(data => {
console.log(`Total tasks: ${data.summary.totalTasks}`);
console.log(`Active tasks: ${data.summary.activeTasks}`);
});
// Control a task
fetch('http://10.0.1.60/api/tasks/control', {
method: 'POST',
headers: {'Content-Type': 'application/x-www-form-urlencoded'},
body: 'task=heartbeat&action=disable'
});
```
## Task Management Examples
### Monitoring Task Health
```bash
# Check overall task status
curl -s http://10.0.1.60/api/tasks/status | jq '.'
# Monitor specific task
curl -s -X POST http://10.0.1.60/api/tasks/control \
-d "task=heartbeat&action=status" | jq '.'
# Watch for low memory conditions
watch -n 5 'curl -s http://10.0.1.60/api/tasks/status | jq ".system.freeHeap"'
```
### Task Control Workflows
```bash
# Temporarily disable discovery to reduce network traffic
curl -X POST http://10.0.1.60/api/tasks/control \
-d "task=discovery_send&action=disable"
# Check if it's disabled
curl -s -X POST http://10.0.1.60/api/tasks/control \
-d "task=discovery_send&action=status" | jq '.taskDetails.enabled'
# Re-enable when needed
curl -X POST http://10.0.1.60/api/tasks/control \
-d "task=discovery_send&action=enable"
```
### Cluster Health Monitoring
```bash
# Monitor all nodes in cluster
for ip in 10.0.1.60 10.0.1.61 10.0.1.62; do
echo "=== Node $ip ==="
curl -s "http://$ip/api/tasks/status" | jq '.summary'
done
```

358
docs/Architecture.md Normal file

@@ -0,0 +1,358 @@
# SPORE Architecture & Implementation
## System Overview
SPORE (SProcket ORchestration Engine) is a cluster engine for ESP8266 microcontrollers that provides automatic node discovery, health monitoring, and over-the-air updates in a distributed network environment.
## Core Components
The system architecture consists of several key components working together:
### Network Manager
- **WiFi Connection Handling**: Automatic WiFi STA/AP configuration
- **Hostname Configuration**: MAC-based hostname generation
- **Fallback Management**: Automatic access point creation if WiFi connection fails
### Cluster Manager
- **Node Discovery**: UDP-based automatic node detection
- **Member List Management**: Dynamic cluster membership tracking
- **Health Monitoring**: Continuous node status checking
- **Resource Tracking**: Monitor node resources and capabilities
### API Server
- **HTTP API Server**: RESTful API for cluster management
- **Dynamic Endpoint Registration**: Automatic API endpoint discovery
- **Service Registry**: Track available services across the cluster
### Task Scheduler
- **Cooperative Multitasking**: Background task management system
- **Task Lifecycle Management**: Automatic task execution and monitoring
- **Resource Optimization**: Efficient task scheduling and execution
### Node Context
- **Central Context**: Shared resources and configuration
- **Event System**: Local and cluster-wide event publishing/subscription
- **Resource Management**: Centralized resource allocation and monitoring
## Auto Discovery Protocol
The cluster uses a UDP-based discovery protocol for automatic node detection:
### Discovery Process
1. **Discovery Broadcast**: Nodes periodically send UDP packets on port 4210
2. **Response Handling**: Nodes respond with their hostname and IP address
3. **Member Management**: Discovered nodes are automatically added to the cluster
4. **Health Monitoring**: Continuous status checking via HTTP API calls
### Protocol Details
- **UDP Port**: 4210 (configurable)
- **Discovery Message**: `CLUSTER_DISCOVERY`
- **Response Message**: `CLUSTER_RESPONSE`
- **Broadcast Address**: 255.255.255.255
- **Discovery Interval**: 1 second (configurable)
- **Listen Interval**: 100ms (configurable)
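For testing from a development machine, the discovery broadcast can be approximated in a few lines of Python. This is a sketch under assumptions: it treats the payload as the bare ASCII string `CLUSTER_DISCOVERY` with no padding or extra fields, and the function name `send_discovery` is illustrative. The firmware's actual packet layout may differ.

```python
import socket

DISCOVERY_PORT = 4210
DISCOVERY_MESSAGE = b"CLUSTER_DISCOVERY"
BROADCAST_ADDRESS = "255.255.255.255"


def send_discovery(address=(BROADCAST_ADDRESS, DISCOVERY_PORT)):
    """Send one discovery packet; returns the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcast destinations are rejected unless SO_BROADCAST is set.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        return sock.sendto(DISCOVERY_MESSAGE, address)
```

Calling `send_discovery()` with no arguments targets the broadcast address and port listed above; passing a unicast address probes a single node.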
### Node Status Categories
Nodes are automatically categorized by their activity:
- **ACTIVE**: Last response received less than 10 seconds ago
- **INACTIVE**: No response for 10-60 seconds
- **DEAD**: No response for more than 60 seconds
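The categorization reduces to two thresholds. The sketch below expresses it as a pure function. How the firmware treats the exact boundaries is not stated here, so the example assumes that exactly 10 seconds and exactly 60 seconds both count as `INACTIVE`.

```python
ACTIVE_LIMIT_S = 10    # responses newer than this keep a node ACTIVE
INACTIVE_LIMIT_S = 60  # silence beyond this marks a node DEAD


def classify(seconds_since_last_response):
    """Map the time since a node's last response to a status category."""
    if seconds_since_last_response < ACTIVE_LIMIT_S:
        return "ACTIVE"
    if seconds_since_last_response <= INACTIVE_LIMIT_S:
        return "INACTIVE"
    return "DEAD"


for age in (3, 30, 61):
    print(age, classify(age))
```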
## Task Scheduling System
The system runs several background tasks at different intervals:
### Core System Tasks
| Task | Interval | Purpose |
|------|----------|---------|
| **Discovery Send** | 1 second | Send UDP discovery packets |
| **Discovery Listen** | 100ms | Listen for discovery responses |
| **Status Updates** | 1 second | Monitor cluster member health |
| **Heartbeat** | 2 seconds | Maintain cluster connectivity |
| **Member Info** | 10 seconds | Update detailed node information |
| **Debug Output** | 5 seconds | Print cluster status |
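The interval-based execution in the table can be modelled with a small cooperative scheduler. The sketch below is a simplified Python model driven by a simulated clock, not the TaskScheduler library that the firmware uses. The `Task` and `Scheduler` classes are hypothetical.

```python
class Task:
    def __init__(self, name, interval_ms, callback, enabled=True):
        self.name = name
        self.interval_ms = interval_ms
        self.callback = callback
        self.enabled = enabled
        self.next_run_ms = 0  # due immediately on the first tick


class Scheduler:
    def __init__(self):
        self.tasks = []

    def add(self, task):
        self.tasks.append(task)

    def tick(self, now_ms):
        """Run every enabled task whose deadline has passed."""
        for task in self.tasks:
            if task.enabled and now_ms >= task.next_run_ms:
                task.callback()
                task.next_run_ms = now_ms + task.interval_ms


runs = {"discovery_send": 0, "heartbeat": 0}


def counter(name):
    """Return a callback that counts executions of the named task."""
    def bump():
        runs[name] += 1
    return bump


scheduler = Scheduler()
scheduler.add(Task("discovery_send", 1000, counter("discovery_send")))
scheduler.add(Task("heartbeat", 2000, counter("heartbeat")))

# Simulate 10 seconds of main-loop iterations at 100 ms granularity.
for now_ms in range(0, 10_000, 100):
    scheduler.tick(now_ms)
print(runs)
```

Because each task only runs when the loop calls `tick`, a slow callback delays every other task, which is the main trade-off of cooperative multitasking.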
### Task Management Features
- **Dynamic Intervals**: Change execution frequency on-the-fly
- **Runtime Control**: Enable/disable tasks without restart
- **Status Monitoring**: Real-time task health tracking
- **Resource Integration**: View task status with system resources
## Event System
The `NodeContext` provides an event-driven architecture for system-wide communication:
### Event Subscription
```cpp
// Subscribe to events
ctx.on("node_discovered", [](void* data) {
NodeInfo* node = static_cast<NodeInfo*>(data);
// Handle new node discovery
});
ctx.on("cluster_updated", [](void* data) {
// Handle cluster membership changes
});
```
### Event Publishing
```cpp
// Publish events
ctx.fire("node_discovered", &newNode);
ctx.fire("cluster_updated", &clusterData);
```
### Available Events
- **`node_discovered`**: New node added to cluster
- **`cluster_updated`**: Cluster membership changed
- **`resource_update`**: Node resources updated
- **`health_check`**: Node health status changed
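For readers less familiar with the subscribe/publish pattern, here is a language-neutral model of it in Python. The `EventBus` class is a hypothetical stand-in for `NodeContext`; only the `on`/`fire` method names and the event names are taken from this document.

```python
class EventBus:
    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        """Register a handler for the named event."""
        self._handlers.setdefault(event, []).append(handler)

    def fire(self, event, data=None):
        """Call every handler registered for the event, in registration order."""
        for handler in list(self._handlers.get(event, [])):
            handler(data)


bus = EventBus()
seen = []
bus.on("node_discovered", lambda node: seen.append(node["hostname"]))

bus.fire("node_discovered", {"hostname": "node-a"})
bus.fire("cluster_updated")  # no subscribers, so nothing happens
print(seen)
```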
## Resource Monitoring
Each node tracks comprehensive system resources:
### System Resources
- **Free Heap Memory**: Available RAM in bytes
- **Chip ID**: Unique ESP8266 identifier
- **SDK Version**: ESP8266 firmware version
- **CPU Frequency**: Operating frequency in MHz
- **Flash Chip Size**: Total flash storage in bytes
### API Endpoint Registry
- **Dynamic Discovery**: Automatically detect available endpoints
- **Method Information**: HTTP method (GET, POST, etc.)
- **Service Catalog**: Complete service registry across cluster
### Health Metrics
- **Response Time**: API response latency
- **Uptime**: System uptime in milliseconds
- **Connection Status**: Network connectivity health
- **Resource Utilization**: Memory and CPU usage
## WiFi Fallback System
The system includes automatic WiFi fallback for robust operation:
### Fallback Process
1. **Primary Connection**: Attempts to connect to configured WiFi network
2. **Connection Failure**: If connection fails, creates an access point
3. **Hostname Generation**: Automatically generates hostname from MAC address
4. **Service Continuity**: Maintains cluster functionality in fallback mode
### Configuration
- **SSID Format**: `SPORE_<MAC_LAST_4>`
- **Password**: Configurable fallback password
- **IP Range**: 192.168.4.x subnet
- **Gateway**: 192.168.4.1
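The SSID format can be illustrated with a short helper. This sketch assumes that `<MAC_LAST_4>` means the last four hexadecimal digits of the MAC address in upper case; the firmware may format it differently, and `fallback_ssid` is a hypothetical name.

```python
def fallback_ssid(mac):
    """Derive SPORE_<MAC_LAST_4> from a MAC such as '5C:CF:7F:12:AB:CD'."""
    digits = "".join(ch for ch in mac if ch.isalnum()).upper()
    if len(digits) != 12 or any(ch not in "0123456789ABCDEF" for ch in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return f"SPORE_{digits[-4:]}"


print(fallback_ssid("5c:cf:7f:12:ab:cd"))
```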
## Cluster Topology
### Node Types
- **Master Node**: Primary cluster coordinator (if applicable)
- **Worker Nodes**: Standard cluster members
- **Edge Nodes**: Network edge devices
### Network Architecture
- **Mesh-like Structure**: Nodes can communicate with each other
- **Dynamic Routing**: Automatic path discovery between nodes
- **Load Distribution**: Tasks distributed across available nodes
- **Fault Tolerance**: Automatic failover and recovery
## Data Flow
### Discovery Flow
```
Node A → UDP Broadcast → Node B
Node B → UDP Response → Node A
Node A → Add to Cluster → Update Member List
```
### Health Monitoring Flow
```
Cluster Manager → HTTP Request → Node Status
Node → JSON Response → Resource Information
Cluster Manager → Update Health → Fire Events
```
### Task Execution Flow
```
Task Scheduler → Check Intervals → Execute Tasks
Task → Update Status → API Server
API Server → JSON Response → Client
```
## Performance Characteristics
### Memory Usage
- **Base System**: ~15-20KB RAM
- **Per Task**: ~100-200 bytes per task
- **Cluster Members**: ~50-100 bytes per member
- **API Endpoints**: ~20-30 bytes per endpoint
### Network Overhead
- **Discovery Packets**: 64 bytes every 1 second
- **Health Checks**: ~200-500 bytes every 1 second
- **Status Updates**: ~1-2KB per node
- **API Responses**: Varies by endpoint (typically 100B-5KB)
### Processing Overhead
- **Task Execution**: Minimal overhead per task
- **Event Processing**: Fast event dispatch
- **JSON Parsing**: Efficient ArduinoJson usage
- **Network I/O**: Asynchronous operations
## Security Considerations
### Current Implementation
- **Network Access**: Local network only (no internet exposure)
- **Authentication**: None currently implemented
- **Data Validation**: Basic input validation
- **Resource Limits**: Memory and processing constraints
### Future Enhancements
- **TLS/SSL**: Encrypted communications
- **API Keys**: Authentication for API access
- **Access Control**: Role-based permissions
- **Audit Logging**: Security event tracking
## Scalability
### Cluster Size Limits
- **Theoretical**: Up to 254 nodes (usable host addresses in a /24 subnet)
- **Practical**: 20-50 nodes for optimal performance
- **Memory Constraint**: ~8KB available for member tracking
- **Network Constraint**: UDP packet size limits
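The memory constraint can be turned into a rough capacity estimate by combining it with the per-member cost quoted under Memory Usage. The calculation below is a back-of-the-envelope sketch that uses only figures from this document and treats "~8KB" as 8192 bytes.

```python
MEMBER_BUDGET_BYTES = 8 * 1024  # "~8KB available for member tracking"
BYTES_PER_MEMBER = (50, 100)    # "~50-100 bytes per member"


def member_capacity(budget=MEMBER_BUDGET_BYTES, per_member=BYTES_PER_MEMBER):
    """Return (worst_case, best_case) member counts that fit in the budget."""
    low_cost, high_cost = per_member
    return budget // high_cost, budget // low_cost


worst, best = member_capacity()
print(f"member tracking fits roughly {worst} to {best} nodes")
```

By these figures the memory budget alone would allow more members than the 20-50 node practical range, which suggests that network and health-check overhead is the tighter limit.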
### Performance Scaling
- **Linear Scaling**: Most operations scale linearly with node count
- **Discovery Overhead**: Increases with cluster size
- **Health Monitoring**: Parallel HTTP requests
- **Task Management**: Independent per-node execution
## Configuration Management
### Environment Variables
```bash
# API node IP for cluster management
export API_NODE=192.168.1.100
# Cluster configuration
export CLUSTER_PORT=4210
export DISCOVERY_INTERVAL=1000
export HEALTH_CHECK_INTERVAL=1000
```
### PlatformIO Configuration
The project uses PlatformIO with the following configuration:
- **Framework**: Arduino
- **Board**: ESP-01 with 1MB flash
- **Upload Speed**: 115200 baud
- **Flash Mode**: DOUT (required for ESP-01S)
### Dependencies
The project requires the following libraries:
- `esp32async/ESPAsyncWebServer@^3.8.0` - HTTP API server
- `bblanchon/ArduinoJson@^7.4.2` - JSON processing
- `arkhipenko/TaskScheduler@^3.8.5` - Cooperative multitasking
## Development Workflow
### Building
Build the firmware for a specific chip:
```bash
./ctl.sh build target esp01_1m
```
### Flashing
Flash firmware to a connected device:
```bash
./ctl.sh flash target esp01_1m
```
### Over-The-Air Updates
Update a specific node:
```bash
./ctl.sh ota update 192.168.1.100 esp01_1m
```
Update all nodes in the cluster:
```bash
./ctl.sh ota all esp01_1m
```
### Cluster Management
View cluster members:
```bash
./ctl.sh cluster members
```
## Troubleshooting
### Common Issues
1. **Discovery Failures**: Check UDP port 4210 is not blocked
2. **WiFi Connection**: Verify SSID/password in Config.cpp
3. **OTA Updates**: Ensure sufficient flash space (1MB minimum)
4. **Cluster Split**: Check network connectivity between nodes
### Debug Output
Enable serial monitoring to see cluster activity:
```bash
pio device monitor
```
### Performance Monitoring
- **Memory Usage**: Monitor free heap with `/api/node/status`
- **Task Health**: Check task status with `/api/tasks/status`
- **Cluster Health**: Monitor member status with `/api/cluster/members`
- **Network Latency**: Track response times in cluster data
## Related Documentation
- **[Task Management](./TaskManagement.md)** - Background task system
- **[API Reference](./API.md)** - REST API documentation
- **[TaskManager API](./TaskManager.md)** - TaskManager class reference
- **[OpenAPI Specification](../api/)** - Machine-readable API specification

437
docs/Development.md Normal file

@@ -0,0 +1,437 @@
# Development & Deployment Guide
## Prerequisites
### Required Tools
- **PlatformIO Core** or **PlatformIO IDE**
- **ESP8266 development tools**
- **`jq`** for JSON processing in scripts
- **Git** for version control
### System Requirements
- **Operating System**: Linux, macOS, or Windows
- **Python**: 3.7+ (for PlatformIO)
- **Memory**: 4GB+ RAM recommended
- **Storage**: 2GB+ free space for development environment
## Project Structure
```
spore/
├── src/ # Source code
│ ├── main.cpp # Main application entry point
│ ├── ApiServer.cpp # HTTP API server implementation
│ ├── ClusterManager.cpp # Cluster management logic
│ ├── NetworkManager.cpp # WiFi and network handling
│ ├── TaskManager.cpp # Background task management
│ └── NodeContext.cpp # Central context and events
├── include/ # Header files
├── lib/ # Library files
├── docs/ # Documentation
├── api/ # OpenAPI specification
├── examples/ # Example code
├── test/ # Test files
├── platformio.ini # PlatformIO configuration
└── ctl.sh # Build and deployment scripts
```
## PlatformIO Configuration
### Framework and Board
The project uses PlatformIO with the following configuration:
```ini
[env:esp01_1m]
platform = platformio/espressif8266@^4.2.1
board = esp01_1m
framework = arduino
upload_speed = 115200
flash_mode = dout
```
### Key Configuration Details
- **Framework**: Arduino
- **Board**: ESP-01 with 1MB flash
- **Upload Speed**: 115200 baud
- **Flash Mode**: DOUT (required for ESP-01S)
- **Build Type**: Release (optimized for production)
### Dependencies
The project requires the following libraries:
```ini
lib_deps =
    esp32async/ESPAsyncWebServer@^3.8.0
    bblanchon/ArduinoJson@^7.4.2
    arkhipenko/TaskScheduler@^3.8.5
    ESP8266HTTPClient@1.2
    ESP8266WiFi@1.0
```
## Building
### Basic Build Commands
Build the firmware for a specific chip:
```bash
# Build for ESP-01 1MB
./ctl.sh build target esp01_1m
# Build for D1 Mini
./ctl.sh build target d1_mini
# Build with verbose output
pio run -v
```
### Build Targets
Available build targets:
| Target | Description | Flash Size |
|--------|-------------|------------|
| `esp01_1m` | ESP-01 with 1MB flash | 1MB |
| `d1_mini` | D1 Mini with 4MB flash | 4MB |
### Build Artifacts
After successful build:
- **Firmware**: `.pio/build/{target}/firmware.bin`
- **ELF File**: `.pio/build/{target}/firmware.elf`
- **Map File**: `.pio/build/{target}/firmware.map`
## Flashing
### Direct USB Flashing
Flash firmware to a connected device:
```bash
# Flash ESP-01
./ctl.sh flash target esp01_1m
# Flash D1 Mini
./ctl.sh flash target d1_mini
# Manual flash command
pio run --target upload
```
### Flash Settings
- **Upload Speed**: 115200 baud (optimal for ESP-01)
- **Flash Mode**: DOUT (required for ESP-01S)
- **Reset Method**: Hardware reset or manual reset
### Troubleshooting Flashing
Common flashing issues:
1. **Connection Failed**: Check USB cable and drivers
2. **Wrong Upload Speed**: Try lower speeds (9600, 57600)
3. **Flash Mode Error**: Ensure DOUT mode for ESP-01S
4. **Permission Denied**: On Linux, add your user to the `dialout` group (preferred) or run the command with `sudo`
## Over-The-Air Updates
### Single Node Update
Update a specific node:
```bash
# Update specific node
./ctl.sh ota update 192.168.1.100 esp01_1m
# Update with custom firmware
./ctl.sh ota update 192.168.1.100 esp01_1m custom_firmware.bin
```
### Cluster-Wide Updates
Update all nodes in the cluster:
```bash
# Update all nodes
./ctl.sh ota all esp01_1m
```
### OTA Process
1. **Firmware Upload**: Send firmware to target node
2. **Verification**: Check firmware integrity
3. **Installation**: Install new firmware
4. **Restart**: Node restarts with new firmware
5. **Verification**: Confirm successful update
### OTA Requirements
- **Flash Space**: Minimum 1MB for OTA updates
- **Network**: Stable WiFi connection
- **Power**: Stable power supply during update
- **Memory**: Sufficient RAM for firmware processing
## Cluster Management
### View Cluster Status
```bash
# View all cluster members
./ctl.sh cluster members
# View specific node details
./ctl.sh cluster members --node 192.168.1.100
```
### Cluster Commands
Available cluster management commands:
| Command | Description |
|---------|-------------|
| `members` | List all cluster members |
| `status` | Show cluster health status |
| `discover` | Force discovery process |
| `health` | Check cluster member health |
### Cluster Monitoring
Monitor cluster health in real-time:
```bash
# Watch cluster status
watch -n 5 './ctl.sh cluster members'
# Monitor specific metrics
./ctl.sh cluster members | jq '.members[] | {hostname, status, latency}'
```
## Development Workflow
### Local Development
1. **Setup Environment**:
```bash
git clone <repository>
cd spore
pio run
```
2. **Make Changes**:
- Edit source files in `src/`
- Modify headers in `include/`
- Update configuration in `platformio.ini`
3. **Test Changes**:
```bash
pio run
pio check
```
### Testing
Run various tests:
```bash
# Code quality check
pio check
# Unit tests (if available)
pio test
# Memory usage analysis
pio run --target size
```
### Debugging
Enable debug output:
```bash
# Serial monitoring
pio device monitor
# Build with the DEBUG macro defined (extra build flags are passed via the environment)
PLATFORMIO_BUILD_FLAGS="-DDEBUG" pio run --environment esp01_1m
```
## Configuration Management
### Environment Setup
Create a `.env` file in your project root:
```bash
# API node IP for cluster management
export API_NODE=192.168.1.100
```
### Configuration Files
Key configuration files:
- **`platformio.ini`**: Build and upload configuration
- **`src/Config.cpp`**: Application configuration
- **`.env`**: Environment variables
- **`ctl.sh`**: Build and deployment scripts
### Configuration Options
Available configuration options:
| Option | Default | Description |
|--------|---------|-------------|
| `CLUSTER_PORT` | 4210 | UDP discovery port |
| `DISCOVERY_INTERVAL` | 1000 | Discovery packet interval (ms) |
| `HEALTH_CHECK_INTERVAL` | 1000 | Health check interval (ms) |
| `API_SERVER_PORT` | 80 | HTTP API server port |
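Host-side tooling that needs the same values could read them from the environment and fall back to the defaults in the table. This is a sketch only. Whether the firmware or `ctl.sh` reads these options from environment variables is an assumption, and `load_config` is a hypothetical helper.

```python
import os

# Defaults copied from the configuration options table.
DEFAULTS = {
    "CLUSTER_PORT": 4210,
    "DISCOVERY_INTERVAL": 1000,
    "HEALTH_CHECK_INTERVAL": 1000,
    "API_SERVER_PORT": 80,
}


def load_config(env=os.environ):
    """Return all options as integers, using defaults for unset or empty values."""
    config = {}
    for key, default in DEFAULTS.items():
        raw = env.get(key)
        config[key] = int(raw) if raw not in (None, "") else default
    return config


print(load_config({"CLUSTER_PORT": "5000"}))
```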
## Deployment Strategies
### Development Deployment
For development and testing:
1. **Build**: `pio run`
2. **Flash**: `pio run --target upload`
3. **Monitor**: `pio device monitor`
### Production Deployment
For production systems:
1. **Build Release**: `pio run --environment esp01_1m`
2. **OTA Update**: `./ctl.sh ota update <ip> esp01_1m`
3. **Verify**: Check node status via API
### Continuous Integration
Automated deployment pipeline:
```yaml
# Example GitHub Actions workflow
- name: Build Firmware
  run: pio run --environment esp01_1m
- name: Deploy to Test Cluster
  run: ./ctl.sh ota all esp01_1m --target test
- name: Deploy to Production
  run: ./ctl.sh ota all esp01_1m --target production
```
## Monitoring and Debugging
### Serial Output
Enable serial monitoring:
```bash
# Basic monitoring
pio device monitor
# With specific baud rate
pio device monitor --baud 115200
# Filter specific messages
pio device monitor | grep "Cluster"
```
### API Monitoring
Monitor system via HTTP API:
```bash
# Check system status
curl -s http://192.168.1.100/api/node/status | jq '.'
# Monitor tasks
curl -s http://192.168.1.100/api/tasks/status | jq '.'
# Check cluster health
curl -s http://192.168.1.100/api/cluster/members | jq '.'
```
### Performance Monitoring
Track system performance:
```bash
# Memory usage over time
watch -n 5 'curl -s http://192.168.1.100/api/node/status | jq ".freeHeap"'
# Task execution status
watch -n 10 'curl -s http://192.168.1.100/api/tasks/status | jq ".summary"'
```
## Troubleshooting
### Common Issues
1. **Discovery Failures**: Check UDP port 4210 is not blocked
2. **WiFi Connection**: Verify SSID/password in Config.cpp
3. **OTA Updates**: Ensure sufficient flash space (1MB minimum)
4. **Cluster Split**: Check network connectivity between nodes
### Debug Commands
Useful debugging commands:
```bash
# Check network connectivity
ping 192.168.1.100
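# Send a discovery packet to the UDP port (waits up to 1 second and prints any reply)
printf 'CLUSTER_DISCOVERY' | nc -u -w 1 192.168.1.100 4210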
# Check HTTP API
curl -v http://192.168.1.100/api/node/status
# Monitor system resources
./ctl.sh cluster members | jq '.members[] | {hostname, status, freeHeap: .resources.freeHeap}'
```
### Performance Issues
Common performance problems:
- **Memory Leaks**: Monitor free heap over time
- **Network Congestion**: Check discovery intervals
- **Task Overload**: Review task execution intervals
- **WiFi Interference**: Check channel and signal strength
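Spotting a memory leak means looking at the trend of free heap rather than at single readings. The sketch below fits a least-squares line through a series of `freeHeap` samples and flags a steady decline. The function names and the threshold of -50 bytes per sample are arbitrary choices for illustration, not values taken from SPORE.

```python
def heap_trend(samples):
    """Least-squares slope of free heap, in bytes per sample."""
    count = len(samples)
    if count < 2:
        raise ValueError("need at least two samples")
    mean_x = (count - 1) / 2
    mean_y = sum(samples) / count
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(count))
    return numerator / denominator


def looks_like_leak(samples, threshold=-50.0):
    """Flag a series whose free heap shrinks faster than the threshold."""
    return heap_trend(samples) < threshold


leaking = [48000, 47900, 47800, 47700]
stable = [48000, 48010, 47990, 48000]
print(heap_trend(leaking), looks_like_leak(leaking))
print(heap_trend(stable), looks_like_leak(stable))
```

The samples could come from polling `/api/node/status` at a fixed interval, which makes the slope directly comparable between runs.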
## Best Practices
### Code Organization
1. **Modular Design**: Keep components loosely coupled
2. **Clear Interfaces**: Define clear APIs between components
3. **Error Handling**: Implement proper error handling and logging
4. **Resource Management**: Efficient memory and resource usage
### Testing Strategy
1. **Unit Tests**: Test individual components
2. **Integration Tests**: Test component interactions
3. **System Tests**: Test complete system functionality
4. **Performance Tests**: Monitor resource usage and performance
### Deployment Strategy
1. **Staged Rollout**: Deploy to test cluster first
2. **Rollback Plan**: Maintain ability to rollback updates
3. **Monitoring**: Monitor system health during deployment
4. **Documentation**: Keep deployment procedures updated
## Related Documentation
- **[Architecture Guide](./Architecture.md)** - System architecture overview
- **[Task Management](./TaskManagement.md)** - Background task system
- **[API Reference](./API.md)** - REST API documentation
- **[OpenAPI Specification](../api/)** - Machine-readable API specification

85
docs/README.md Normal file

@@ -0,0 +1,85 @@
# SPORE Documentation
This folder contains comprehensive documentation for the SPORE embedded system.
## Available Documentation
### 📖 [API.md](./API.md)
Complete API reference with detailed endpoint documentation, examples, and integration guides.
**Includes:**
- API endpoint specifications
- Request/response examples
- HTTP status codes
- Integration examples (Python, JavaScript)
- Task management workflows
- Cluster monitoring examples
### 📖 [TaskManager.md](./TaskManager.md)
Comprehensive guide to the TaskManager system for background task management.
**Includes:**
- Basic usage examples
- Advanced binding techniques
- Task status monitoring
- API integration details
- Performance considerations
### 📖 [TaskManagement.md](./TaskManagement.md)
Complete guide to the task management system with examples and best practices.
**Includes:**
- Task registration methods (std::bind, lambdas, functions)
- Task control and lifecycle management
- Remote task management via API
- Performance considerations and best practices
- Migration guides and compatibility information
### 📖 [Architecture.md](./Architecture.md)
Comprehensive system architecture and implementation details.
**Includes:**
- Core component descriptions
- Auto discovery protocol details
- Task scheduling system
- Event system architecture
- Resource monitoring
- Performance characteristics
- Security and scalability considerations
### 📖 [Development.md](./Development.md)
Complete development and deployment guide.
**Includes:**
- PlatformIO configuration
- Build and flash instructions
- OTA update procedures
- Cluster management commands
- Development workflow
- Troubleshooting guide
- Best practices
## Quick Links
- **Main Project**: [../README.md](../README.md)
- **OpenAPI Specification**: [../api/](../api/)
- **Source Code**: [../src/](../src/)
## Contributing
When adding new documentation:
1. Create a new `.md` file in this folder
2. Use clear, descriptive filenames
3. Include practical examples and code snippets
4. Update this README.md to reference new files
5. Follow the existing documentation style
## Documentation Style Guide
- Use clear, concise language
- Include practical examples
- Use code blocks with appropriate language tags
- Include links to related documentation
- Use emojis sparingly for visual organization
- Keep README.md files focused and scoped

348
docs/TaskManagement.md Normal file

@@ -0,0 +1,348 @@
# Task Management System
The SPORE system includes a comprehensive TaskManager that provides a clean interface for managing system tasks. This makes it easy to add, configure, and control background tasks without cluttering the main application code.
## Overview
The TaskManager system provides:
- **Easy Task Registration**: Simple API for adding new tasks with configurable intervals
- **Dynamic Control**: Enable/disable tasks at runtime
- **Interval Management**: Change task execution frequency on the fly
- **Status Monitoring**: View task status and configuration
- **Automatic Lifecycle**: Tasks are automatically managed and executed
## Basic Usage
```cpp
#include "TaskManager.h"
// Create task manager
TaskManager taskManager(ctx);
// Register tasks
taskManager.registerTask("heartbeat", 2000, heartbeatFunction);
taskManager.registerTask("maintenance", 30000, maintenanceFunction);
// Initialize and start all tasks
taskManager.initialize();
```
## Task Registration Methods
### Using std::bind with Member Functions (Recommended)
```cpp
#include <functional>
#include "TaskManager.h"
class MyService {
public:
void sendHeartbeat() {
Serial.println("Service heartbeat");
}
void performMaintenance() {
Serial.println("Running maintenance");
}
};
MyService service;
TaskManager taskManager(ctx);
// Register member functions using std::bind
taskManager.registerTask("heartbeat", 2000,
std::bind(&MyService::sendHeartbeat, &service));
taskManager.registerTask("maintenance", 30000,
std::bind(&MyService::performMaintenance, &service));
// Initialize and start all tasks
taskManager.initialize();
```
### Using Lambda Functions
```cpp
// Register lambda functions directly
taskManager.registerTask("counter", 1000, []() {
static int count = 0;
Serial.printf("Count: %d\n", ++count);
});
// Lambda with capture (by value, so the task never holds a dangling reference)
const uint32_t threshold = 100;
taskManager.registerTask("monitor", 5000, [threshold]() {
if (ESP.getFreeHeap() < threshold) {
Serial.println("Low memory warning!");
}
});
```
### Complex Task Registration
```cpp
class NetworkManager {
public:
    void checkConnection() { /* ... */ }
    void sendData(String data) { /* ... */ }
};
NetworkManager network;
// Multiple operations in one task
taskManager.registerTask("network_ops", 3000,
    std::bind([](NetworkManager* net) {
        net->checkConnection();
        net->sendData("status_update");
    }, &network));
```
## Task Control API
### Basic Operations
```cpp
// Enable/disable tasks
taskManager.enableTask("heartbeat");
taskManager.disableTask("maintenance");
// Change intervals
taskManager.setTaskInterval("heartbeat", 5000); // 5 seconds
// Check status
bool isRunning = taskManager.isTaskEnabled("heartbeat");
unsigned long interval = taskManager.getTaskInterval("heartbeat");
// Print all task statuses
taskManager.printTaskStatus();
```
### Task Lifecycle Management
```cpp
// Start/stop tasks
taskManager.startTask("heartbeat");
taskManager.stopTask("discovery");
// Bulk operations
taskManager.enableAllTasks();
taskManager.disableAllTasks();
```
## Task Configuration Options
When registering tasks, you can specify:
- **Name**: Unique identifier for the task
- **Interval**: Execution frequency in milliseconds
- **Callback**: Function, bound method, or lambda to execute
- **Enabled**: Whether the task starts enabled (default: true)
- **AutoStart**: Whether to start automatically (default: true)
```cpp
// Traditional function
taskManager.registerTask("delayed_task", 5000, taskFunction, true, false);
// Member function with std::bind
taskManager.registerTask("service_task", 3000,
    std::bind(&Service::method, &instance), true, false);
// Lambda function
taskManager.registerTask("lambda_task", 2000,
    []() { Serial.println("Lambda!"); }, true, false);
```
## Adding Custom Tasks
### Method 1: Using std::bind (Recommended)
1. **Create your service class**:
```cpp
class SensorService {
public:
    void readTemperature() {
        // Read sensor logic
        Serial.println("Reading temperature");
    }
    void calibrateSensors() {
        // Calibration logic
        Serial.println("Calibrating sensors");
    }
};
```
2. **Register with TaskManager**:
```cpp
SensorService sensors;
taskManager.registerTask("temp_read", 1000,
    std::bind(&SensorService::readTemperature, &sensors));
taskManager.registerTask("calibrate", 60000,
    std::bind(&SensorService::calibrateSensors, &sensors));
```
### Method 2: Traditional Functions
1. **Define your task function**:
```cpp
void myCustomTask() {
    // Your task logic here
    Serial.println("Custom task executed");
}
```
2. **Register with TaskManager**:
```cpp
taskManager.registerTask("my_task", 10000, myCustomTask);
```
## Enhanced TaskManager Capabilities
### Task Status Monitoring
- **Real-time Status**: Check enabled/disabled state and running status
- **Performance Metrics**: Monitor execution intervals and timing
- **System Integration**: View task status alongside system resources
- **Bulk Operations**: Get status of all tasks at once
### Task Control Features
- **Runtime Control**: Enable/disable tasks without restart
- **Dynamic Intervals**: Change task execution frequency on-the-fly
- **Individual Status**: Get detailed information about specific tasks
- **Health Monitoring**: Track task health and system resources
## Remote Task Management
The TaskManager integrates with the API server to provide comprehensive remote task control and monitoring.
### Task Status Overview
Get a complete overview of all tasks and system status:
```bash
# Get comprehensive task status
curl http://192.168.1.100/api/tasks/status
```
**Response includes:**
- **Summary**: Total task count and active task count
- **Task Details**: Individual status for each task (name, interval, enabled, running, auto-start)
- **System Info**: Free heap memory and uptime
**Example Response:**
```json
{
  "summary": {
    "totalTasks": 6,
    "activeTasks": 5
  },
  "tasks": [
    {
      "name": "discovery_send",
      "interval": 1000,
      "enabled": true,
      "running": true,
      "autoStart": true
    },
    {
      "name": "heartbeat",
      "interval": 2000,
      "enabled": true,
      "running": true,
      "autoStart": true
    }
  ],
  "system": {
    "freeHeap": 48748,
    "uptime": 12345
  }
}
```
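As a rough illustration of how a response of this shape could be assembled, the sketch below counts enabled tasks for `activeTasks` and serializes the result by hand. The `TaskInfo` struct and both function names are hypothetical, the real firmware may well use a JSON library instead, and task names are assumed to contain no characters that need JSON escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-task record mirroring the fields in the example response.
struct TaskInfo {
    std::string name;
    uint32_t intervalMs;
    bool enabled;
    bool running;
    bool autoStart;
};

// "activeTasks" is taken here to be the number of enabled tasks.
std::size_t countActive(const std::vector<TaskInfo>& tasks) {
    std::size_t active = 0;
    for (const auto& task : tasks) {
        if (task.enabled) ++active;
    }
    return active;
}

// Builds a compact JSON document; names are written without escaping.
std::string buildStatusJson(const std::vector<TaskInfo>& tasks,
                            uint32_t freeHeap, uint32_t uptime) {
    std::ostringstream out;
    out << std::boolalpha;  // print bools as true/false
    out << "{\"summary\":{\"totalTasks\":" << tasks.size()
        << ",\"activeTasks\":" << countActive(tasks) << "},\"tasks\":[";
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        const TaskInfo& task = tasks[i];
        if (i > 0) out << ',';
        out << "{\"name\":\"" << task.name << "\",\"interval\":" << task.intervalMs
            << ",\"enabled\":" << task.enabled << ",\"running\":" << task.running
            << ",\"autoStart\":" << task.autoStart << '}';
    }
    out << "],\"system\":{\"freeHeap\":" << freeHeap
        << ",\"uptime\":" << uptime << "}}";
    return out.str();
}
```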
### Individual Task Control
Control individual tasks with various actions:
```bash
# Disable the heartbeat task
curl -X POST http://192.168.1.100/api/tasks/control \
  -d "task=heartbeat&action=disable"
# Get detailed status for a specific task
curl -X POST http://192.168.1.100/api/tasks/control \
  -d "task=discovery_send&action=status"
```
**Available Actions:**
- `enable` - Enable a task
- `disable` - Disable a task
- `start` - Start a task
- `stop` - Stop a task
- `status` - Get detailed status for a specific task
**Task Status Response:**
```json
{
  "success": true,
  "message": "Task status retrieved",
  "task": "discovery_send",
  "action": "status",
  "taskDetails": {
    "name": "discovery_send",
    "enabled": true,
    "running": true,
    "interval": 1000,
    "system": {
      "freeHeap": 48748,
      "uptime": 12345
    }
  }
}
```
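The control requests above send `task` and `action` as form fields. One way a handler might decode them is sketched below; `parseFormBody`, `parseAction`, and the `TaskAction` enum are hypothetical helpers rather than SPORE code, and the parser deliberately skips percent-decoding, which a real handler would need for values with special characters.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// The five documented actions, plus a fallback for anything else.
enum class TaskAction { Enable, Disable, Start, Stop, Status, Unknown };

// Splits "key=value&key=value" into a map. No percent-decoding is performed.
std::map<std::string, std::string> parseFormBody(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::istringstream stream(body);
    std::string pair;
    while (std::getline(stream, pair, '&')) {
        const auto eq = pair.find('=');
        if (eq == std::string::npos) continue;  // skip fields without a value
        fields[pair.substr(0, eq)] = pair.substr(eq + 1);
    }
    return fields;
}

// Maps the action string from the request to an enum value.
TaskAction parseAction(const std::string& action) {
    if (action == "enable") return TaskAction::Enable;
    if (action == "disable") return TaskAction::Disable;
    if (action == "start") return TaskAction::Start;
    if (action == "stop") return TaskAction::Stop;
    if (action == "status") return TaskAction::Status;
    return TaskAction::Unknown;
}
```

Returning `Unknown` rather than guessing lets the caller answer unsupported actions with an error response instead of silently ignoring them.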
## Performance Considerations
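- `std::bind` returns a callable object; invoking it may cost slightly more than calling through a plain function pointer
- For tasks that run at very short intervals, measure the callback cost rather than assuming it is free
- At the intervals used in this document (hundreds of milliseconds to minutes), the overhead is typically negligible
- The TaskManager keeps each registered callback in its task registry, so the binding itself happens once at registration, not on every run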
## Best Practices
1. **Use std::bind for member functions**: Cleaner than wrapper functions
2. **Group related tasks**: Register multiple related operations in a single task
3. **Monitor task health**: Use the status API to monitor task performance
4. **Plan intervals carefully**: Balance responsiveness with system resources
5. **Use descriptive names**: Make task names clear and meaningful
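
When planning intervals, a back-of-the-envelope load estimate can help. The sketch below assumes callbacks run one after another on a single core, so the fraction of time spent in tasks is roughly the sum of each task's duration divided by its interval. `TaskBudget` and `estimateLoad` are hypothetical names, and the durations are values you would have to measure yourself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical planning record: how often a task runs and how long one run takes.
struct TaskBudget {
    double intervalMs;  // configured execution interval
    double durationMs;  // measured or estimated time per run
};

// Approximate fraction of CPU time spent in task callbacks:
// the sum over all tasks of duration / interval.
double estimateLoad(const std::vector<TaskBudget>& tasks) {
    double load = 0.0;
    for (const auto& task : tasks) {
        assert(task.intervalMs > 0.0);
        load += task.durationMs / task.intervalMs;
    }
    return load;
}
```

For example, a 10 ms task every 1000 ms plus a 100 ms task every 2000 ms gives 0.01 + 0.05 = 0.06, or about 6% of the available time.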
## Migration from Wrapper Functions
### Before (with wrapper functions):
```cpp
void discoverySendTask() { cluster.sendDiscovery(); }
void discoveryListenTask() { cluster.listenForDiscovery(); }
taskManager.registerTask("discovery_send", interval, discoverySendTask);
taskManager.registerTask("discovery_listen", interval, discoveryListenTask);
```
### After (with std::bind):
```cpp
taskManager.registerTask("discovery_send", interval,
    std::bind(&ClusterManager::sendDiscovery, &cluster));
taskManager.registerTask("discovery_listen", interval,
    std::bind(&ClusterManager::listenForDiscovery, &cluster));
```
## Compatibility
- The new `std::bind` support is fully backward compatible
- Existing code using function pointers will continue to work
- You can mix both approaches in the same project
- All existing TaskManager methods remain unchanged
- New status monitoring methods are additive and don't break existing functionality
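
The reason these approaches can coexist is easiest to see in a small sketch. Assuming the callback type is something like `std::function<void()>` (an assumption, since the actual declaration is not shown here), a plain function, a `std::bind` result, and a lambda all convert to it. `Registry`, `Cluster`, and `legacyTask` below are hypothetical stand-ins.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using TaskCallback = std::function<void()>;  // assumed callback type

static int g_legacyCalls = 0;
void legacyTask() { ++g_legacyCalls; }  // plain function, as in older code

// Hypothetical service class with a member function to bind.
class Cluster {
public:
    void sendDiscovery() { ++sent; }
    int sent = 0;
};

// Hypothetical registry that stores every callable behind the same type.
struct Registry {
    std::vector<std::pair<std::string, TaskCallback>> tasks;

    void add(const std::string& name, TaskCallback callback) {
        tasks.emplace_back(name, std::move(callback));
    }

    // Invokes each stored callback once, in registration order.
    void runAll() {
        for (auto& task : tasks) task.second();
    }
};
```

A usage sketch: `registry.add("legacy", legacyTask);`, `registry.add("bound", std::bind(&Cluster::sendDiscovery, &cluster));`, and `registry.add("lambda", []() { /* ... */ });` all compile against the same `add` signature.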
## Related Documentation
- **[TaskManager API Reference](./TaskManager.md)** - Detailed API documentation
- **[API Reference](./API.md)** - REST API for remote task management
- **[OpenAPI Specification](../api/)** - Machine-readable API specification


@@ -1,180 +0,0 @@
# TaskManager
## Basic Usage
### Including Required Headers
```cpp
#include <functional> // For std::bind
#include "TaskManager.h"
```
### Registering Member Functions
```cpp
class MyClass {
public:
    void myMethod() {
        Serial.println("My method called");
    }
    void methodWithParams(int value, String text) {
        Serial.printf("Method called with %d and %s\n", value, text.c_str());
    }
};
// Create an instance
MyClass myObject;
// Register member function
taskManager.registerTask("my_task", 1000,
    std::bind(&MyClass::myMethod, &myObject));
// Register method with parameters
taskManager.registerTask("param_task", 2000,
    std::bind(&MyClass::methodWithParams, &myObject, 42, "hello"));
```
### Registering Lambda Functions
```cpp
// Simple lambda
taskManager.registerTask("lambda_task", 3000, []() {
    Serial.println("Lambda executed");
});
// Lambda with capture by reference: `counter` must outlive the task,
// otherwise the lambda is left holding a dangling reference
int counter = 0;
taskManager.registerTask("counter_task", 4000, [&counter]() {
    counter++;
    Serial.printf("Counter: %d\n", counter);
});
// Lambda that calls multiple methods (`myObject` must also outlive the task)
taskManager.registerTask("multi_task", 5000, [&myObject]() {
    myObject.myMethod();
    // Do other work...
});
```
### Registering Global Functions
```cpp
void globalFunction() {
    Serial.println("Global function called");
}
// Still supported for backward compatibility
taskManager.registerTask("global_task", 6000, globalFunction);
```
## Advanced Examples
### Binding to Different Object Types
```cpp
class NetworkManager {
public:
    void sendHeartbeat() { /* ... */ }
    void checkConnection() { /* ... */ }
};
class SensorManager {
public:
    void readSensors() { /* ... */ }
    void calibrate() { /* ... */ }
};
NetworkManager network;
SensorManager sensors;
// Bind to different objects
taskManager.registerTask("heartbeat", 1000,
    std::bind(&NetworkManager::sendHeartbeat, &network));
taskManager.registerTask("sensor_read", 500,
    std::bind(&SensorManager::readSensors, &sensors));
```
### Using std::placeholders for Complex Binding
```cpp
#include <functional>
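class ConfigManager {
public:
    void updateConfig(int interval, bool enabled) {
        Serial.printf("Updating config: interval=%d, enabled=%d\n", interval, enabled);
    }
};
ConfigManager config;
// Placeholders leave arguments open, so `update` must be called with (int, bool)
using namespace std::placeholders;
auto update = std::bind(&ConfigManager::updateConfig, &config, _1, _2);
// The callbacks registered in this document take no arguments, so a callable
// with open placeholders cannot be registered directly; supply the values in a wrapper
taskManager.registerTask("config_update", 10000,
    [update]() mutable { update(10000, true); });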
```
### Conditional Task Execution
```cpp
class TaskController {
public:
    bool shouldExecute() {
        return millis() % 10000 < 5000; // Execute only in first 5 seconds of each 10-second cycle
    }
    void conditionalTask() {
        if (shouldExecute()) {
            Serial.println("Conditional task executed");
        }
    }
};
TaskController controller;
taskManager.registerTask("conditional", 1000,
    std::bind(&TaskController::conditionalTask, &controller));
```
## Benefits of Using std::bind
1. **Cleaner Code**: No need for wrapper functions
2. **Direct Binding**: Bind member functions directly to objects
3. **Parameter Passing**: Easily pass parameters to bound methods
4. **Lambda Support**: Use lambdas for complex logic
5. **Type Safety**: Better type checking than function pointers
6. **Flexibility**: Mix and match different callable types
## Migration from Wrapper Functions
### Before (with wrapper functions):
```cpp
void discoverySendTask() { cluster.sendDiscovery(); }
void discoveryListenTask() { cluster.listenForDiscovery(); }
taskManager.registerTask("discovery_send", interval, discoverySendTask);
taskManager.registerTask("discovery_listen", interval, discoveryListenTask);
```
### After (with std::bind):
```cpp
taskManager.registerTask("discovery_send", interval,
    std::bind(&ClusterManager::sendDiscovery, &cluster));
taskManager.registerTask("discovery_listen", interval,
    std::bind(&ClusterManager::listenForDiscovery, &cluster));
```
## Performance Considerations
- `std::bind` creates a callable object that may have a small overhead compared to direct function pointers
- For high-frequency tasks, consider the performance impact
- The overhead is typically negligible for most embedded applications
- The TaskManager stores bound functions efficiently in a registry
## Compatibility
- The new `std::bind` support is fully backward compatible
- Existing code using function pointers will continue to work
- You can mix both approaches in the same project
- All existing TaskManager methods remain unchanged