feat: mock gateway
198
docs/monitoring-example.md
Normal file
@@ -0,0 +1,198 @@
# Monitoring Resources Endpoint

## Overview

The `/api/monitoring/resources` endpoint returns a point-in-time snapshot of CPU, memory, network, and flash usage for every node in the cluster, together with cluster-wide summary statistics.

## Endpoint

```
GET /api/monitoring/resources
```

## Response Format

```json
{
  "timestamp": "2025-10-24T10:30:45Z",
  "nodes": [
    {
      "timestamp": 1729763445,
      "node_ip": "192.168.1.100",
      "hostname": "spore-node-1",
      "cpu": {
        "frequency_mhz": 160,
        "usage_percent": 42.5,
        "temperature_c": 58.3
      },
      "memory": {
        "total_bytes": 98304,
        "free_bytes": 45632,
        "used_bytes": 52672,
        "usage_percent": 53.6
      },
      "network": {
        "bytes_sent": 3245678,
        "bytes_received": 5678901,
        "packets_sent": 32456,
        "packets_received": 56789,
        "rssi_dbm": -65,
        "signal_quality_percent": 75.5
      },
      "flash": {
        "total_bytes": 4194304,
        "used_bytes": 2097152,
        "free_bytes": 2097152,
        "usage_percent": 50.0
      },
      "labels": {
        "version": "1.0.0",
        "stable": "true",
        "env": "production",
        "zone": "zone-1",
        "type": "spore-node"
      }
    }
  ],
  "summary": {
    "total_nodes": 5,
    "avg_cpu_usage_percent": 38.7,
    "avg_memory_usage_percent": 51.2,
    "avg_flash_usage_percent": 52.8,
    "total_bytes_sent": 16228390,
    "total_bytes_received": 28394505
  }
}
```

## Data Fields

### CPU Metrics
- **frequency_mhz**: Current CPU frequency in MHz (80-240 MHz typical for ESP32)
- **usage_percent**: CPU utilization percentage (0-100%)
- **temperature_c**: CPU temperature in degrees Celsius (45-65°C typical)

### Memory Metrics
- **total_bytes**: Total RAM in bytes (64-128 KB typical)
- **free_bytes**: RAM currently free, in bytes
- **used_bytes**: RAM currently in use, in bytes
- **usage_percent**: Memory utilization percentage (0-100%)
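
In the sample response, the memory fields are related by simple arithmetic: `used_bytes + free_bytes` equals `total_bytes`, and `usage_percent` matches `used_bytes / total_bytes`. The following is a minimal sketch of a consistency check built on that observation. The function name `check_memory` and the 0.1 tolerance are illustrative assumptions, not part of the gateway.

```python
def check_memory(memory: dict, tolerance: float = 0.1) -> bool:
    """Return True if total/used/free and usage_percent agree with each other."""
    total = memory["total_bytes"]
    used = memory["used_bytes"]
    free = memory["free_bytes"]
    # A non-positive total cannot be checked; used + free should cover all RAM.
    if total <= 0 or used + free != total:
        return False
    # usage_percent is assumed to be used / total, expressed as a percentage.
    expected = used / total * 100
    return abs(expected - memory["usage_percent"]) <= tolerance


# Values copied from the sample response above.
sample = {
    "total_bytes": 98304,
    "free_bytes": 45632,
    "used_bytes": 52672,
    "usage_percent": 53.6,
}
print(check_memory(sample))  # True
```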

### Network Metrics
- **bytes_sent**: Total bytes transmitted since boot
- **bytes_received**: Total bytes received since boot
- **packets_sent**: Total packets transmitted
- **packets_received**: Total packets received
- **rssi_dbm**: WiFi signal strength in dBm (-30 to -90 typical)
- **signal_quality_percent**: WiFi signal quality (0-100%)
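
Because `bytes_sent` and `bytes_received` are cumulative since boot, a transfer rate has to be derived from two snapshots of the same node. The sketch below shows one way to do that. It assumes the per-node `timestamp` is in Unix seconds (as the sample value suggests); the helper name `send_rate` and the second snapshot's values are hypothetical.

```python
from typing import Optional


def send_rate(earlier: dict, later: dict) -> Optional[float]:
    """Average bytes per second sent between two snapshots of the same node."""
    elapsed = later["timestamp"] - earlier["timestamp"]
    delta = later["network"]["bytes_sent"] - earlier["network"]["bytes_sent"]
    # A non-positive interval, or a counter that went backwards (for example
    # after a reboot), gives no meaningful rate.
    if elapsed <= 0 or delta < 0:
        return None
    return delta / elapsed


# First snapshot uses the sample values; the second one is made up.
first = {"timestamp": 1729763445, "network": {"bytes_sent": 3245678}}
second = {"timestamp": 1729763455, "network": {"bytes_sent": 3255678}}
print(send_rate(first, second))  # 1000.0
```

Note that the mock gateway varies its metrics on each request, so its counters may not increase monotonically the way a real node's would.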

### Flash Metrics
- **total_bytes**: Total flash storage in bytes (typically 4 MB)
- **used_bytes**: Used flash storage, in bytes
- **free_bytes**: Free flash storage, in bytes
- **usage_percent**: Flash utilization percentage (0-100%)

### Node Labels
Each node carries labels that describe the firmware it runs and where it is deployed. All label values are strings:
- **version**: Current firmware version (e.g., "1.0.0", "1.1.0", "1.2.0")
- **stable**: Whether this is a stable release, given as the string "true" or "false"
- **env**: Environment (e.g., "production", "beta")
- **zone**: Deployment zone (e.g., "zone-1", "zone-2", "zone-3")
- **type**: Node type (e.g., "spore-node")

### Summary Statistics
Aggregate metrics across all nodes:
- **total_nodes**: Total number of nodes monitored
- **avg_cpu_usage_percent**: Average CPU usage across all nodes
- **avg_memory_usage_percent**: Average memory usage across all nodes
- **avg_flash_usage_percent**: Average flash usage across all nodes
- **total_bytes_sent**: Sum of `bytes_sent` across all nodes
- **total_bytes_received**: Sum of `bytes_received` across all nodes
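
The summary fields can be recomputed on the client from the `nodes` array, which is useful for cross-checking a response. This is only a sketch of the aggregation described above: the function name `summarize`, the rounding to one decimal place, and the two-node input are assumptions for illustration, and the gateway's own computation may differ.

```python
def summarize(nodes: list) -> dict:
    """Recompute the cluster-wide summary from a list of node objects."""
    count = len(nodes)

    def avg(section: str) -> float:
        # Mean of usage_percent for one section; 0.0 for an empty cluster.
        if count == 0:
            return 0.0
        return round(sum(n[section]["usage_percent"] for n in nodes) / count, 1)

    return {
        "total_nodes": count,
        "avg_cpu_usage_percent": avg("cpu"),
        "avg_memory_usage_percent": avg("memory"),
        "avg_flash_usage_percent": avg("flash"),
        "total_bytes_sent": sum(n["network"]["bytes_sent"] for n in nodes),
        "total_bytes_received": sum(n["network"]["bytes_received"] for n in nodes),
    }


# Two hypothetical nodes, reduced to the fields the summary needs.
example_nodes = [
    {"cpu": {"usage_percent": 40.0}, "memory": {"usage_percent": 50.0},
     "flash": {"usage_percent": 50.0},
     "network": {"bytes_sent": 100, "bytes_received": 10}},
    {"cpu": {"usage_percent": 45.0}, "memory": {"usage_percent": 60.0},
     "flash": {"usage_percent": 50.0},
     "network": {"bytes_sent": 200, "bytes_received": 20}},
]
print(summarize(example_nodes))
```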

## Firmware Version Matching

Node labels are automatically synchronized with the firmware available in the registry:

| Version | Registry Status | Node Distribution | Environment |
|---------|-----------------|-------------------|-------------|
| 1.0.0   | Stable          | 40% of nodes      | production  |
| 1.1.0   | Stable          | 40% of nodes      | production  |
| 1.2.0   | Beta            | 20% of nodes      | beta        |

This ensures that monitoring data accurately reflects which firmware versions are deployed across the cluster.
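
The distribution in the table can be checked against a live response by counting the `version` label of each node. A minimal sketch follows; the function name `version_distribution` is an assumption, and the five-node input is constructed to mirror the 40/40/20 split above rather than taken from a real response.

```python
from collections import Counter


def version_distribution(nodes: list) -> dict:
    """Map each firmware version label to its share of nodes, in percent."""
    counts = Counter(node["labels"]["version"] for node in nodes)
    total = sum(counts.values())
    return {
        version: round(count / total * 100, 1)
        for version, count in sorted(counts.items())
    }


# Five hypothetical nodes, reduced to the label the function reads.
example_nodes = [
    {"labels": {"version": v}}
    for v in ["1.0.0", "1.0.0", "1.1.0", "1.1.0", "1.2.0"]
]
print(version_distribution(example_nodes))
# {'1.0.0': 40.0, '1.1.0': 40.0, '1.2.0': 20.0}
```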

## Use Cases

### 1. Real-time Dashboard
Display live resource usage for all nodes in a monitoring dashboard.

### 2. Alerting
Set up alerts based on thresholds:
- CPU usage > 80%
- Memory usage > 90%
- Flash usage > 95%
- WiFi signal quality < 30%
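
The thresholds above could be evaluated per node along these lines. This is a sketch rather than a prescribed implementation: the `RULES` table, the `check_alerts` name, and the message format are assumptions, and the node values are hypothetical ones chosen to trigger two of the rules.

```python
# Each rule: (section, field, predicate that is True when breached, message).
RULES = [
    ("cpu", "usage_percent", lambda v: v > 80, "CPU usage above 80%"),
    ("memory", "usage_percent", lambda v: v > 90, "memory usage above 90%"),
    ("flash", "usage_percent", lambda v: v > 95, "flash usage above 95%"),
    ("network", "signal_quality_percent", lambda v: v < 30,
     "WiFi signal quality below 30%"),
]


def check_alerts(node: dict) -> list:
    """Return one message per threshold the node currently breaches."""
    alerts = []
    for section, field, breached, message in RULES:
        value = node[section][field]
        if breached(value):
            alerts.append(f"{node['hostname']}: {message} (current: {value})")
    return alerts


# Hypothetical node with high CPU usage and a weak WiFi signal.
example_node = {
    "hostname": "spore-node-1",
    "cpu": {"usage_percent": 85.0},
    "memory": {"usage_percent": 53.6},
    "flash": {"usage_percent": 50.0},
    "network": {"signal_quality_percent": 25.0},
}
for alert in check_alerts(example_node):
    print(alert)
```

Keeping the rules in a table means a threshold can be changed or added without touching the checking logic.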

### 3. Capacity Planning
Track resource trends to plan firmware optimizations or hardware upgrades.

### 4. Firmware Rollout Monitoring
Monitor resource usage before, during, and after firmware rollouts to detect issues.

### 5. Network Health
Track WiFi signal quality and network traffic to identify connectivity issues.

## Example Usage

### cURL
```bash
curl http://localhost:3001/api/monitoring/resources
```

### JavaScript (fetch)
```javascript
// Fetch a snapshot from the gateway (top-level await requires an ES module).
const response = await fetch('http://localhost:3001/api/monitoring/resources');
if (!response.ok) {
  throw new Error(`Request failed with HTTP ${response.status}`);
}
const data = await response.json();

// Cluster-wide summary
console.log(`Monitoring ${data.summary.total_nodes} nodes`);
console.log(`Average CPU: ${data.summary.avg_cpu_usage_percent.toFixed(1)}%`);
console.log(`Average Memory: ${data.summary.avg_memory_usage_percent.toFixed(1)}%`);

// Per-node CPU usage
data.nodes.forEach(node => {
  console.log(`${node.hostname} (${node.labels.version}): CPU ${node.cpu.usage_percent.toFixed(1)}%`);
});
```

### Python
```python
import requests

# Fetch a snapshot from the gateway; fail fast on HTTP errors or a hung connection.
response = requests.get('http://localhost:3001/api/monitoring/resources', timeout=5)
response.raise_for_status()
data = response.json()

# Cluster-wide summary
print(f"Monitoring {data['summary']['total_nodes']} nodes")
print(f"Average CPU: {data['summary']['avg_cpu_usage_percent']:.1f}%")
print(f"Average Memory: {data['summary']['avg_memory_usage_percent']:.1f}%")

# Per-node CPU usage
for node in data['nodes']:
    print(f"{node['hostname']} ({node['labels']['version']}): "
          f"CPU {node['cpu']['usage_percent']:.1f}%")
```

## Mock Gateway Behavior

The mock gateway generates realistic monitoring data with:
- **Dynamic values**: CPU, memory, and network metrics vary on each request
- **Realistic ranges**: Values stay within typical ESP32 hardware limits
- **Signal quality**: `signal_quality_percent` is derived from the WiFi RSSI (`rssi_dbm`)
- **Consistent labels**: Node labels always match firmware registry versions
- **Aggregate summaries**: Cluster-wide statistics are calculated automatically
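
The exact RSSI-to-quality conversion used by the mock gateway is not documented here. As an illustration only, the sketch below uses a common linear heuristic that maps -100 dBm to 0% and -50 dBm to 100%, clamped to that range. The function name `rssi_to_quality` is an assumption, and this mapping does not reproduce the sample response (it gives 70.0 for -65 dBm, where the sample shows 75.5), so the gateway evidently uses a different formula or adds variation.

```python
def rssi_to_quality(rssi_dbm: float) -> float:
    """Map an RSSI in dBm to a 0-100% quality figure (linear heuristic)."""
    # -100 dBm or weaker maps to 0%, -50 dBm or stronger maps to 100%.
    quality = 2.0 * (rssi_dbm + 100.0)
    return max(0.0, min(100.0, quality))


print(rssi_to_quality(-65))   # 70.0
print(rssi_to_quality(-30))   # 100.0
print(rssi_to_quality(-110))  # 0.0
```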

## Integration with WebSocket

For real-time updates, consider combining this endpoint with the WebSocket connection at `/ws`, which broadcasts:
- Node status changes
- Firmware update progress
- Cluster membership changes

The monitoring endpoint provides detailed point-in-time snapshots, while the WebSocket connection provides a real-time event stream.