2025-10-19 14:05:00 +02:00

6.5 KiB

UDP Auto Discovery Implementation

Overview

The backend now uses UDP auto discovery instead of hardcoded IP addresses. It discovers SPORE nodes on the network from their heartbeat broadcasts and configures the SporeApiClient dynamically.

What Was Implemented

1. UDP Discovery Server

Port: 4210 (configurable via UDP_PORT constant)
Message: CLUSTER_HEARTBEAT (configurable via HEARTBEAT_MESSAGE constant)
Protocol: Listens for UDP broadcast datagrams
Auto-binding: Binds to the configured port on startup
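
A listener of this kind can be sketched with Node's built-in dgram module. This is a minimal illustration, not the code in index.js: the helper names isHeartbeat and startDiscoveryListener are hypothetical, and only the port and message prefix come from the text above.

```javascript
const dgram = require('dgram');

// Values taken from this document.
const UDP_PORT = 4210;
const HEARTBEAT_MESSAGE = 'CLUSTER_HEARTBEAT';

// Hypothetical helper: true when a datagram starts with the heartbeat prefix.
function isHeartbeat(buffer) {
  return buffer.toString('utf8').trim().startsWith(HEARTBEAT_MESSAGE);
}

// Hypothetical helper: binds a UDP socket and reports the sender of each heartbeat.
function startDiscoveryListener(onHeartbeat, port = UDP_PORT) {
  const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });

  socket.on('message', (msg, rinfo) => {
    // rinfo.address is the source IP of the datagram.
    if (isHeartbeat(msg)) onHeartbeat(rinfo.address);
  });

  socket.on('error', (err) => {
    console.error(`UDP discovery error: ${err.message}`);
    socket.close();
  });

  socket.bind(port); // no address given, so it listens on all interfaces
  return socket;
}
```

Calling startDiscoveryListener((ip) => console.log('heartbeat from', ip)) would start listening; socket.close() stops it.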

2. Dynamic Node Management

Automatic Discovery: Nodes are discovered when they send CLUSTER_HEARTBEAT messages
Primary Node Selection: The most recently seen node becomes the primary connection
Stale Node Cleanup: Nodes not seen for 5+ minutes are automatically removed
Health Monitoring: Node availability is checked periodically (every 5 seconds by default)

3. SporeApiClient Integration

Dynamic IP Configuration: Client is automatically configured with discovered node IPs
No Hardcoded IPs: All IP addresses are now discovered dynamically
Automatic Failover: The system switches to another available node when the primary becomes stale
Error Handling: Graceful handling when no nodes are available

4. New API Endpoints

Discovery Management

GET /api/discovery/nodes - View all discovered nodes and status
POST /api/discovery/refresh - Manually trigger discovery refresh
POST /api/discovery/primary/:ip - Manually set primary node

Health Monitoring

GET /api/health - Comprehensive health check including discovery status

5. Enhanced Error Handling

Service Unavailable: Returns 503 when no nodes are discovered
Graceful Degradation: System continues to function even when nodes are unavailable
Detailed Error Messages: Clear feedback about discovery status
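
The 503 behaviour could be implemented as a small guard in front of the routes that need a node. The sketch below assumes an Express-style (req, res, next) handler signature, which this document does not confirm; requireDiscoveredNode, sporeNodeIp, and the error payload are hypothetical names chosen for illustration.

```javascript
// Hypothetical middleware factory: getPrimaryIp is any function returning
// the current primary node IP, or null when nothing has been discovered.
function requireDiscoveredNode(getPrimaryIp) {
  return function discoveryGuard(req, res, next) {
    const primaryIp = getPrimaryIp();
    if (!primaryIp) {
      // No node available: answer 503 instead of failing later in the client.
      res.status(503).json({
        error: 'Service unavailable',
        message: 'No SPORE nodes discovered yet',
      });
      return;
    }
    req.sporeNodeIp = primaryIp; // assumed property name, for downstream handlers
    next();
  };
}
```

Keeping the check in one place means every proxied route reports the same error shape while discovery is still empty.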

How It Works

1. Startup Sequence

1. Backend starts and binds UDP server to port 4210
2. HTTP server starts on port 3001
3. System waits for CLUSTER_HEARTBEAT messages
4. When messages arrive, nodes are automatically discovered
5. SporeApiClient is configured with the first discovered node, which becomes the initial primary

2. Discovery Process

1. Node sends "CLUSTER_HEARTBEAT:hostname" to 255.255.255.255:4210
2. Backend receives message and extracts source IP
3. Node is added to discovered nodes list
4. If no primary node exists, this becomes the primary
5. SporeApiClient is automatically configured with the new IP
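
The steps above can be sketched as a pure function over an in-memory registry. The registry shape and the names createRegistry, parseHeartbeat, and handleDatagram are assumptions made for this example; only the "CLUSTER_HEARTBEAT:hostname" format and the "first node becomes primary" rule come from the text.

```javascript
const HEARTBEAT_MESSAGE = 'CLUSTER_HEARTBEAT';

// Hypothetical state container: discovered nodes keyed by IP, plus the primary.
function createRegistry() {
  return { nodes: new Map(), primaryIp: null };
}

// Parses "CLUSTER_HEARTBEAT:hostname"; returns null for anything else.
function parseHeartbeat(text) {
  const [prefix, ...rest] = text.trim().split(':');
  if (prefix !== HEARTBEAT_MESSAGE) return null;
  return { hostname: rest.join(':') || null };
}

// Steps 2-4: take the source IP, record the node, pick a primary if none exists.
function handleDatagram(registry, text, sourceIp, now = Date.now()) {
  const parsed = parseHeartbeat(text);
  if (!parsed) return false; // not a heartbeat, ignore it

  const existing = registry.nodes.get(sourceIp);
  registry.nodes.set(sourceIp, {
    ip: sourceIp,
    hostname: parsed.hostname,
    discoveredAt: existing ? existing.discoveredAt : now,
    lastSeen: now,
  });

  if (registry.primaryIp === null) {
    registry.primaryIp = sourceIp; // step 5 would reconfigure the client here
  }
  return true;
}
```

Passing the timestamp in as a parameter keeps the function deterministic, which makes the discovery logic easy to unit test without a network.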

3. Node Management

1. All discovered nodes are tracked with timestamps
2. Primary node is the most recently seen node
3. Stale nodes (5+ minutes old) are automatically removed
4. System automatically switches primary node if current becomes stale
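
As an illustration of the cleanup and failover rules, the sketch below removes stale entries and then re-selects a primary. cleanupStaleNodes is named in this document, but the signature, the Map-based storage, and the "most recently seen" fallback shown here are assumptions, not the actual implementation.

```javascript
const STALE_TIMEOUT_MS = 5 * 60 * 1000; // 5 minutes, as stated above

// Removes stale nodes in place and returns the IP that should be primary.
// `nodes` is assumed to be a Map of ip -> { ip, lastSeen } with lastSeen in ms.
function cleanupStaleNodes(nodes, primaryIp, now = Date.now()) {
  for (const [ip, node] of nodes) {
    if (now - node.lastSeen > STALE_TIMEOUT_MS) {
      nodes.delete(ip); // deleting during Map iteration is safe in JavaScript
    }
  }

  // Keep the current primary while it is still alive.
  if (primaryIp !== null && nodes.has(primaryIp)) return primaryIp;

  // Otherwise fail over to the most recently seen remaining node, if any.
  let best = null;
  for (const node of nodes.values()) {
    if (best === null || node.lastSeen > best.lastSeen) best = node;
  }
  return best ? best.ip : null;
}
```

A caller would typically run this from a timer and reconfigure the client only when the returned IP differs from the previous primary.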

Configuration

Environment Variables

PORT: HTTP server port (default: 3001)
UDP_PORT: UDP heartbeat port (default: 4210)

Constants (in index.js)

UDP_PORT: Heartbeat port (currently 4210)
HEARTBEAT_MESSAGE: Expected message (currently "CLUSTER_HEARTBEAT")
Stale timeout: 5 minutes (configurable in cleanupStaleNodes())
Health check interval: 5 seconds (configurable in setInterval)
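
One way to gather these settings in a single place is shown below. loadConfig and its field names are hypothetical; the defaults are the values listed above, and the port validation is an addition made for this sketch.

```javascript
// Hypothetical helper: reads PORT and UDP_PORT from the environment and
// falls back to the documented defaults when a value is missing or invalid.
function loadConfig(env = process.env) {
  const toPort = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isInteger(n) && n > 0 && n < 65536 ? n : fallback;
  };

  return {
    httpPort: toPort(env.PORT, 3001),
    udpPort: toPort(env.UDP_PORT, 4210),
    heartbeatMessage: 'CLUSTER_HEARTBEAT',
    staleTimeoutMs: 5 * 60 * 1000, // 5 minutes
    healthCheckIntervalMs: 5 * 1000, // 5 seconds
  };
}
```

For example, starting the backend with PORT=8080 set would yield httpPort 8080 while udpPort stays at 4210.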

Usage Examples

Starting the Backend

npm start

Testing Discovery

# Send discovery message to broadcast
npm run test-discovery broadcast

# Send to specific IP
npm run test-discovery 192.168.1.100

# Send multiple messages
npm run test-discovery broadcast 5
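
For reference, a heartbeat sender along these lines can be written with the dgram module. This is a sketch of what such a test script might do, not the contents of the test-discovery script; resolveTarget and sendHeartbeats are hypothetical names.

```javascript
const dgram = require('dgram');
const os = require('os');

// Maps the "broadcast" keyword (or no argument) to the limited broadcast address.
function resolveTarget(arg) {
  return !arg || arg === 'broadcast' ? '255.255.255.255' : arg;
}

// Sends `count` heartbeat datagrams to target:port and resolves with the number sent.
function sendHeartbeats(target, count = 1, port = 4210) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4');
    const payload = Buffer.from(`CLUSTER_HEARTBEAT:${os.hostname()}`);
    const total = Math.max(1, count);
    let sent = 0;

    socket.on('error', (err) => {
      socket.close();
      reject(err);
    });

    // Bind to an ephemeral port first; setBroadcast requires a bound socket.
    socket.bind(() => {
      socket.setBroadcast(true);
      const sendNext = () => {
        socket.send(payload, port, target, (err) => {
          if (err) {
            socket.close();
            reject(err);
            return;
          }
          sent += 1;
          if (sent < total) {
            sendNext();
          } else {
            socket.close();
            resolve(sent);
          }
        });
      };
      sendNext();
    });
  });
}
```

A call such as sendHeartbeats(resolveTarget('broadcast'), 5) would mirror the last command above.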

Monitoring Discovery

# Watch discovery in real-time
npm run demo-discovery

Using the Client

# Use discovery system
npm run client-example

# Direct connection (for testing)
npm run client-example 192.168.1.100

API Response Examples

Discovery Status

{
  "primaryNode": "192.168.1.100",
  "totalNodes": 2,
  "nodes": [
    {
      "ip": "192.168.1.100",
      "port": 4210,
      "discoveredAt": "2024-01-01T12:00:00.000Z",
      "lastSeen": "2024-01-01T12:05:00.000Z",
      "isPrimary": true
    }
  ],
  "clientInitialized": true,
  "clientBaseUrl": "http://192.168.1.100"
}
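
A response of this shape could be derived from the node list as sketched below. buildDiscoveryStatus is a hypothetical helper, and the way clientInitialized and clientBaseUrl are computed here is an assumption based only on the sample payload.

```javascript
// Hypothetical helper: `nodes` is a Map of ip -> { ip, discoveredAt, lastSeen }
// with timestamps in milliseconds; primaryIp is a string or null.
function buildDiscoveryStatus(nodes, primaryIp, udpPort = 4210) {
  const list = Array.from(nodes.values()).map((node) => ({
    ip: node.ip,
    port: udpPort,
    discoveredAt: new Date(node.discoveredAt).toISOString(),
    lastSeen: new Date(node.lastSeen).toISOString(),
    isPrimary: node.ip === primaryIp,
  }));

  return {
    primaryNode: primaryIp,
    totalNodes: list.length,
    nodes: list,
    // Assumption: the client counts as initialized once a primary exists.
    clientInitialized: primaryIp !== null,
    clientBaseUrl: primaryIp ? `http://${primaryIp}` : null,
  };
}
```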

Health Check

{
  "status": "healthy",
  "timestamp": "2024-01-01T12:05:00.000Z",
  "services": {
    "http": true,
    "udp": true,
    "sporeClient": true
  },
  "discovery": {
    "totalNodes": 2,
    "primaryNode": "192.168.1.100",
    "udpPort": 4210,
    "serverRunning": true
  }
}

Benefits

1. Zero Configuration

No need to manually configure IP addresses
Automatic discovery of all nodes on the network
Self-healing when nodes come and go

2. High Availability

Automatic failover to available nodes
No dependency on a single SPORE node
Continuous health monitoring

3. Scalability

No configured limit on the number of nodes
Requests go to the primary node (load balancing is listed under Future Enhancements)
Easy to add/remove nodes

4. Maintenance

No manual IP updates required
Automatic cleanup of stale nodes
Comprehensive monitoring and logging

Troubleshooting

Common Issues

No Nodes Discovered

Check if backend is running: curl http://localhost:3001/api/health
Verify UDP port 4210 is open: check firewall settings
Send test discovery message: npm run test-discovery broadcast

UDP Port Already in Use

Check for other instances: netstat -tulpn | grep 4210
Stop the conflicting process or change the port (UDP_PORT)
Restart backend server

Client Not Initialized

Check discovery status: curl http://localhost:3001/api/discovery/nodes
Verify nodes are sending discovery messages
Check network connectivity

Debug Commands

# Check discovery status
curl http://localhost:3001/api/discovery/nodes

# Check health
curl http://localhost:3001/api/health

# Manual refresh
curl -X POST http://localhost:3001/api/discovery/refresh

# Set primary node
curl -X POST http://localhost:3001/api/discovery/primary/192.168.1.100

Future Enhancements

Potential Improvements

Node Prioritization: Weight-based node selection
Load Balancing: Distribute requests across multiple nodes
Authentication: Secure discovery messages
Metrics: Detailed performance and health metrics
Configuration: Runtime configuration updates
Clustering: Multiple backend instances with shared discovery

Conclusion

The UDP auto discovery implementation replaces manual IP configuration with dynamic node management and automatic failover to available nodes. It includes monitoring endpoints, error handling, and debugging tools. The items under Future Enhancements, such as authenticated discovery messages and load balancing, remain open.
