FleetPulse: Proactive AMR Fleet Command Center
Elevator Pitch
- Real-time AI monitoring for autonomous robots that predicts failures before they happen, visualizes issues on a live map, and alerts operators instantly.
Inspiration
- A robotics deployment engineer in a Singapore warehouse told us robots generate 500MB+ of logs daily, yet failure diagnosis remains manual and slow.
- Engineers still scroll through 90,000+ lines of CSV to spot a single battery voltage spike — futuristic robots, legacy tools.
- In hospitals and airports deploying heterogeneous fleets (cleaning, delivery, security), this blind spot is risky. We built a “Single Pane of Glass” that turns raw, noisy telemetry into an intuitive, real-time, predictive command center.
How We Built It
- Streaming-friendly ingestion and analysis, end-to-end:
- Frontend: Next.js 14 + React 18 + TypeScript; Tailwind CSS; Leaflet.js map; Recharts for trends; WebSocket client for instant updates.
- Backend: FastAPI + Uvicorn; Pydantic models for validation; REST ingestion; WebSocket broadcast; per-robot IsolationForest for anomaly detection; health scoring and RUL estimation.
- Simulation: Python fleet simulator generates realistic telemetry and injects controlled failures to test detection and alerting.
- Notifications: Telegram Bot API for Markdown-formatted critical alerts.
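As an illustration of the notification step, here is a minimal sketch of how a critical alert could be formatted and posted to the Telegram Bot API's `sendMessage` method with `parse_mode` set to `Markdown`. The function names, the message layout, and the example robot ID are assumptions for illustration, not the project's actual code.

```python
import json
import urllib.request


def build_alert(chat_id: str, robot_id: str, health: float, reason: str) -> dict:
    """Build a sendMessage payload using Telegram's legacy Markdown mode.

    Hypothetical message layout. Characters such as '_' or '*' in robot_id
    or reason would need escaping before sending.
    """
    text = (
        f"*CRITICAL: robot {robot_id}*\n"
        f"Health: `{health:.0f}/100`\n"
        f"Reason: {reason}"
    )
    return {"chat_id": chat_id, "text": text, "parse_mode": "Markdown"}


def send_alert(token: str, payload: dict) -> int:
    """POST the payload to the Bot API and return the HTTP status code."""
    req = urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

Keeping the payload builder separate from the network call makes the formatting testable without a bot token.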
Streaming ETL Concepts
- We treat telemetry as streams, not static files. Data is processed in chunks, with rolling windows for normalization and trend analysis.
- Z-Score Normalization (for sensor noise filtering):
$$ z = \frac{x - \mu}{\sigma} $$
- \(x\): current reading; \(\mu\): rolling mean (e.g., last 60 s); \(\sigma\): rolling standard deviation. If \(|z| > 3\), flag a critical anomaly.
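The rolling z-score above can be sketched in a few lines of standard-library Python. This is a minimal sketch, assuming one reading per second (so a 60-sample window approximates the last 60 s) and a sample standard deviation; the class name `RollingZScore` is hypothetical.

```python
from collections import deque
from math import sqrt


class RollingZScore:
    """Rolling z-score over the most recent `window` readings."""

    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def update(self, x: float) -> tuple[float, bool]:
        """Score x against the window before adding it, then store it."""
        n = len(self.values)
        if n < 2:
            self.values.append(x)
            return 0.0, False  # not enough history to estimate sigma
        mean = sum(self.values) / n
        var = sum((v - mean) ** 2 for v in self.values) / (n - 1)
        std = sqrt(var)
        # A perfectly flat window has sigma = 0; z is reported as 0 in that case.
        z = 0.0 if std == 0 else (x - mean) / std
        self.values.append(x)
        return z, abs(z) > self.threshold
```

Scoring each reading against the window before appending it keeps a spike from inflating the mean and standard deviation it is judged against.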
Architecture Overview
- Client → Presentation → Intelligence → Simulation → Notification
- Flow:
- Simulator → FastAPI (/telemetry) → Health/ML → WebSocket broadcast → Next.js dashboard
- Critical conditions → Telegram alert
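The broadcast step in this flow can be illustrated with an in-process fan-out. This is a minimal sketch, assuming one `asyncio.Queue` per connected dashboard as a stand-in for a WebSocket connection; the `Broadcaster` class and the event fields are hypothetical.

```python
import asyncio


class Broadcaster:
    """Fan out each telemetry event to every subscribed client queue."""

    def __init__(self, max_pending: int = 100):
        self.subscribers: set[asyncio.Queue] = set()
        self.max_pending = max_pending

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue(maxsize=self.max_pending)
        self.subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self.subscribers.discard(q)

    def publish(self, event: dict) -> None:
        for q in self.subscribers:
            if q.full():
                q.get_nowait()  # drop the oldest event for a slow client
            q.put_nowait(event)


async def demo() -> list[dict]:
    """Publish one event and read it back from two subscribers."""
    hub = Broadcaster()
    a, b = hub.subscribe(), hub.subscribe()
    hub.publish({"robot_id": "amr-01", "battery": 87.5})
    return [await a.get(), await b.get()]
```

In a real deployment each queue would be drained by the task that owns a WebSocket connection. Bounding the queue means a stalled browser tab loses old events instead of growing server memory.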
Data Engineering and ML
- Ingestion:
- Validates telemetry and maintains short rolling histories per robot.
- Normalizes signals and aggregates for efficient visualization.
- Anomaly detection:
- IsolationForest (unsupervised) per robot on multivariate features (battery, temperature, CPU, velocity), producing an anomaly score and risk level.
- Health scoring:
- Compresses multivariate signals into one actionable metric and color-coded status.
- RUL (Remaining Useful Life):
- Estimates time-to-critical threshold from recent trend slopes (e.g., battery decay).
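The RUL estimate from a trend slope can be sketched with an ordinary least-squares fit over recent samples. This is a minimal sketch, assuming samples are `(timestamp_seconds, battery_percent)` pairs; the function names and the critical threshold are illustrative assumptions.

```python
def trend_slope(samples: list[tuple[float, float]]) -> float:
    """Ordinary least-squares slope of value over time (units per second)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


def rul_hours(samples: list[tuple[float, float]], critical: float):
    """Hours until the trend reaches `critical`; None if the signal is not decaying."""
    slope = trend_slope(samples)
    if slope >= 0:
        return None
    latest = samples[-1][1]
    if latest <= critical:
        return 0.0
    return (latest - critical) / abs(slope) / 3600.0
```

For example, a battery dropping one percentage point every 10 minutes from 77% toward a 20% threshold gives roughly 9.5 hours.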
Challenges We Faced
- Real-time consistency and concurrency:
- Coordinating REST ingestion and WebSocket broadcasting without race conditions or stale views.
- Pattern: clear state boundaries, short-lived buffers, and event-driven broadcasts.
- Browser performance with high-frequency data:
- Rendering thousands of points can stutter.
- Solution: server-side aggregation and event-level summaries; selective UI updates.
- Environment compatibility:
- Installing binary wheels for NumPy, SciPy, and scikit-learn on Windows required careful pinning of Python and package versions.
- Alert fatigue vs responsiveness:
- Thresholds and hysteresis reduce noisy alerts while keeping operators informed.
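The hysteresis idea can be sketched as a two-threshold state machine: an alert is raised when health falls below one level and only cleared once it recovers above a higher level. This is a minimal sketch; the class name and the threshold values (40 and 55) are assumptions for illustration.

```python
from typing import Optional


class HysteresisAlert:
    """Raise below `trip_below`, clear only above `clear_above`."""

    def __init__(self, trip_below: float = 40.0, clear_above: float = 55.0):
        if clear_above <= trip_below:
            raise ValueError("clear_above must be greater than trip_below")
        self.trip_below = trip_below
        self.clear_above = clear_above
        self.active = False

    def update(self, health: float) -> Optional[str]:
        """Return 'raise' or 'clear' on a state change, otherwise None."""
        if not self.active and health < self.trip_below:
            self.active = True
            return "raise"
        if self.active and health > self.clear_above:
            self.active = False
            return "clear"
        return None
```

A health score that hovers around 40 produces one notification instead of one per crossing, because values between the two thresholds never change the state.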
ROS 2 / Open-RMF Roadmap
- The current backend is FastAPI with simulated telemetry; the roadmap adds ROS 2 ingestion via rclpy, subscribing to rmf_fleet_msgs FleetState messages.
- Bridge pattern:
- ROS thread pushes messages to a thread-safe queue.
- API/alerting workers consume from the queue without blocking.
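The bridge pattern can be sketched with the standard library alone. This is a minimal sketch, assuming a plain `dict` as a stand-in for a FleetState message and a hypothetical `on_fleet_state` callback; no rclpy calls are shown because that integration is still on the roadmap.

```python
import queue
import threading

# Bounded queue shared between the producer (ROS side) and consumer workers.
telemetry_q: "queue.Queue[dict]" = queue.Queue(maxsize=1000)


def on_fleet_state(msg: dict) -> None:
    """Stand-in for a subscription callback; never blocks the producer thread."""
    try:
        telemetry_q.put_nowait(msg)
    except queue.Full:
        pass  # drop the message rather than stall the producer


def drain(q: "queue.Queue[dict]", handle, stop: threading.Event) -> None:
    """Consume messages until `stop` is set and the queue is empty."""
    while not stop.is_set() or not q.empty():
        try:
            msg = q.get(timeout=0.1)
        except queue.Empty:
            continue
        handle(msg)
```

The consumer waits on the queue in its own thread, so slow alerting or API work never delays the producer. Dropping on overflow is one possible policy; blocking with a timeout is another.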
What We Learned
- Open-RMF interoperability:
- Standardizing on rmf_fleet_msgs enables vendor-agnostic monitoring across facilities.
- UX as a safety feature:
- Clear health bars and pulsing alerts reduce an operator's cognitive load under stress far more than raw error codes do.
- Data engineering > model complexity:
- Robust pipelines (validation, normalization, aggregation) matter more than complex models for reliability.
Math Notes (LaTeX)
- IsolationForest risk mapping:
$$ r = \sigma\!\left(\beta \, (s_{\text{thr}} - s(x))\right), \quad \sigma(z) = \frac{1}{1 + e^{-z}} $$
- Health Score (illustrative):
$$ H = 100 - \alpha_b \, \phi(b) - \alpha_T \, \phi(T) - \alpha_C \, \phi(C) - \alpha_v \, \phi(|v - \bar{v}|) $$
- Battery RUL:
$$ \text{RUL}_{\text{hours}} \approx \frac{b - b_{\text{crit}}}{\left|\frac{db}{dt}\right|} \quad \text{for } \frac{db}{dt} < 0 $$
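The risk mapping and health score formulas can be made concrete as follows. This is a minimal sketch under stated assumptions: the anomaly score follows the convention where lower means more anomalous (as with scikit-learn's `IsolationForest.decision_function`), \(\phi\) is taken to be a clipped linear ramp, and every weight and threshold below is illustrative rather than the project's tuned value.

```python
import math


def risk(score: float, threshold: float = 0.0, beta: float = 10.0) -> float:
    """r = sigmoid(beta * (threshold - score)); lower scores give higher risk."""
    z = beta * (threshold - score)
    if z >= 0:  # numerically stable logistic for large |z|
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def ramp(value: float, ok: float, bad: float) -> float:
    """Assumed phi: penalty 0 at or better than `ok`, 1 at or beyond `bad`."""
    frac = (value - ok) / (bad - ok)
    return min(1.0, max(0.0, frac))


def health_score(battery: float, temp_c: float, cpu: float,
                 speed: float, mean_speed: float) -> float:
    """H = 100 minus weighted penalties; weights (40, 25, 15, 20) are assumptions."""
    penalty = (
        40.0 * ramp(battery, ok=50.0, bad=10.0)
        + 25.0 * ramp(temp_c, ok=45.0, bad=75.0)
        + 15.0 * ramp(cpu, ok=70.0, bad=100.0)
        + 20.0 * ramp(abs(speed - mean_speed), ok=0.2, bad=1.0)
    )
    return max(0.0, 100.0 - penalty)
```

Because the weights sum to 100 and each penalty is clipped to [0, 1], the score stays within 0 to 100. A score exactly at the threshold maps to a risk of 0.5.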
Built With
- Languages
- TypeScript, JavaScript, Python
- Frontend
- Next.js 14, React 18, Tailwind CSS, Leaflet.js, Recharts
- Backend
- FastAPI, Uvicorn, Pydantic
- ML
- Scikit-Learn (IsolationForest), NumPy, SciPy, Joblib, Threadpoolctl
- Realtime & Transport
- WebSocket, REST API
- Notifications
- Telegram Bot API (Markdown alerts)
- Platforms
- Local development on Windows; dashboard at http://localhost:3000/ and API at http://localhost:8000/
- Data & Storage
- In-memory rolling buffers for demo; roadmap: Nginx/HAProxy (load balancing)
Why It Matters
- Demonstrates full-stack engineering: frontend, backend, ML, and real-time operations.
- Highly demoable: green-to-red map, health drop, instant phone alert.
- Practical value: reduces downtime, speeds triage, supports predictive maintenance.
- Clear path to scale: auth/TLS, distributed state, historical analytics, Open-RMF integration.
Quick Start (Local)
- Backend: run FastAPI
  python -m uvicorn backend:app --host 0.0.0.0 --port 8000
- Frontend: run Next.js, then open http://localhost:3000/
  npm run dev
- Optional simulation: stream telemetry and trigger live updates
  python sim_fleet.py