Project Story: Smart Audio Navigation System

Inspiration:

The inspiration for this project came from observing the challenges faced by visually impaired individuals navigating unfamiliar spaces. While developing assistive technologies, we noticed that many existing solutions rely heavily on expensive hardware or complex setups that aren't easily accessible or customizable.

We envisioned a system that could:

  • Provide real-time auditory feedback based on proximity to objects
  • Be highly customizable with different audio cues for different users
  • Leverage modern web technologies for an intuitive configuration interface
  • Use affordable, off-the-shelf components that anyone could assemble

Our goal was to create a proof-of-concept that demonstrates how embedded systems, web technologies, and AI can converge to create accessible, user-friendly assistive devices.

What We Learned:

Hardware

  • Integrating an ultrasonic distance sensor, an RFID reader, and a servo motor on the ESP32 and coordinating them in real time
  • Optimizing audio playback on constrained hardware using the ESP32's DAC (Digital-to-Analog Converter)
  • Using the Web Serial API for direct browser-to-hardware communication

Audio Processing

  • Converting various audio formats (MP3, WAV) to raw PCM data suitable for ESP32 DAC playback
  • Generating .h header files with audio sample arrays for embedded C/C++ compilation
  • Balancing audio quality with memory constraints (downsampling, bit depth reduction)

Web Development

  • Building a responsive dashboard using Next.js and TypeScript
  • Integrating the ElevenLabs API for AI-generated voice and sound effects
  • Creating an intuitive UI with distance-based audio configuration

Soft Skills

  • We learned to break down a complex multi-disciplinary project into manageable components: hardware assembly, firmware development, audio processing pipeline, web interface, and cloud storage.
  • Troubleshooting issues that spanned hardware wiring, Arduino C++ code, TypeScript web applications, and real-time serial communication taught us systematic debugging approaches.

Architecture Overview

Our system consists of four main components:

  1. ESP32 Firmware (Arduino)
  2. Web Dashboard (Next.js)
  3. Audio Processing Pipeline
  4. Cloud Storage (MongoDB)

Hardware Setup

  • ESP32 Development Board (brain of the system)
  • HC-SR04 Ultrasonic Sensor (distance measurement)
  • MFRC522 RFID Reader (user identification)
  • SG90 Servo Motor (sweeping sensor motion)
  • PAM8403 Audio Amplifier + Speaker (audio output)
  • SSD1306 OLED Display (status feedback)

Wiring & Integration: We carefully connected all components to the ESP32, mapping GPIO pins for the ultrasonic sensor (Trig: 33, Echo: 32), RFID (SPI interface), servo (PWM on pin 25), and DAC audio output (GPIO 26).

Firmware Development

The ESP32 firmware handles:

  • RFID card scanning for user identification
  • Ultrasonic distance measurement (0-250cm range)
  • Servo-controlled sensor sweeping
  • Distance-based audio trigger selection
  • DAC audio playback from SPIFFS
  • Serial communication protocol with web dashboard
  • Persistent settings storage using Preferences API

Key Challenges Solved:

  • Non-blocking sensor reads to prevent audio stuttering
  • Efficient memory management for audio buffers
  • Robust serial protocol for settings transfer
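
One common way to make a settings transfer over serial robust is to frame each message with a length and a checksum, so the receiver can reject truncated or corrupted data. The sketch below shows that idea from the dashboard side. It is not the project's actual protocol: the start byte 0x7e, the 16-bit length field, the XOR checksum, the 64-byte chunk size, and the ByteWriter interface are all assumptions made for illustration. In a browser, the writer would typically come from a Web Serial port via port.writable.getWriter().

```typescript
// Hypothetical frame layout (an assumption, not the project's real protocol):
// [0x7e][length high byte][length low byte][payload bytes...][XOR checksum of payload]
const START_BYTE = 0x7e;

// Minimal writer shape; a Web Serial stream writer for Uint8Array satisfies it.
interface ByteWriter {
  write(chunk: Uint8Array): Promise<void>;
}

// XOR of all payload bytes; cheap for a microcontroller to verify.
function xorChecksum(payload: Uint8Array): number {
  let sum = 0;
  for (let i = 0; i < payload.length; i++) {
    sum ^= payload[i];
  }
  return sum;
}

// Wrap a payload in start byte, 16-bit big-endian length, and trailing checksum.
function buildFrame(payload: Uint8Array): Uint8Array {
  if (payload.length > 0xffff) {
    throw new Error("payload too large for a 16-bit length field");
  }
  const frame = new Uint8Array(payload.length + 4);
  frame[0] = START_BYTE;
  frame[1] = (payload.length >> 8) & 0xff;
  frame[2] = payload.length & 0xff;
  frame.set(payload, 3);
  frame[frame.length - 1] = xorChecksum(payload);
  return frame;
}

// Send the frame in small chunks so the receiver's serial buffer is not overrun.
// Returns the number of chunks written.
async function sendFrame(
  writer: ByteWriter,
  payload: Uint8Array,
  chunkSize = 64 // assumed chunk size
): Promise<number> {
  const frame = buildFrame(payload);
  let chunks = 0;
  for (let offset = 0; offset < frame.length; offset += chunkSize) {
    await writer.write(frame.subarray(offset, offset + chunkSize));
    chunks++;
  }
  return chunks;
}
```

On the receiving side, the firmware would read the length, collect that many bytes, recompute the XOR, and discard the frame on a mismatch.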

Web Dashboard

Built with modern web technologies:

Tech Stack:

  • Next.js 14 (App Router)
  • TypeScript
  • Tailwind CSS + shadcn/ui components
  • Web Serial API for ESP32 communication
  • ElevenLabs API integration

Features:

  • Custom Audio Upload (MP3/WAV files)
  • Tone Generator (custom frequency buzzer sounds)
  • AI Audio Generation (TTS and sound effects via ElevenLabs)
  • Visual Distance Range Configuration (0-250cm)
  • Direct ESP32 Connection via USB
  • Cloud Settings Storage per User
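
As an illustration of the tone generator feature, a buzzer-style tone can be produced by sampling a sine wave and mapping it onto unsigned 8-bit values. This is a minimal sketch under stated assumptions: the function name generateTone, the default amplitude of 0.8, and the default 16 kHz sample rate (chosen to match the rate named in this write-up) are illustrative, not taken from the project's code.

```typescript
// Generate an unsigned 8-bit sine tone. 128 is the midpoint (silence).
function generateTone(
  frequencyHz: number,
  durationMs: number,
  sampleRate = 16000, // assumed default, matching the 16 kHz rate in this write-up
  amplitude = 0.8 // fraction of full scale; assumed default
): Uint8Array {
  const sampleCount = Math.floor((durationMs * sampleRate) / 1000);
  const samples = new Uint8Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    const phase = (2 * Math.PI * frequencyHz * i) / sampleRate;
    const value = Math.sin(phase) * amplitude;
    // Scale to 0..255 around the midpoint and clamp in case amplitude exceeds 1.
    const scaled = Math.round(128 + value * 127);
    samples[i] = Math.max(0, Math.min(255, scaled));
  }
  return samples;
}

// Example: a 1 kHz beep lasting 100 ms.
const beep = generateTone(1000, 100);
```

Higher frequencies need more samples per second to be reproduced; at 16 kHz, tones above 8 kHz cannot be represented.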

Audio Processing Pipeline

  1. User uploads MP3/WAV or generates custom tone
  2. Decode audio to PCM samples
  3. Downsample to 16kHz (the rate we targeted for ESP32 DAC playback)
  4. Convert to 8-bit unsigned format
  5. Generate C header files with sample arrays
  6. Transfer to ESP32 via Web Serial
  7. Store in SPIFFS filesystem
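
Steps 3 to 5 can be sketched as three small functions. This is a simplified illustration, not the project's pipeline code. It assumes the decoded audio is already a mono Float32Array with samples in the range -1 to 1, it resamples by linear interpolation without an anti-aliasing filter, and the generated array names and header layout are hypothetical.

```typescript
// Step 3: resample by linear interpolation (no anti-aliasing filter; a simplification).
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}

// Step 4: map floats in [-1, 1] to unsigned 8-bit values (0..255, midpoint 128).
function toUnsigned8Bit(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    out[i] = Math.round((clamped + 1) * 127.5);
  }
  return out;
}

// Step 5: emit a C header with the sample array. Names and layout are assumptions.
function toCHeader(name: string, data: Uint8Array, sampleRate: number): string {
  const rows: string[] = [];
  for (let i = 0; i < data.length; i += 12) {
    rows.push("  " + Array.from(data.subarray(i, i + 12)).join(", "));
  }
  return [
    `// Generated audio data: ${data.length} samples at ${sampleRate} Hz, 8-bit unsigned`,
    `const unsigned int ${name}_len = ${data.length};`,
    `const unsigned int ${name}_rate = ${sampleRate};`,
    `const unsigned char ${name}[] = {`,
    rows.join(",\n"),
    "};",
    "",
  ].join("\n");
}

// Example: a tiny 32 kHz clip reduced to 16 kHz, 8-bit.
const source = new Float32Array([0, 0.5, 1, 0.5, 0, -0.5, -1, -0.5]);
const pcm8 = toUnsigned8Bit(resampleLinear(source, 32000, 16000));
const header = toCHeader("beep", pcm8, 16000);
```

In a browser, the Float32Array input would typically come from AudioContext.decodeAudioData followed by getChannelData(0) on the resulting AudioBuffer. A production resampler would low-pass filter before dropping samples to avoid aliasing.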

Distance-Based Triggering

Users configure zones (e.g., 0-50cm, 50-100cm, 100-200cm) and assign different audio cues to each. The ESP32:

  1. Continuously measures distance
  2. Checks which zone the measurement falls into
  3. Plays the corresponding audio file
  4. Switches seamlessly when crossing zone boundaries
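
The zone lookup and boundary switching can be sketched as follows. This is an illustrative model written in TypeScript, not the firmware itself, which runs as Arduino C++ on the ESP32. The Zone shape, the file names, and the choice of an inclusive lower bound with an exclusive upper bound are assumptions.

```typescript
interface Zone {
  minCm: number; // inclusive lower bound (assumed convention)
  maxCm: number; // exclusive upper bound (assumed convention)
  audioFile: string; // hypothetical file name
}

// Return the first zone containing the distance, or null if none matches.
function findZone(zones: Zone[], distanceCm: number): Zone | null {
  for (const zone of zones) {
    if (distanceCm >= zone.minCm && distanceCm < zone.maxCm) {
      return zone;
    }
  }
  return null;
}

interface ZoneUpdate {
  changed: boolean; // true only when the active zone differs from the previous reading
  audioFile: string | null; // clip for the active zone, or null outside all zones
}

// Remember the active zone so the caller switches clips only at a boundary
// instead of restarting playback on every reading.
function createZoneTracker(zones: Zone[]): (distanceCm: number) => ZoneUpdate {
  let current: Zone | null = null;
  return (distanceCm: number): ZoneUpdate => {
    const next = findZone(zones, distanceCm);
    const changed = next !== current;
    current = next;
    return { changed, audioFile: next ? next.audioFile : null };
  };
}

// Example configuration using the zones mentioned above.
const exampleZones: Zone[] = [
  { minCm: 0, maxCm: 50, audioFile: "near.raw" },
  { minCm: 50, maxCm: 100, audioFile: "mid.raw" },
  { minCm: 100, maxCm: 200, audioFile: "far.raw" },
];
```

With half-open ranges, a reading of exactly 50cm belongs to the 50-100cm zone, so adjacent zones never both match.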

Real-Time Visualization

The web dashboard includes a live preview mode that:

  • Animates a slider moving from 0 to 250cm
  • Plays the configured audio for each zone as the slider passes through
  • Provides immediate feedback on audio assignments
  • Keeps audio playback in sync with the slider's visual position

Challenges We Faced

1. Audio Playback Quality

Initial audio playback was choppy and distorted because blocking operations stalled the playback loop. To fix this, we implemented non-blocking distance measurement with timeouts and tuned the DAC write timing for smooth playback.
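
The timeout for a distance read comes down to simple arithmetic: the echo pulse covers the round trip, so the longest echo worth waiting for is twice the maximum range divided by the speed of sound. The sketch below works this out in TypeScript to match the dashboard's language; the firmware equivalent would be Arduino C++. The speed-of-sound constant assumes air at roughly 20 °C, and the function names are hypothetical.

```typescript
// Speed of sound in air at roughly 20 °C, in centimetres per microsecond (assumed).
const SOUND_CM_PER_US = 0.0343;

// Convert an echo pulse width to a one-way distance. The pulse spans the round trip.
function echoToDistanceCm(echoMicros: number): number {
  return (echoMicros * SOUND_CM_PER_US) / 2;
}

// Longest echo worth waiting for at a given maximum range, rounded up.
function echoTimeoutMicros(maxRangeCm: number): number {
  return Math.ceil((2 * maxRangeCm) / SOUND_CM_PER_US);
}

// For the 250cm range used in this project: about 14.6 ms per reading at most.
const timeoutFor250cm = echoTimeoutMicros(250); // 14578
```

Capping the wait at this value bounds how long a single reading can delay the next DAC write when no echo returns.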

2. Memory Constraints

The ESP32 has limited RAM (~520KB) and limited SPIFFS storage, but audio files can be large. To fit, we downsampled all audio to 16kHz mono, reduced the bit depth to 8 bits, and streamed audio from SPIFFS instead of loading whole files into RAM.
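
To see why the reduction matters, the size of an uncompressed PCM clip is duration × sample rate × channels × bytes per sample. The sketch below runs the numbers for a hypothetical 3-second clip; the clip length and the 1 MB storage budget are assumptions chosen for illustration, not measurements from the project.

```typescript
// Bytes needed for an uncompressed PCM clip.
function pcmBytes(
  durationSec: number,
  sampleRate: number,
  bitsPerSample: number,
  channels: number
): number {
  return Math.ceil(durationSec * sampleRate * channels * (bitsPerSample / 8));
}

// Seconds of audio that fit in a given storage budget.
function secondsThatFit(
  budgetBytes: number,
  sampleRate: number,
  bitsPerSample: number,
  channels: number
): number {
  return budgetBytes / (sampleRate * channels * (bitsPerSample / 8));
}

// A hypothetical 3-second clip at CD quality versus the reduced format.
const cdQuality = pcmBytes(3, 44100, 16, 2); // 529200 bytes
const reduced = pcmBytes(3, 16000, 8, 1); // 48000 bytes

// With an assumed 1 MB storage budget, about 65 seconds of reduced audio fit.
const fitSeconds = secondsThatFit(1024 * 1024, 16000, 8, 1); // 65.536
```

At 16kHz, 8-bit mono, each second of audio costs 16,000 bytes, roughly an 11x reduction from 44.1kHz 16-bit stereo.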

3. Cross-Domain Debugging

Bugs could originate from the hardware, the firmware, the web app, or the interfaces between them. We added extensive serial debug logging and adopted a systematic approach: test each component in isolation, then test them integrated.

4. Saving User Settings

Each user needs their own custom audio and range settings available on the ESP32. We implemented RFID-based user identification and stored per-user settings both in the cloud (MongoDB) and locally on the device.
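
A per-user settings record keyed by the RFID card's UID might look like the sketch below. Everything here is an assumption made for illustration: the field names, the hex-string UID format, the validation rules, and especially the "newest copy wins" rule for reconciling the local and cloud copies, which is one possible strategy rather than the project's documented behavior.

```typescript
interface ZoneSetting {
  minCm: number;
  maxCm: number;
  audioFile: string; // hypothetical file name
}

interface UserSettings {
  rfidUid: string; // card UID as a hex string (assumed format)
  zones: ZoneSetting[];
  updatedAt: number; // Unix timestamp in milliseconds
}

// Check that each zone lies within 0..maxRangeCm and that zones do not overlap.
// Returns a list of human-readable problems; an empty list means valid.
function validateZones(zones: ZoneSetting[], maxRangeCm = 250): string[] {
  const errors: string[] = [];
  const sorted = zones.slice().sort((a, b) => a.minCm - b.minCm);
  let prev: ZoneSetting | null = null;
  for (const zone of sorted) {
    if (zone.minCm < 0 || zone.maxCm > maxRangeCm || zone.minCm >= zone.maxCm) {
      errors.push(`invalid range ${zone.minCm}-${zone.maxCm}`);
    }
    if (prev !== null && zone.minCm < prev.maxCm) {
      errors.push(`zone ${zone.minCm}-${zone.maxCm} overlaps ${prev.minCm}-${prev.maxCm}`);
    }
    prev = zone;
  }
  return errors;
}

// Reconcile the local and cloud copies: the most recently updated one wins (assumed rule).
function pickNewest(
  local: UserSettings | null,
  cloud: UserSettings | null
): UserSettings | null {
  if (local === null) return cloud;
  if (cloud === null) return local;
  return cloud.updatedAt > local.updatedAt ? cloud : local;
}
```

Validating zones in the dashboard before transfer keeps malformed ranges from ever reaching the device.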

Future Improvements

  • Multiple sensor support for 360° coverage
  • Machine learning for obstacle classification
  • Bluetooth audio output support
  • Mobile app companion
  • Battery power optimization
  • 3D-printed enclosure design
  • Integration with smartphone GPS for outdoor navigation
