Inspiration

43 million people worldwide are blind or visually impaired, facing daily navigation challenges that limit their independence. Guide dogs cost $50,000+, canes only detect ground obstacles, and meanwhile smartphones carry powerful cameras that sit unused. We saw an opportunity to transform these devices into intelligent navigation assistants, with a companion mode so families can support their loved ones without compromising that independence.

What it does

NaVox transforms smartphones into AI-powered navigation assistants for blind and visually impaired users. It provides real-time object detection (people, vehicles, doors, tables), smart audio guidance that prioritizes critical alerts, proximity warnings with distance-based intensity, and pathway analysis using depth maps. Users get unique share codes that allow family members to connect as companions with three privacy levels. The entire experience is accessibility-first with full VoiceOver support, audio-centric UI, and haptic feedback.
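The prioritized audio guidance and distance-based proximity intensity described above can be sketched roughly as follows. This is a minimal illustration, not NaVox's actual API: the priority table, class labels, and 5-metre range are assumptions.

```python
import heapq

# Illustrative urgency ranking: lower number = more urgent. The categories are
# assumptions based on the detection classes mentioned above.
PRIORITY = {"vehicle": 0, "person": 1, "door": 2, "table": 3}

def haptic_intensity(distance_m, max_range_m=5.0):
    """Map obstacle distance to a 0..1 intensity: closer obstacles buzz harder."""
    clamped = max(0.0, min(distance_m, max_range_m))
    return round(1.0 - clamped / max_range_m, 2)

class CueQueue:
    """Priority queue of audio cues so critical alerts are spoken first."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker that preserves insertion order

    def push(self, label, distance_m):
        prio = PRIORITY.get(label, len(PRIORITY))  # unknown classes rank last
        heapq.heappush(self._heap, (prio, self._seq, label, distance_m))
        self._seq += 1

    def pop(self):
        _, _, label, distance_m = heapq.heappop(self._heap)
        return label, distance_m
```

With this shape, a vehicle queued after a table still gets announced first, and a wall at 1 m produces a stronger buzz than one at 4 m.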

How we built it

The iOS app uses Swift/SwiftUI with YOLOv8n CoreML for object detection and Google Gemini Vision API for contextual scene understanding every 5-8 seconds. VisionService orchestrates camera capture while specialized services handle audio (AVSpeechSynthesizer with smart grouping), haptics (CoreHaptics with distance-based intensity), and navigation (priority-based cue routing). The authentication system includes beautiful gradient UIs for user and companion login with QR code scanning. Our Python FastAPI backend handles user registration, share code validation, and companion requests, persisting everything to Snowflake with five tables: USERS, COMPANION_RELATIONSHIPS, CONNECTION_REQUESTS, JOURNEYS, and NAVIGATION_EVENTS. We implemented a hybrid AI approach (local YOLO for speed, cloud Gemini for intelligence), privacy-first design with on-device processing, and offline fallback with unique code generation using collision detection.
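The share code generation with collision detection might look like the sketch below. The restricted alphabet and retry count are assumptions; `code_exists` stands in for the Snowflake uniqueness query.

```python
import secrets

# Alphabet excludes easily-confused characters (0/O, 1/I/L) -- consistent with
# the NV-A7B9-C2D4 format, though the exact exclusions are an assumption.
ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ"

def generate_share_code():
    """Return a code like NV-XXXX-XXXX drawn from the restricted alphabet."""
    block = lambda: "".join(secrets.choice(ALPHABET) for _ in range(4))
    return f"NV-{block()}-{block()}"

def assign_unique_code(code_exists, max_retries=5):
    """Regenerate until the code is free; `code_exists` is a hypothetical
    callable wrapping the database lookup."""
    for _ in range(max_retries):
        code = generate_share_code()
        if not code_exists(code):
            return code
    raise RuntimeError("could not find a free share code")
```

Because the code space is large (31^8 combinations), collisions are rare and a small retry budget suffices.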

Challenges we ran into

Audio announcement overload made our first version unusable with constant repetition ("chair right, chair right, chair right"). We built smart grouping that batches detections and a priority queue ensuring vehicles interrupt everything. Running YOLO, depth analysis, haptics, and speech simultaneously caused real-time performance issues and battery drain; we solved this by profiling with Instruments, optimizing threading, and throttling Gemini calls to every 8 seconds. Distance estimation without LiDAR required a hybrid approach: LiDAR when available, Vision framework depth maps as a fallback, and bounding-box estimation as a last resort. Database foreign key constraints failed when companions weren't pre-registered, requiring schema restructuring. iOS-backend model synchronization needed consistent naming conventions and explicit CodingKeys for snake_case/camelCase conversion. Ensuring share code uniqueness at scale required Snowflake queries before assignment, with retry logic for the rare collisions.
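The bounding-box fallback in the distance-estimation hybrid can be approximated with the standard pinhole camera model. This is a sketch under stated assumptions: the focal length and the per-class real-world heights below are illustrative, not calibrated values from the app.

```python
# Rough real-world heights in metres for common detection classes
# (illustrative assumptions; production values would be calibrated per class).
KNOWN_HEIGHTS_M = {"person": 1.7, "chair": 0.9, "door": 2.0}

def estimate_distance(label, bbox_height_px, focal_length_px=1400.0):
    """Pinhole-model fallback when no LiDAR or depth map is available:
    distance ~= (real height * focal length) / apparent height in pixels."""
    real_height = KNOWN_HEIGHTS_M.get(label)
    if real_height is None or bbox_height_px <= 0:
        return None  # unknown class or degenerate box: no estimate possible
    return (real_height * focal_length_px) / bbox_height_px
```

The estimate degrades for partially occluded objects, which is one reason it sits last in the fallback chain behind LiDAR and depth maps.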

Accomplishments that we're proud of

We built a production-ready architecture with authentication, database persistence, error handling, and offline fallback, not just a hackathon demo. Our smart audio UX solves the announcement overload problem plaguing most accessibility apps. The hybrid AI approach successfully balances local YOLO (speed) with cloud Gemini (contextual intelligence). Privacy-respecting companion mode provides granular permissions (3 sharing modes, 4 permission types) so users stay in control. End-to-end integration from iOS to FastAPI to Snowflake works seamlessly. Accessibility-first design with full VoiceOver support, haptic feedback, and an audio-centric UI makes this genuinely usable by our target audience. The unique share code system (NV-A7B9-C2D4 format) avoids confusing characters and supports QR scanning. Clean MVVM architecture with proper service separation keeps the code maintainable and scalable.
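A granular permission check over sharing modes could be shaped like the sketch below. The source mentions 3 sharing modes and 4 permission types without naming them, so every label here is hypothetical.

```python
# Hypothetical mode and permission names -- purely illustrative, since the
# actual 3 modes and 4 permission types are not specified in the writeup.
MODE_PERMISSIONS = {
    "emergency_only": {"receive_alerts"},
    "journeys": {"receive_alerts", "view_location", "view_journey"},
    "full": {"receive_alerts", "view_location", "view_journey", "view_history"},
}

def companion_can(mode, permission):
    """Check whether a companion in the given sharing mode holds a permission;
    unknown modes grant nothing (fail closed)."""
    return permission in MODE_PERMISSIONS.get(mode, set())
```

Keying every check off the mode chosen by the user, and failing closed on anything unrecognized, is what keeps the user (rather than the companion) in control.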

What we learned

We mastered CoreML optimization for mobile, including buffer management, thread prioritization, and memory limits. Accessibility development taught us audio-first mental models that go beyond just adding button labels. Our hybrid AI architecture showed us that cloud isn't always better: local inference can be faster and more private. Snowflake integration covered foreign keys, VARIANT JSON columns, and indexing strategies. Swift Concurrency with async/await and MainActor sharpened our modern Swift skills. Design-wise, we learned that less is more: users need relevant detections at the right time, not everything at once. Timing matters critically in audio UI, where even 2-second delays feel broken. Haptics complement audio by creating spatial awareness without sound. Privacy isn't binary: granular permissions let users choose their comfort level. Most importantly, we understood that independence and safety aren't opposites, that 43 million people struggle with navigation most of us take for granted, and that accessibility features benefit everyone, not just people with disabilities.

What's next for NaVox

Near-term: Beta testing with blind community partners, voice commands via Siri Shortcuts, Apple Watch haptic guidance, and enhanced offline mode. Mid-term: Indoor navigation with ARKit, saved routes with waypoints, multi-language support (Spanish, Mandarin, Hindi, Arabic), and community route sharing. Long-term: Healthcare partnerships for professional training integration, predictive navigation using ML for optimal routes, smart city integration with traffic signals and transit data, sustainability models offering free/subsidized access, and hardware partnerships with smart glasses and wearables. Our vision is making NaVox the standard for accessible navigation worldwide, proving that technology can empower people without compromising independence, privacy, or dignity.
