Overview
WakeMate is an AI-powered, real-time drowsiness detection system designed to prevent fatigue-related accidents before they happen. Using just a webcam, WakeMate leverages computer vision and a custom-trained deep learning model to track a driver's facial activity and detect signs of drowsiness with high accuracy. But WakeMate goes beyond detection: when it senses you're falling asleep, it initiates a conversational alert system powered by Gemini and ElevenLabs, speaking directly to the driver with friendly, context-aware suggestions such as taking a break, grabbing a coffee, or playing music. It's lightweight, privacy-conscious, and doesn't require any extra hardware. Whether you're a long-haul trucker, rideshare driver, or everyday commuter, WakeMate acts as your intelligent co-pilot, keeping you alert, engaged, and safe on the road.
Inspiration
Driver fatigue is a silent killer on the road, contributing to thousands of accidents every year. We wanted to create a solution that doesn't just detect drowsiness but actively intervenes in a helpful, human-like way, something that feels more like a co-pilot than a tool. WakeMate was born from the idea of combining real-time computer vision with conversational AI to keep drivers alert, engaged, and ultimately, safe.
How We Built It
We started by training a Convolutional Neural Network (CNN) using eye state data from Kaggle to classify open vs. closed eyes.
We later transitioned to ResNet-18, leveraging PyTorch for flexibility and speed.
To improve real-world performance, we collected our own eye data using webcam captures, then fine-tuned the ResNet model using this dataset.
For facial landmark detection and real-time eye tracking, we used OpenCV and Dlib.
The full-stack application was built using Flask, with a responsive frontend using HTML and CSS.
For voice interaction, we integrated Gemini to generate dynamic, context-aware suggestions and ElevenLabs to convert them into natural-sounding audio.
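As a concrete illustration of the eye-tracking step: Dlib's standard 68-point face model places landmarks 36-41 and 42-47 on the eyes, and a typical pipeline crops a padded box around those points before handing the patch to the classifier. A minimal sketch of that cropping geometry, using hypothetical pixel coordinates in place of real Dlib output:

```python
import numpy as np

def eye_crop_box(landmarks, pad=0.4):
    """Bounding box around eye landmarks, expanded by `pad` on each side."""
    pts = np.asarray(landmarks, dtype=float)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    w, h = x1 - x0, y1 - y0
    # Expand the tight landmark box so the crop keeps eyelid context.
    return (int(x0 - pad * w), int(y0 - pad * h),
            int(x1 + pad * w), int(y1 + pad * h))

# Hypothetical left-eye landmarks (Dlib indices 36-41) in pixel coordinates.
left_eye = [(210, 150), (220, 144), (232, 144), (242, 150), (232, 154), (220, 154)]
print(eye_crop_box(left_eye))  # (197, 140, 254, 158)
```

In a real frame the six points would come from `dlib.shape_predictor` output, and the returned box would be used to slice the OpenCV image array before resizing for the model.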
Challenges We Ran Into
Early on, our model struggled to correctly differentiate between open and closed eyes, especially under varying lighting conditions and different eye shapes. This inconsistency in predictions significantly affected real-time performance. After several iterations, we decided to collect and label our own custom dataset using webcam input. Fine-tuning the existing ResNet-18 architecture on this personalized data led to a dramatic improvement in accuracy and reliability, ultimately enabling WakeMate to perform well across diverse conditions.
Accomplishments That We're Proud Of
We're incredibly proud of how much we achieved in just 24 hours. This was the first time any of us had built a full-stack web application, let alone one integrated with a fine-tuned AI model, real-time video analysis, and voice-based conversational features. Every team member contributed to building something complete, functional, and meaningful. Seeing WakeMate go from idea to working prototype was a huge milestone.
What We Learned
How to train and fine-tune deep learning models using PyTorch
Real-time facial landmark tracking and eye detection using OpenCV and Dlib
How to collect and preprocess custom datasets for model improvement
Integrating APIs like Gemini and ElevenLabs to deliver voice feedback
Full-stack development using Flask, HTML/CSS, and local model inference
Working collaboratively under time constraints to bring a vision to life
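The Flask-plus-local-inference pattern from the list above can be sketched as a single prediction endpoint. The route name and the stub classifier are assumptions for illustration; in the real app the stub would be replaced by the fine-tuned ResNet-18 forward pass.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify_eye(image_bytes):
    # Stub standing in for the fine-tuned model; a real handler would
    # decode the frame, crop the eye regions, and run inference here.
    return {"state": "open", "confidence": 0.97}

@app.route("/predict", methods=["POST"])
def predict():
    frame = request.get_data()  # raw image bytes posted from the webcam page
    return jsonify(classify_eye(frame))

# Serve locally with `flask run` (or app.run(debug=True) in a script).
```

The frontend would poll this endpoint with captured frames, and trigger the Gemini/ElevenLabs voice flow when the response indicates sustained eye closure.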
What's Next for WakeMate
Deploy as a desktop app or browser extension for widespread accessibility
Add voice input detection (e.g., wake words or conversation flow)
Expand to include fatigue pattern analytics over time
Support for multi-language voice feedback
Collaborate with automotive APIs or hardware for real-world integrations
Further fine-tune the model on a larger, more diverse dataset to boost robustness