Inspiration
As young adults stepping into our fitness journeys, we encountered a major problem. Although we were all motivated to start working out and lifting weights, we knew that doing so without proper form could cause injuries, deterring us from improving our physical health. Searching online, we were bombarded with conflicting advice, and hiring a personal trainer is expensive. After discussing the issue with our peers, we identified it as a significant barrier for people of all demographics, old and young.
Inspired by the Cycle of Life track’s mission to improve a step in a person’s life journey, we developed a neural-network-based interface to improve users’ workout form and provide live feedback. Our app helps prevent injuries, gives users confidence in the exercises they are doing, and encourages them to become the best versions of themselves and live healthy lives.
What it does
Motion Mentor operates by analyzing a live camera feed of users performing exercises, as shown in our presentation video. During the recording, our program employs a MediaPipe pose detection model, seamlessly integrated into our framework. This model extracts precise 3D pose coordinates, capturing the various joints on the user's body.
To evaluate form, we defined quantitative checks on specific joint angles and body proportions. For instance, if the ratio of the distance between the user's feet to the distance between their hips is too small, Motion Mentor promptly notifies the user to adjust their stance, preventing potential strain on their knees.
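A check like the feet-to-hip ratio above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the landmark arguments and the threshold value are assumptions, not Motion Mentor's exact implementation.

```python
import numpy as np

# Assumed threshold: flag the stance if the feet are noticeably
# closer together than the hips (value chosen for illustration).
MIN_FEET_TO_HIP_RATIO = 0.9

def stance_too_narrow(left_foot, right_foot, left_hip, right_hip,
                      min_ratio=MIN_FEET_TO_HIP_RATIO):
    """Return True if the feet-width to hip-width ratio falls below min_ratio."""
    feet_width = np.linalg.norm(np.asarray(left_foot) - np.asarray(right_foot))
    hip_width = np.linalg.norm(np.asarray(left_hip) - np.asarray(right_hip))
    return feet_width / hip_width < min_ratio

# Feet closer together than the hips -> flagged as too narrow.
print(stance_too_narrow([-0.10, 0, 0], [0.10, 0, 0],
                        [-0.15, 0.9, 0], [0.15, 0.9, 0]))  # True
```

The same pattern generalizes to any proportion-based check: compute two distances from pose landmarks, take their ratio, and compare against a recorded proper-form range.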
Our demo video showcases the effectiveness of Motion Mentor by illustrating four common form mistakes. The application actively detects these errors and provides live feedback, offering users an interactive experience to correct and enhance their exercise techniques. This real-time guidance aims not only to prevent injuries but also to elevate overall performance, making Motion Mentor an invaluable tool in your fitness journey.
How we built it
This application was developed in a Python Jupyter notebook environment, using the following technologies:
OpenCV for Live Video Feeds: We utilized the OpenCV module to collect live video feeds, enabling the application to capture and process video data in real time.
MediaPipe API for Pose Detection: The MediaPipe API was configured to cater to our specific use case, enabling precise pose detection. Frame by frame, the application feeds data into the MediaPipe API, which runs the pose detection model and outputs precise coordinates.
Data Processing with NumPy: Using NumPy vector functions, such as the dot product, we calculated distances between points and angles between them. The dot-product formula facilitated the precise calculation of angles between specific points in the user's pose.
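The dot-product angle calculation mentioned above can be written as a small helper, using the identity cos θ = (a·b) / (|a||b|). The function below is a sketch of this standard computation; the specific landmark triple in the example is hypothetical.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by points a-b-c, via the dot product."""
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    # Clip guards against floating-point values slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# e.g. a hip-knee-ankle triple bent at a right angle:
print(joint_angle([0, 1, 0], [0, 0, 0], [1, 0, 0]))  # 90.0
```

Because the inputs are 3D vectors, the same function works directly on MediaPipe's world landmarks regardless of camera angle.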
Logic Implementation for Mistake Identification: We implemented logic that identifies user mistakes by comparing the calculated angles and measurements against recorded value ranges indicative of proper form. The application updates constantly in real time, displaying the relevant angles and measurements during live testing.
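The range-checking logic can be sketched as a lookup from measurement names to proper-form ranges. The names, ranges, and feedback strings below are assumptions for illustration, not the exact values Motion Mentor records.

```python
# Hypothetical proper-form ranges (low, high) per measurement.
PROPER_FORM_RANGES = {
    "knee_angle": (70.0, 100.0),      # degrees at the bottom of a squat
    "feet_to_hip_ratio": (0.9, 1.6),  # stance width relative to hip width
}

# Hypothetical feedback message for each out-of-range measurement.
FEEDBACK = {
    "knee_angle": "Adjust your squat depth.",
    "feet_to_hip_ratio": "Widen or narrow your stance.",
}

def identify_mistakes(measurements):
    """Return feedback strings for every measurement outside its proper-form range."""
    mistakes = []
    for name, value in measurements.items():
        low, high = PROPER_FORM_RANGES[name]
        if not (low <= value <= high):
            mistakes.append(FEEDBACK[name])
    return mistakes

print(identify_mistakes({"knee_angle": 120.0, "feet_to_hip_ratio": 1.1}))
# ['Adjust your squat depth.']
```

Running this check on every frame, and drawing the returned strings onto the video feed, gives the live feedback behavior described in the next item.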
Real-time Feedback: As the application runs, it provides live feedback by identifying and displaying mistakes on the video feed. This ensures that feedback is accurate, timely, and consistent throughout the exercise.
Challenges we ran into
3D Pose Coordinate Requirement: Initially, our approach used 2D pose coordinates with the camera placed directly in front of the user. However, we soon recognized that this setup lacked sufficient information for making accurate recommendations; for example, determining whether a user is bending down too far requires the 3D coordinates of the hip, knee, and ankle. Exploring various API options, we were fascinated by the MediaPipe API's implementation of the GHUM model, which lifts 2D poses into a 3D reconstruction of the person. The GHUM model provides the essential data our product needs to function accurately.
Addressing Camera Angle Challenges: Another significant challenge arose in dealing with varying camera angles. To ensure our product's usability and overcome this issue, we transitioned to the Mediapipe API. The API's ability to provide 3D coordinates irrespective of the camera holder's angle significantly simplified our development process and effectively addressed the challenge posed by different camera perspectives.
Accomplishments that we're proud of
Exposure to Computer Vision & In-depth Understanding of GHUM and BlazePose: Since this was our first time exploring pose detection, we were exposed to the fascinating realm of computer vision technology. It was interesting to learn about the intricacies of the GHUM and BlazePose papers, unraveling the inner workings of these advanced algorithms. Understanding how these technologies contribute to accurate pose detection significantly broadened our knowledge base.
Hands-on Experience with OpenCV and Image Processing: Embarking on this project afforded us the opportunity to enhance our skills in OpenCV and image processing. We successfully applied these tools to capture and process live video feeds, adding a practical dimension to our theoretical knowledge.
What we learned
Gained more experience with Jupyter notebooks
Learned the MediaPipe API
Data processing and using multi-stage machine learning frameworks
Using NumPy for vector operations
Visualization and analysis of model outputs
What's next for MotionMentor
While we are proud of the progress we have made, we eventually want to host our application on a cloud-based service so that anyone can access it. AWS or Hugging Face could be a great way to power the application.
A feature we started working on but didn’t get to complete was rep counting. Once the product is deployed, we could track users’ reps during a workout and collect other biometric information to display on a personalized dashboard. It would also be interesting to integrate our application with wearable technology, such as Apple Watches, to gain additional sources of user data to work with.