HandsIn
In 2019, my mom was diagnosed with low-level hearing loss. Over the next year, her hearing rapidly declined. Trips to the grocery store, dinners out, and even doctor's appointments became a two-person job, as she began to rely on someone else to repeat conversations for her. It wasn't until three years later, when she got her hearing aids, that life felt back to 'normal'. But I could never forget that period, when what seemed like a minor impairment so thoroughly disrupted everyday life. In the following years, my mother became an advocate for people with disabilities, using her own experience as a platform to connect with others. As an onlooker to the community she found, I began to learn more about the deaf and hard of hearing community myself. But as I communicated with more and more people who use ASL as their native language, I found one big issue: everything is focused on translating ASL into English, not on understanding ASL.
Not only can this come off as ignorant, but it has real-life consequences too. Emergency providers cannot communicate with their patients, even about basic needs. Elderly homes isolate native signers. In classrooms, deaf children who use ASL often have their language minimized or overlooked entirely, with teachers and interpreters prioritizing English-based instruction. This not only hinders their academic growth but also sends the message that their native language, and by extension their identity, is less valuable.
We offer a platform where healthcare providers, students, and emergency responders can train with an AI-powered avatar who communicates through American Sign Language (ASL). Unlike traditional simulations, our AI avatar adapts dynamically to user input, allowing for spontaneous conversations. By offering an inclusive, flexible training environment, we aim to better prepare providers for real-world scenarios where language and cultural understanding are critical to delivering high-quality care.
How We Built It
We came in with a plan: classify signs from a user, translate to ASL gloss, and then generate a video of an avatar signing back to the user. This seemingly simple three-step plan quickly devolved into manually scraping databases for videos, taking almost 6,000 individual photos to train a model, and automating typical user tasks with a Python script.
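The three-step plan can be sketched as a small pipeline. Every function body below is a stand-in stub (our real stages are described in the rest of this section); only the overall shape of the flow is what matters here.

```python
# Sketch of the three-step plan: sign -> reply -> gloss -> avatar video.
# Each stage is stubbed; the real implementations are described below.

def classify_sign(frame: bytes) -> str:
    """Stage 1 (stub): the neural network classifies the user's sign."""
    return "HELLO"

def chatbot_reply(user_text: str) -> str:
    """Stub for the chatbot that answers the user's input in English."""
    return "Hello! How are you today?"

def english_to_gloss(english: str) -> str:
    """Stage 2 (stub): an LLM call translates English into ASL gloss."""
    return "HELLO HOW YOU TODAY"

def gloss_to_avatar_video(gloss: str) -> str:
    """Stage 3 (stub): look up clips per gloss token and render an avatar."""
    return "avatar_" + "_".join(gloss.lower().split()) + ".mp4"

def run_pipeline(frame: bytes) -> str:
    sign = classify_sign(frame)
    reply = chatbot_reply(sign)
    gloss = english_to_gloss(reply)
    return gloss_to_avatar_video(gloss)
```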
All front-end development was done with Streamlit, a Python web framework we discovered at the hackathon. We deployed the project with Streamlit's cloud service and bought a domain from GoDaddy.
There are two main parts of the backend of our project: the Library Learning Paths and the AI Avatar.
The Library Learning Paths were built on a real-time object detector, a three-input artificial neural network one of our members had previously written in Python. We scaled it to detect 24 letters of the alphabet (J and Z are excluded because their signs involve movement) and created a lesson that captures an image of the user, identifies the sign they are making, and provides feedback if it is identified as the wrong sign.
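The lesson's feedback loop can be sketched as follows. The `predict_letter` stub stands in for our trained network, which really runs on webcam captures; the 24-letter set is exactly the static alphabet with J and Z removed.

```python
# Sketch of the lesson feedback loop. `predict_letter` is a stub standing
# in for the trained classifier, which runs on real webcam images.

LESSON_LETTERS = list("ABCDEFGHIKLMNOPQRSTUVWXY")  # 24 static letters, no J or Z

def predict_letter(image) -> str:
    """Stub: the real neural network classifies the captured image."""
    return "B"

def check_attempt(image, target: str) -> str:
    """Compare the classified sign against the lesson's target letter."""
    guess = predict_letter(image)
    if guess == target:
        return f"Correct! That's the sign for {target}."
    return f"Not quite: that looked like {guess}. Try {target} again."
```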
We built on this real-time detection with a program that takes English sentences and phrases from the user and feeds them to a chatbot, which returns an appropriate response. We then used the Gemini AI API to translate the English returned by the chatbot into ASL gloss (a system for approximating ASL in written form).
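The translation step amounts to prompting Gemini to rewrite English as gloss. The prompt wording below is our own illustration, and the model name in `english_to_gloss` is an assumption; the SDK import is kept inside the function so the rest of the module works without it installed.

```python
# Sketch of the English-to-ASL-gloss step. The prompt text is illustrative;
# the actual call goes through the google-generativeai SDK.

def build_gloss_prompt(english: str) -> str:
    """Ask the model to respond with ASL gloss and nothing else."""
    return (
        "Translate the following English sentence into ASL gloss. "
        "Respond with only the gloss, in uppercase:\n" + english
    )

def english_to_gloss(english: str, api_key: str) -> str:
    import google.generativeai as genai  # lazy import: optional dependency
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption
    return model.generate_content(build_gloss_prompt(english)).text.strip()
```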
The ASL gloss is then parsed, each token is looked up in a gloss-to-video database, the matching clips are concatenated, and the result is fed into a video-to-3D-avatar generator. The generated avatar video is then shown back to the user on loop.
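The lookup can be sketched with a toy in-memory database (the real one came from our manual scraping; the clip paths here are made up). The fingerspelling fallback for unknown words is a hypothetical addition for illustration, not necessarily how our pipeline handled misses.

```python
# Sketch of the gloss-to-video lookup. Clip paths are invented; the
# fingerspelling fallback for unknown tokens is a hypothetical detail.

CLIPS = {"HELLO": "clips/hello.mp4", "YOU": "clips/you.mp4"}
LETTER_CLIPS = {c: f"clips/letters/{c}.mp4" for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}

def clips_for_gloss(gloss: str) -> list:
    """Map each gloss token to a video clip, fingerspelling unknown words."""
    paths = []
    for token in gloss.upper().split():
        if token in CLIPS:
            paths.append(CLIPS[token])
        else:  # no clip for this word: spell it letter by letter
            paths.extend(LETTER_CLIPS[c] for c in token if c in LETTER_CLIPS)
    return paths
```

The resulting clip list would then be concatenated (for example with ffmpeg or moviepy) before being handed to the avatar generator.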
Challenges We Ran Into
Our biggest challenge was finding a way to turn our signing videos into a 3D avatar. We searched until we found the perfect program, or what we thought was perfect: it converted our videos to an avatar almost seamlessly. Yet when we tried to sign up for API access, we were met with yet another challenge: being such a small company, they personally approved each API access request. It was a Saturday, we had only 12 hours left in the hackathon, and we lost hope.
We went back to the drawing board, scouring the internet in the hope that we had overlooked some other perfect program. But our first search had been thorough, and once again we found ourselves back at that earlier program, this time with a new strategy: automating the process of a user signing in, uploading the video we'd created earlier in the pipeline, and downloading the generated video, all with a Python script. So that's what we did.
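The automation can be sketched with Selenium, assuming the site can be driven that way. Every URL and element ID below is hypothetical; only the file-waiting helper is general-purpose, and the Selenium import is kept inside the function so the helper works without the package installed.

```python
# Sketch of browser automation for a site with no usable API.
# All URLs and element IDs are hypothetical placeholders.

import os
import time

def wait_for_file(path: str, timeout: float = 60.0, poll=os.path.exists) -> bool:
    """Poll until the downloaded file appears, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll(path):
            return True
        time.sleep(0.5)
    return False

def fetch_avatar_video(video_path: str, user: str, password: str, out: str):
    from selenium import webdriver              # lazy import: optional dependency
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/login")               # placeholder URL
        driver.find_element(By.ID, "email").send_keys(user)
        driver.find_element(By.ID, "password").send_keys(password)
        driver.find_element(By.ID, "submit").click()
        driver.find_element(By.ID, "upload").send_keys(video_path)  # file input
        driver.find_element(By.ID, "generate").click()
        wait_for_file(out)                       # wait for the download to land
    finally:
        driver.quit()
```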
Accomplishments That We're Proud Of
Not only did we integrate LLMs into our project, but we also built our own Neural Network to classify the user’s hand movements.
Besides working with AI, we really put the 'Hack' in Hackathon, finding a way to automate the login and download process even when API access wasn't available.
What We Learned
HandsIn was built during our first hackathon, and we are proud of what it has come to be. From using new tools like the Python web framework Streamlit and the Gemini AI API to buying our first domain, we have come a long way from the people we were only 36 hours before.
At many times the project seemed unconquerable (if it didn't, it would have already been made). Then there would be a small breakthrough or moment of clarity that brought us all together at one computer for a brief moment of celebration.
We have learned more than just technical skills though. Talking with sponsor representatives and other hackers showed us a hidden Hackathon bonus: community.
What's Next for HandsIn
We hope to continue growing this product, including creating more lessons focused on industry-specific topics like healthcare and education. Naturally, we want to keep training our model to recognize as many signs as we have data for, which means lots and lots of data.
While we're at it, we think it would be beneficial to both the Hearing and Deaf communities to include lessons on the history and culture of American Sign Language and on how signing and gesture have developed over time.