Inspiration

In a world where education has become increasingly remote and reliant on online platforms, we need human connection more than ever. Many students find it difficult to express their feelings without unmuting themselves and drawing unwanted attention. As a result, teachers don't know how their students are feeling or whether the material is engaging. This is especially challenging for students who struggle to communicate their feelings, such as individuals with autism, selective mutism, social anxiety, and more.

We want to help bridge this gap by creating a tool that both lets students express themselves with less effort and helps teachers understand and respond to their overall needs.

We strongly believe in making education accessible and in supplementing human connection, because at the end of the day, we are all social beings.

What it does

Our application gauges the overall emotions of participants in a video meeting, displaying a stream of emojis representing up to 80 different emotions. At 10-second intervals, we sample video frames from every participant whose camera is on and feed them into Hume's Expression Measurement API to identify the most prominent expressions. From these results, we generate a composite view of the room's general sentiment using a custom weighted algorithm.
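To make the weighting concrete, here's a minimal sketch of the aggregation step (the weighting scheme and the `frame_scores` shape are illustrative, not our exact tuned algorithm): each participant's per-frame Hume scores are normalized so one very expressive face doesn't dominate, summed across the room, and the top emotions are pulled out with a heap.

```python
import heapq
from collections import defaultdict

def aggregate_emotions(frame_scores, top_k=3):
    """Combine per-participant emotion scores into a room-level sentiment.

    frame_scores: one dict per sampled participant frame, mapping
                  emotion name -> confidence score in [0, 1].
    Returns the top_k (emotion, weighted score) pairs for the room.
    """
    totals = defaultdict(float)
    for scores in frame_scores:
        if not scores:
            continue  # no face detected in this participant's frame
        # Normalize by the frame's total confidence so each participant
        # contributes equally to the room-wide aggregate.
        frame_total = sum(scores.values()) or 1.0
        for emotion, score in scores.items():
            totals[emotion] += score / frame_total

    # Heap-based top-k over the ~80 candidate emotions (O(n log k)).
    return heapq.nlargest(top_k, totals.items(), key=lambda kv: kv[1])

# aggregate_emotions([{"Joy": 0.8, "Calmness": 0.4}, {"Confusion": 0.9}])
# -> [("Confusion", 1.0), ("Joy", 0.667), ("Calmness", 0.333)]  (rounded)
```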

Using this aggregated sentiment data, our frontend displays the most frequent emotions with their corresponding emojis on the screen. This way, hosts can adapt their teaching to the general sentiment of the classroom, while students can share how they’re feeling without having to experience the social anxiety that comes with typing a message in the chat or sharing a thought out loud.

How we built it

We leveraged LiveKit to build our video conferencing infrastructure and Vercel to deploy our application. Supabase Realtime serves as our communication layer: livestream data is forwarded from the clients in each room and saved to Supabase Storage.
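As an illustration of that storage hand-off (the bucket name, path layout, and file options below are assumptions, and the upload signature may differ slightly across supabase-py versions), each sampled frame ends up in Supabase Storage roughly like this:

```python
import time
from supabase import create_client

# Placeholder project URL and key; a real deployment reads these from env vars.
supabase = create_client("https://<project>.supabase.co", "<service-role-key>")

def store_frame(room_id: str, participant_id: str, jpeg_bytes: bytes) -> str:
    """Upload one sampled video frame and return its storage path."""
    path = f"{room_id}/{participant_id}/{int(time.time())}.jpg"
    supabase.storage.from_("frames").upload(  # "frames" bucket is hypothetical
        path=path,
        file=jpeg_bytes,
        file_options={"content-type": "image/jpeg"},
    )
    return path
```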

Our backend, implemented with FastAPI, interfaces with the frontend: it pulls the captured frames from Supabase and feeds the facial data into Hume AI to detect emotions.
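A minimal sketch of that step, assuming the Hume Python SDK's batch client and signed URLs for the stored frames (the endpoint path and exact SDK names are illustrative and vary by SDK version):

```python
from fastapi import FastAPI
from hume import HumeBatchClient
from hume.models.config import FaceConfig

app = FastAPI()
hume = HumeBatchClient("<HUME_API_KEY>")  # placeholder key

@app.post("/rooms/{room_id}/analyze")
def analyze_room(room_id: str, frame_urls: list[str]):
    """Submit a room's latest sampled frames to Hume's face model."""
    job = hume.submit_job(frame_urls, [FaceConfig()])
    job.await_complete()                  # wait for the batch job to finish
    predictions = job.get_predictions()   # per-frame faces and emotion scores
    # These scores are what we aggregate and write back to Supabase.
    return {"room": room_id, "predictions": predictions}
```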

The results are then aggregated and stored back in our Supabase table. Our frontend, built with Next.js and styled with Tailwind CSS, listens for Supabase's real-time events to detect changes to the table.
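On the write-back side, the backend only needs to insert a row; with Realtime enabled on the table, the frontend's subscription sees the change. A short sketch (the `room_emotions` table and its columns are hypothetical names, not our exact schema):

```python
from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<service-role-key>")

def publish_room_sentiment(room_id: str, top_emotions: list[tuple[str, float]]) -> None:
    """Store aggregated emotions; Supabase Realtime notifies subscribed clients."""
    supabase.table("room_emotions").insert({
        "room_id": room_id,
        # e.g. [["Joy", 0.667], ["Confusion", 0.21]]
        "emotions": [[name, round(score, 3)] for name, score in top_emotions],
    }).execute()
```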

From this, we're able to display the stream of emotions in near real time, delivering the aggregated emotion data as a light-hearted, fun animation that keeps everyone engaged!

Challenges we ran into

  • LiveKit Egress has limited documentation
  • Coordinating the different moving parts over Supabase Realtime
  • Integrating the Hume AI API
  • First-time frontend developers
  • Hosting our backend through Vercel (lots of configuration)

Accomplishments that we're proud of

  • Real-time streaming video conferencing with LiveKit
  • Streaming video data to Hume via Supabase Realtime
  • Emoji animations using Framer Motion
  • An efficient scoring algorithm using heaps

What we learned

We learned how to use many new tools and frameworks, such as Next.js and Supabase, as this was some of our members' first time doing full-stack software engineering. Our members who traveled from SoCal and the East Coast learned how to ride BART, and we all learned LiveKit for live streaming and video conferencing.

What's next for Moji

We see the potential of this tool in a wide variety of industries and have other features in mind that we want to implement. For example, we could enhance it to help streamers with any kind of virtual audience by:

  • Implementing a dynamic checklist that generates to-dos based on questions or requests from viewers.

This can benefit teachers in providing efficient learning to their students, or large entertainment streamers in managing a fast-moving chat. It can also be extended to eCommerce, as livestream shopping requires sellers to efficiently navigate their chat interactions.

  • Using Whisper for real-time audio speech recognition to automatically check off answered questions.

This gives streamers a hands-free way to meet their viewers' requests without having to dig through chat. It is especially beneficial for the livestream shopping industry, as sellers are typically displaying items while reading messages.

  • Using RAG to store answers to previously asked questions and using this data to answer any future questions.

This can be a great way to save streamers time on answering repeated questions.

  • Enhancing video recognition capabilities to identify more complex interactions and objects in real time.

With video recognition, we can lean even further into the eCommerce industry, identifying what types of products sellers are displaying and providing a hands-free, AI-enhanced way of managing their checklist of requests.

  • Adding integrations with other streaming platforms to broaden its applicability and improve the user experience.

The possibilities are endless and we’re excited to see where Moji can go! We hope that Moji can bring a touch of humanity and help us all stay connected and engaged in the digital world.

Built With

  • FastAPI
  • Framer Motion
  • Hume AI
  • LiveKit
  • Next.js
  • Supabase
  • Tailwind CSS
  • Vercel
