Inspiration
Seeing the widespread problem of separation anxiety and its adverse mental-health effects, we wanted to build a tool people could use to have genuine conversations with loved ones they are separated from.
What it does
All a user needs to do is collect one minute of audio from the person they want to speak with and upload it to the website. From that audio, a voice clone is created, and the user can then have real-time audio conversations with that person.
How we built it
We used React and Material-UI to design the front-end. The backend is powered by Express and connects to multiple APIs: the ElevenLabs API for voice replication, Google Cloud for speech-to-text transcription, and the ChatGPT API (GPT-3.5) for conversation generation. All user authentication is handled by Firebase, and the unique voice IDs of the people each user wants to talk to are stored in Cloud Firestore.
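To make the pipeline concrete, here is a minimal sketch of how the backend might chain those APIs for one conversational turn. The function names and the persona prompt are our illustrative assumptions, not the team's actual code; the ElevenLabs endpoint shown is the public `POST /v1/text-to-speech/{voice_id}` route.

```javascript
// Assemble the message array sent to the GPT-3.5 chat completions endpoint.
// `persona` is a system prompt describing the loved one being imitated,
// `history` holds prior user/assistant turns, and `userText` is the
// transcription produced from the latest utterance.
function buildMessages(persona, history, userText) {
  return [
    { role: "system", content: persona },
    ...history,
    { role: "user", content: userText },
  ];
}

// Turn the GPT reply into audio with the ElevenLabs text-to-speech endpoint;
// the voice_id would be the one stored per contact in Cloud Firestore.
async function speakReply(voiceId, text, apiKey) {
  const res = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
    {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    }
  );
  return res.arrayBuffer(); // raw audio bytes to stream back to the browser
}
```

The key design point is that the browser only ever exchanges audio with the Express server; all API keys stay server-side.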
Challenges we ran into
As this was our first time learning React and building a full-stack application in JavaScript, we faced many challenges designing a smooth front-end. We also had major problems sending a file, specifically audio data, from the front-end to the backend for processing.
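The audio-upload problem has a fairly standard fix, sketched below under assumed names (`makeUploadRequest`, the `/api/clone` route, and the `multer` setup are illustrative, not the team's code): wrap the recorded clip in `FormData` in the browser, and parse the multipart body with `multer` on the Express side.

```javascript
// Browser side: wrap the recorded clip in FormData. Do NOT set a
// Content-Type header manually; fetch derives the multipart boundary
// from the FormData object itself.
function makeUploadRequest(audioBlob) {
  const form = new FormData();
  form.append("audio", audioBlob, "sample.mp3");
  return { method: "POST", body: form }; // pass to fetch("/api/clone", ...)
}

// Express side (sketch): multer parses the multipart body and exposes
// the raw file bytes on req.file.buffer, ready to forward to a
// voice-cloning API.
//
// const multer = require("multer");
// const upload = multer({ storage: multer.memoryStorage() });
// app.post("/api/clone", upload.single("audio"), (req, res) => {
//   const audioBytes = req.file.buffer;
//   res.json({ size: audioBytes.length });
// });
```

The common pitfall is setting `Content-Type: multipart/form-data` by hand, which omits the boundary string and leaves the server unable to parse the body.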
Accomplishments that we're proud of
We are proud that we were able to create a mostly functional application that hits the core of the goal we had in mind. We were not entirely sure we would even get halfway.
What we learned
We learned a lot about how to integrate a full-stack project and some of the good practices that go along with one. Although connecting to APIs was not especially tricky, we quickly learned how useful they can be, and we even learned how to host a Flask API.
What's next for Convi
We hope to smooth things out and create a more responsive and user-friendly front-end. We may even host the website on GitHub Pages.
Built With
- axios
- elevenlabs
- express.js
- firebase
- google-cloud
- gpt-3.5
- react