Inspiration

We noticed that, when holding club meetings, having a secretary write down meeting minutes was rather inconvenient. It's impossible to keep up with the pace of several people speaking at once, and the role is very stressful for the designated stenographer. We wanted to leverage speech-to-text technology to automate meeting minutes for us.

How it works

One person can go to www.konverse.me and create a room - then, she can send the link to the other attendees of the meeting so they can join the room. After each person enables their microphone, the web app begins to display a transcript of the meeting minutes on the screen: this includes a timestamp, the speaker, and the spoken text.
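As a minimal sketch of the transcript display described above - the function name and entry fields are illustrative, not the actual Konverse code - each recognized utterance could be rendered as a timestamped line:

```javascript
// Sketch: format one meeting-minutes entry (timestamp, speaker, spoken
// text) as it might appear on screen. Names here are hypothetical.
function formatEntry(entry) {
  // entry: { time: Date, speaker: string, text: string }
  const ts = entry.time.toISOString().slice(11, 19); // "HH:MM:SS" (UTC)
  return `[${ts}] ${entry.speaker}: ${entry.text}`;
}

console.log(formatEntry({
  time: new Date("2016-01-01T14:30:05Z"),
  speaker: "Alice",
  text: "Let's start the meeting."
}));
// → [14:30:05] Alice: Let's start the meeting.
```

In the real app, each such entry would be appended to the shared room state (e.g. in Firebase) so every attendee sees the same running transcript.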

Challenges we ran into

A major challenge was that the speech-to-text APIs we could find, while working rather well for a single speaker, probably wouldn't work well enough to accurately document an entire conversation, especially when multiple people were speaking at once. We would need a more advanced API that provides information like speech volume, frequency, etc., so that we could algorithmically decide who the loudest speaker is, and whose speech to display on screen at any one time.
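The loudest-speaker idea above could be sketched roughly as follows - assuming the API exposed per-speaker volume readings, which it did not for us; the function name and data shape are hypothetical:

```javascript
// Sketch: given recent volume samples per speaker, decide whose
// transcript to display right now by picking the highest average
// volume. Purely illustrative, not the actual Konverse implementation.
function loudestSpeaker(samples) {
  // samples: { [speaker: string]: number[] } — recent volume readings
  let best = null;
  let bestAvg = -Infinity;
  for (const [speaker, vols] of Object.entries(samples)) {
    const avg = vols.reduce((a, b) => a + b, 0) / vols.length;
    if (avg > bestAvg) {
      bestAvg = avg;
      best = speaker;
    }
  }
  return best;
}

console.log(loudestSpeaker({ alice: [0.2, 0.4], bob: [0.7, 0.9] }));
// → bob
```

A real version would also need smoothing over time so the display doesn't flicker between speakers on every sample.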

Accomplishments that we're proud of

We were surprised at how seamlessly we were able to integrate the speech-to-text API for multi-room chats - we imagined this would comprise a large part of our technical challenge throughout the hackathon.

What we learned

During this hackathon, we got exposure to several new technologies, including Firebase and the HTML5 speech-to-text API - we hope to use these in the future to build more innovative, interesting products!

What's next for Konverse?

There are countless possible extensions for such a product - some of which are currently in development. For example, we are working on a feature that lets a user download a text file of the meeting minutes - this would be useful for club/company records, etc. In addition, given the large amount of textual data we collect, we can leverage machine learning/sentiment analysis APIs to provide useful information about the conversation dynamics. As a simple example, we worked on integrating Indico's API into our project so that we can display a "happiness" measurement of the meeting as it progresses - however, because Indico's API was not functioning this morning, we were not able to test our code.
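The download feature could be as simple as joining the transcript entries into one plain-text string - in the browser, that string could then be offered as a file via a Blob and a download link. This is a sketch under assumed names, not the in-progress implementation:

```javascript
// Sketch: assemble meeting-minutes entries into plain text suitable
// for a downloadable .txt file. Names and fields are hypothetical.
function minutesToText(entries) {
  return entries
    .map(e => `[${e.time}] ${e.speaker}: ${e.text}`)
    .join("\n");
}

const text = minutesToText([
  { time: "14:30:05", speaker: "Alice", text: "Let's start." },
  { time: "14:30:12", speaker: "Bob", text: "Agreed." }
]);
console.log(text);
// In a browser, the string could then be served as a download, e.g.:
//   const url = URL.createObjectURL(new Blob([text], { type: "text/plain" }));
//   and attached to an <a download="minutes.txt"> element.
```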

In addition, as previously stated, handling speech-to-text with several simultaneous speakers is a large technical challenge - hopefully, with a better API, we would have more quantitative signals to run a more detailed analysis. The possibilities are endless!
