Inspiration
One of our team members had built an Optical Character Recognition (OCR) project for a class, and hearing about it got us interested. We knew the hackathon was focused on AI, so we instantly thought of using the ChatGPT API.
What it does
It turns images from YouTube videos into information about those images, processing the result in a way that makes it more accessible to people of all knowledge levels.
How we built it
We used AWS Textract to read image data from the YouTube video and interpreted it with OpenAI's GPT-3.5 Turbo. We then packaged it into a Chrome extension, built in JavaScript, that modifies the YouTube DOM for easy access.
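The hand-off from the OCR step to the language model can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function names `extractLines` and `buildChatRequest` and the prompt wording are hypothetical. It assumes the OCR result has the shape Textract returns for text detection (a `Blocks` array whose `LINE` entries carry a `Text` field) and that the model is called with a Chat Completions style body (`model` plus `messages`).

```javascript
// Sketch only: function names and prompt wording are hypothetical.

// Collect the text of every LINE block from a Textract-style response.
// WORD and PAGE blocks are skipped so each line of text appears once.
function extractLines(textractResponse) {
  const blocks = textractResponse.Blocks || [];
  return blocks
    .filter((block) => block.BlockType === "LINE" && block.Text)
    .map((block) => block.Text.trim());
}

// Build a Chat Completions style request body that asks the model
// to explain the text found in a video frame.
function buildChatRequest(lines, model = "gpt-3.5-turbo") {
  return {
    model,
    messages: [
      {
        role: "system",
        content: "You explain on-screen text from a video frame in plain language.",
      },
      {
        role: "user",
        content: "Text found in the frame:\n" + lines.join("\n"),
      },
    ],
  };
}
```

Keeping these two steps as pure functions means the network calls to Textract and OpenAI can stay in a thin wrapper, which makes the prompt-building logic easy to check on its own.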
Challenges we ran into
On the backend, we challenged ourselves to chain together various LLM APIs in a way that was fast and clean.
Accomplishments that we're proud of
We feel that this is a tool with real utility and direct application for students.
What's next for VideoTextReader
We want to add features that display questions based on the transcript so users get a full understanding of what's being presented. We also want to let users customize the study guide further, whether that means tailoring the depth of explanation or focusing exclusively on summarization or a practice guide.
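One possible shape for those study-guide settings is a small options object that gets turned into instructions for the model. This is a sketch of a feature we have not built yet: the option names (`focus`, `depth`), their allowed values, and the instruction wording are all assumptions made for illustration.

```javascript
// Hypothetical settings; none of these names exist in the extension today.
const FOCUS_INSTRUCTIONS = {
  explanation: "Explain the material step by step.",
  summary: "Summarize the material only; do not add explanation.",
  practice: "Write practice questions only, with answers at the end.",
};

const DEPTH_INSTRUCTIONS = {
  brief: "Keep it to a few sentences.",
  standard: "Use a moderate level of detail.",
  detailed: "Go into depth and define any technical terms.",
};

// Own-property lookup so inherited keys like "toString" are rejected.
function lookup(table, key, label) {
  if (!Object.prototype.hasOwnProperty.call(table, key)) {
    throw new RangeError("Unknown " + label + ": " + key);
  }
  return table[key];
}

// Turn the user's settings into a single instruction string for the model.
function buildStudyGuidePrompt(options = {}) {
  const { focus = "explanation", depth = "standard" } = options;
  return (
    lookup(FOCUS_INSTRUCTIONS, focus, "focus") +
    " " +
    lookup(DEPTH_INSTRUCTIONS, depth, "depth")
  );
}
```

For example, `buildStudyGuidePrompt({ focus: "summary", depth: "brief" })` would produce an instruction asking for a short summary with no added explanation.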
Built With
- amazon-web-services
- gpt-3.5
- html5
- javascript
- langchain