Inspiration

Imagine the hustle and bustle at the start of a new semester, with the prospect of tackling multiple classes and navigating through heaps of unorganized information: lecture schedules, grading policies, and, of course, the crucial assignment schedules. With around 5-6 classes to manage, each accompanied by its own set of syllabus PDFs and links, the thought of manually inputting deadlines looms ominously, potentially consuming hours that could be better spent settling in, catching up with friends, or simply enjoying some downtime.

Enter Task Harvest, our straightforward web app designed to streamline your academic life in just three simple steps:

  1. Upload a PDF File: Transfer your syllabus materials effortlessly.
  2. Wait for a few seconds: Let Task Harvest do the heavy lifting.
  3. Receive the CSV by email: Your essential task data, neatly organized and ready for use!

It's as straightforward as that! With the CSV file easily importable into Excel and various task management software, Task Harvest helps you save valuable time that can be better utilized for the things that truly matter.

What it does

Task Harvest takes a syllabus PDF as input and uploads it to our Google Cloud Storage. A serverless Google Cloud Function on the backend reads the PDF from storage and sends its contents to an OpenAI model, using prompts we engineered specifically for this task. The function then takes the model's output and writes it out as a CSV, which is saved back to Google Cloud Storage. Finally, we use SendGrid, a third-party email API, to send the CSV to your email as an attachment, all in just a few clicks :)
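The flow above can be sketched as one function that chains the steps together. This is only an illustration, not our deployed code: the column names in `FIELDS` and the four injected callables (`read_pdf_text`, `extract_tasks`, `store_csv`, `send_email`) are hypothetical stand-ins for the Cloud Storage, OpenAI, and SendGrid calls.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical column names; the real CSV layout may differ.
FIELDS = ["title", "due_date", "type"]


def tasks_to_csv(tasks: List[Dict[str, str]]) -> str:
    """Serialize extracted tasks into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for task in tasks:
        # Missing keys become empty cells; unknown keys are dropped.
        writer.writerow({field: task.get(field, "") for field in FIELDS})
    return buffer.getvalue()


def process_syllabus(
    pdf_name: str,
    recipient: str,
    read_pdf_text: Callable[[str], str],
    extract_tasks: Callable[[str], List[Dict[str, str]]],
    store_csv: Callable[[str, str], None],
    send_email: Callable[[str, str, str], None],
) -> str:
    """Run the pipeline: PDF text -> model -> CSV -> storage -> email."""
    text = read_pdf_text(pdf_name)        # stand-in for reading from Cloud Storage
    tasks = extract_tasks(text)           # stand-in for the OpenAI call
    csv_text = tasks_to_csv(tasks)
    csv_name = pdf_name.rsplit(".", 1)[0] + ".csv"
    store_csv(csv_name, csv_text)         # stand-in for writing to Cloud Storage
    send_email(recipient, csv_name, csv_text)  # stand-in for SendGrid
    return csv_name
```

Passing the external services in as plain callables keeps the sketch runnable without any cloud credentials and makes each step easy to swap or fake in a test.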

How we built it

We built the project in the following steps:

  1. Set up a basic frontend that accepts a syllabus PDF upload.
  2. Wrote code locally that uses LLMs to produce a CSV from the syllabus PDF.
  3. Refined the prompts (prompt engineering) to get the best possible CSV output.
  4. Set up Google Cloud Storage and connected the frontend to it so that PDFs can be uploaded.
  5. Developed serverless Google Cloud Functions that trigger the LLM code and store the resulting CSV in Google Cloud Storage.
  6. Sent the generated CSV to the user's email using SendGrid and serverless Google Cloud Functions.
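As an illustration of the email step, here is a minimal sketch that builds a request for SendGrid's v3 Mail Send endpoint using only the standard library. It is an assumption about how the step could look, not our actual code (which may use the SendGrid client library instead); the subject line, body text, and function names are hypothetical. The sketch only constructs the request and does not send it.

```python
import base64
import json
import urllib.request
from typing import Any, Dict

SENDGRID_URL = "https://api.sendgrid.com/v3/mail/send"


def build_mail_payload(sender: str, recipient: str,
                       csv_name: str, csv_text: str) -> Dict[str, Any]:
    """Build the JSON body for a mail with the CSV as a base64 attachment."""
    encoded = base64.b64encode(csv_text.encode("utf-8")).decode("ascii")
    return {
        "personalizations": [{"to": [{"email": recipient}]}],
        "from": {"email": sender},
        "subject": "Your Task Harvest CSV",  # hypothetical subject line
        "content": [{"type": "text/plain",
                     "value": "Your task list is attached."}],
        "attachments": [{
            "content": encoded,
            "filename": csv_name,
            "type": "text/csv",
            "disposition": "attachment",
        }],
    }


def build_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        SENDGRID_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Calling `urllib.request.urlopen(build_request(api_key, payload))` would perform the actual send; the API key should come from an environment variable or secret store rather than from source code.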

Challenges we ran into

  1. Challenges and learning curve of Google Cloud: Neither of us had any prior exposure to cloud technologies, so picking them up and using them to build a working prototype in the short span of 24 hours was a very rewarding, yet challenging experience. Our progress was especially stunted because we wanted to use some of the more complex features offered by Google Cloud, like using Firebase functions to upload the file directly from the frontend to Cloud Storage. We were enthusiastic about this because we wanted to run as little as possible locally and follow best practices such as building for scalability. However, it turned out to be detrimental to our efficiency, as we spent 4+ hours on it. We learned our lesson and eventually moved on, realizing that it's okay to have some code running locally given the tight time constraints of a hackathon, and that we don't need to build ready-to-ship software during one!
  2. Challenges in prompt tuning: We noticed that some syllabus PDFs were easier to parse than others, so we had to revise the prompts many times to make them generic enough to handle different formats.
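One way to make the prompt less sensitive to syllabus format is to pin down the output shape and then validate whatever the model returns. The sketch below is an assumption for illustration: the prompt wording, the JSON output format, and the key names are hypothetical and are not our actual prompts.

```python
import json
import re
from typing import Dict, List

# Hypothetical prompt; the real prompts were tuned by trial and error.
PROMPT_TEMPLATE = (
    "You extract assignment deadlines from a course syllabus.\n"
    "Return ONLY a JSON array. Each item must have the keys "
    '"title", "due_date" (YYYY-MM-DD) and "type".\n'
    "If the syllabus lists no deadlines, return [].\n\n"
    "Syllabus:\n{syllabus}"
)

REQUIRED_KEYS = {"title", "due_date", "type"}


def build_prompt(syllabus_text: str) -> str:
    """Fill the template with the text extracted from the PDF."""
    return PROMPT_TEMPLATE.format(syllabus=syllabus_text.strip())


def parse_tasks(reply: str) -> List[Dict[str, str]]:
    """Pull the JSON array out of a model reply and keep only complete rows."""
    # Models sometimes wrap the array in extra prose, so search for it.
    match = re.search(r"\[.*\]", reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return [
        {key: str(item[key]) for key in sorted(REQUIRED_KEYS)}
        for item in items
        if isinstance(item, dict) and REQUIRED_KEYS.issubset(item)
    ]
```

Dropping malformed rows instead of failing means a hard-to-parse syllabus still yields a partial CSV, which can then guide the next round of prompt changes.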

Accomplishments that we're proud of

As a two-person team, getting a working end-to-end prototype running is a great accomplishment for us. We're also very proud of diving deep into Google Cloud and making the most of it. Considering neither of us had any exposure to building on the cloud, we're glad we got to learn so much about it while building a great project! We're also excited about putting cutting-edge tech like LLMs to use. We tried different prompts with the OpenAI models we were considering and, based on our own reading of the results, refined them until the output resembled what we wanted. Seeing the AI produce output close to what we were aiming for was very rewarding in itself!

What we learned

We learned what goes on behind the scenes when working with Google Cloud, such as reading from and writing to Cloud Storage and harnessing serverless computing via Cloud Functions! We also came to appreciate the benefit of a platform like Google Cloud, which provides many different services that can easily communicate and connect with each other. This allowed the two of us to develop rapidly in parallel.

What's next for TaskHarvest

  1. Allow multiple syllabus PDFs as input and produce a single combined CSV, so we don't need to email multiple CSVs.
  2. Add a check to determine whether a syllabus actually contains the information we are trying to extract. If it doesn't, letting the user specify which page has the information could be extremely helpful!
  3. Allow online URLs as inputs to the frontend.
  4. Polish the UI to make it more accessible.
  5. Use similar technology to manage schedules by creating downloadable CSVs of lectures from the syllabus. Unlike a calendar event that shows only the lecture number in its name, events built from our CSV would also show the content expected to be taught on each day.
  6. Improve the prompt further so that it generalizes to newer syllabus formats.
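
As a rough sketch of the first item above, per-course CSVs could be merged into one file with an added course column. This is not implemented yet; the function name and the `course` column are hypothetical.

```python
import csv
import io
from typing import Dict, List


def merge_csvs(csv_by_course: Dict[str, str]) -> str:
    """Combine per-course CSV texts into one CSV with a 'course' column."""
    rows: List[Dict[str, str]] = []
    fieldnames = ["course"]
    for course, csv_text in csv_by_course.items():
        reader = csv.DictReader(io.StringIO(csv_text))
        # Collect every column seen, in first-seen order.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            rows.append({"course": course, **row})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Columns that appear in only some of the inputs are kept, with empty cells for the courses that lack them.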