Cerestial

Inspiration

While researching, we found a troubling fact that hits close to home: Illinois, as a farming state, faces very unpredictable weather. Sudden droughts, heavy rainfall, and early frosts are all part of this bigger problem. Farmers, however, don’t have immediate access to real-time agricultural data and weather forecasts, at least not long-term ones. They also lack knowledge of the mitigation strategies and up-to-date cultivation methods that have been studied in response to the climate challenge. As a result, it is difficult for farmers to plan ahead and handle what locals call “Illinois weather”.

Farmers in Illinois confront interconnected challenges shaped by weather variability and climate challenges, which locals usually call “Illinois weather”. Goldblum (2009) found that corn and soybean yields are sensitive to changes in temperature and precipitation. Unfortunately, as the term “Illinois weather” suggests, accurately observing temperature, sunshine duration, precipitation, and wind speed in Illinois requires technical methods and machines that farmers cannot afford. Tomasek et al. (2017) also predict that climate change will expand early-spring field workability while reducing it in late spring and intensifying summer drought risk. However, Marks and Boerngen (2019) found that farmers, though increasingly concerned about nutrient loss, remain uncertain about precise mitigation strategies and up-to-date cultivation methods. Farmers therefore need access to real-time data and updates on cultivation methods so they can confront climate challenges and plan ahead to keep production high.

What it does

Our chatbot uses an AI model to route each query to one of two specialized models. If an answer needs additional data, the site searches the internet for information related to the user’s question and tries to answer as accurately as possible, helping the user understand what they need for farm development and planning.
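
The routing step described above can be illustrated with a minimal sketch. This is not our production router, which is an AI model. Here a simple phrase list stands in for it, and the names `SMALL_TALK` and `route_query` as well as the labels "chat" and "rag" are hypothetical.

```python
import re

# Hypothetical small-talk phrases. The real router is an AI model, not a
# fixed list; this stand-in only illustrates the control flow.
SMALL_TALK = {"hi", "hello", "hey", "thanks", "thank you", "bye"}


def route_query(query: str) -> str:
    """Return "chat" for small talk and "rag" for questions needing retrieval."""
    # Lowercase and drop everything except letters and whitespace.
    normalized = re.sub(r"[^a-z\s]", "", query.lower()).strip()
    if not normalized or normalized in SMALL_TALK:
        return "chat"  # answer directly, no scraping
    return "rag"       # retrieve supporting data before answering
```

With this sketch, a greeting such as "Hi" is answered directly, while a question such as "When should I plant corn?" goes through retrieval.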

How we built it

We started by researching various methods to tackle our problem. After experimenting with multiple AI models and technologies, we decided to implement a Retrieval-Augmented Generation (RAG) model, enhanced by the BM25 algorithm to search and rank relevant websites and resources. Additionally, we incorporated a memory mechanism to improve user satisfaction by maintaining context throughout the conversation.
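
To show how BM25 ranks scraped text against a question, here is a minimal pure-Python sketch of the standard Okapi BM25 scoring formula. It is an illustration under assumptions, not our actual code: the whitespace tokenizer, the function names, and the parameter defaults `k1=1.5` and `b=0.75` are all choices made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (document index, score) pairs sorted by descending BM25 score."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))
    query_terms = set(tokenize(query))
    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed inverse document frequency, always positive.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturated by k1 and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Documents that share no terms with the query score zero, so the top-ranked pages can be passed to the language model as context and the rest dropped.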

To enhance the consistency, accuracy, and relevance of responses, we introduced a custom context file. Alongside this, we continuously retrieve real-time weather data with a one-week forecast to provide more informed answers.

Once our model and data preparation were complete, we moved on to building the backend system. We selected Flask as our framework to develop the back end in Python and implemented the RESTful APIs needed to support front-end interactions. Meanwhile, the rest of our team worked on the front end, using JavaScript to dynamically update the website with real-time data and chatbot functionality. To retrieve past conversations efficiently, we deployed a PostgreSQL database to store message history, which lets users maintain continuity in their interactions.
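
The message-history storage can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL. The table layout and the function names are hypothetical, and a PostgreSQL driver may use a different placeholder style than `?`.

```python
import sqlite3

# Hypothetical schema; the actual init.sql is not shown in this writeup.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id TEXT NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL
)
"""


def init_db(conn):
    """Create the messages table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def save_message(conn, conversation_id, role, content):
    """Append one chat message to a conversation."""
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()


def load_history(conn, conversation_id, limit=20):
    """Return the most recent messages of a conversation, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages "
        "WHERE conversation_id = ? ORDER BY id DESC LIMIT ?",
        (conversation_id, limit),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in reversed(rows)]
```

Loading only the latest `limit` messages keeps the context passed back to the model bounded as a conversation grows.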

Finally, to make our project scalable and easily deployable, we designed a deployment pipeline using Docker Compose, ensuring seamless integration and efficient resource management.

Challenges we ran into

Model:

Initially, constructing the RAG model was challenging due to complex dependency resolution, requiring us to find a model that best suited our needs.
Scraping new data proved difficult, as many websites blocked frequent access, often returning 429 errors (Too Many Requests).
Early scraping efforts were ineffective, as they primarily retrieved data from Reddit, which wasn’t always relevant.
Queries that didn’t require scraping (e.g., "Hi") still triggered unnecessary data fetching, increasing response times. To address this, we implemented a mechanism to redirect such queries appropriately.
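
One common way to cope with 429 responses is to retry with exponential backoff. The sketch below illustrates that general technique and is not necessarily what we shipped. `TooManyRequests` and `fetch_with_backoff` are hypothetical names, and the caller supplies the `fetch` function that raises the exception when a site answers with HTTP 429.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical error raised by the caller-supplied fetch on HTTP 429."""


def fetch_with_backoff(fetch, url, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponentially growing delays on 429."""
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except TooManyRequests:
            if attempt == max_retries:
                raise  # give up after the final attempt
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter makes the waiting behavior easy to replace in tests, so the retry logic can be checked without real delays.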

Back end:

Environment variables were not being read correctly when deploying with Docker.
Exposing and connecting the correct ports for multiple components was challenging.
Initializing the database with init.sql at server startup was difficult to configure.
Writing an automated deployment pipeline with Docker Compose was complex.

Front end:

Dynamically updating the web interface was difficult to implement.
Maintaining a consistent and visually cohesive theme for the front end was challenging.
Constructing POST and GET requests to interact with the API required careful handling.
Implementing chat history loading functionality was tricky.
Ensuring the website was fully responsive across different devices required additional effort.

Accomplishments that we're proud of

Our greatest accomplishment was successfully building a RAG model that adapts OpenAI’s GPT-4 LLM to agricultural data. It can access up-to-date information through real-time scraping, using the BM25 algorithm for keyword search and relevance ranking. It also integrates current weather statistics and forecasts for up to 16 days. These insights are fed into the RAG model, which supplies them to GPT-4 as context so it can generate accurate, context-aware responses for users. Equipping our solution with up-to-date information about weather, climate, mitigation strategies, and modern cultivation methods is what makes it stand out for real-time agricultural planning.

For questions or requests about cultivation, our model can also cite reference sources and explain its weather-based reasoning when proposing methods to farmers. Sources include up-to-date websites of organizations (such as University of Illinois research and the Illinois Department of Agriculture) and, when they are open to the public, publications.

Additionally, we created an interactive, real-time UI that engages the user and keeps useful information on hand, such as the weather or soil conditions.

A key achievement in our backend development was optimizing real-time data retrieval while maintaining system efficiency. We tuned query routing to reduce unnecessary scraping, significantly improving response times. We also automated database initialization with init.sql, ensuring a seamless PostgreSQL setup at deployment. One of our most impactful improvements was streamlining multi-service connectivity between Docker containers, which led to a scalable deployment pipeline with Docker Compose and makes the system more resilient and adaptable for future expansion.

Finally, on the front end, we’re proud to have made our website responsive on mobile devices and to provide a real-time graph of the weather information described above.

What we learned

For some members of our team, this was their first time developing a working site and working in a team where each member built their own component and merged it into the final product. On the front end, we learned how to make the site responsive. On the model side, we managed to combine two different models to optimize our output, something we had never done before.

What's next for Cerestial

Currently, our site displays real-time weather data for Illinois. In the future, we hope to add more options, such as soil moisture, wind, precipitation, and snowfall, so that farmers who use our site have all the necessary information in one place. We also want to add a login option so different users can keep separate conversations. Finally, we want to fine-tune our model to return high-accuracy answers more efficiently.