Try it out!

https://open-webui-xlww.onrender.com/
username: demo@gmail.com
password: demox

Inspiration

Courses now expect college students like us to use LLMs as learning aids, but studying with them quickly becomes a context-engineering problem.

We're both Data Science and Applied Math students at UC Berkeley who've spent hours managing AI context: reformatting notes, re-uploading syllabi, re-explaining where we left off. The real bottleneck in AI-assisted learning isn't the model's intelligence; it's that every session starts from scratch. There's no persistent structure, no evolving knowledge base, no workflow that compounds over time. ChatGPT is a blank text box. Students deserve a system.

We asked: what if we built an opinionated study workflow that maintained its own structured knowledge base per class, one that could pull from anywhere the course lives and grow smarter with every lecture, every study session, every homework assignment?

What it does

YouBook is not an AI assistant; it's an autonomous study system that maintains a living LaTeX notebook per class. It enforces a structured workflow: the student talks, the system decides how to process it, and the notebook evolves accordingly.

Three interaction modes impose different workflows per message, each with its own context-loading strategy, tool permissions, and pedagogical constraints:

  • /Lec (Lecture): Live lecture dictation. The system transcribes what you say into structured LaTeX with theorem environments, definitions, and summaries. It converts shorthand (-> to \to, int to \int) and never invents content. The agent sees only the preamble, template, and last two lectures, keeping it grounded in what's already been written.
  • /Rev (Review): Active study mode. The system quizzes you, makes cross-lecture connections, and references specific theorems from your notes. It loads all lecture summaries and the glossary, giving it a bird's-eye view of the course so far.
  • /Work (Homework): Guided problem-solving. The system reads your assignment, sees your progress, and gives hints, never solutions. It creates visual TikZ "explainer" documents for tough concepts. Context is scoped to just the relevant assignment, submission, and explainers.
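
The shorthand conversion in /Lec can be pictured as a small rewrite pass. The following is a minimal sketch, not YouBook's actual code: only the two rules named above (-> to \to, int to \int) come from the writeup, and the function name and rule table are hypothetical.

```python
import re

# Hypothetical rule table: only "->" and "int" are named in the writeup.
# Each entry is (compiled pattern, replacement callback).
SHORTHAND_RULES = [
    # "A -> B" or "A->B" becomes "A \to B".
    (re.compile(r"\s*->\s*"), lambda _m: " \\to "),
    # A bare "int" becomes "\int"; skip "\int", "point", "integral", etc.
    (re.compile(r"(?<![\\A-Za-z])int(?![A-Za-z])"), lambda _m: "\\int"),
]


def expand_shorthand(text: str) -> str:
    """Apply every shorthand rule in order and return the rewritten text."""
    for pattern, replacement in SHORTHAND_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(expand_shorthand("f: A -> B, and int_0^1 f(x) dx"))
```

Callbacks are used for the replacements so that backslashes are never interpreted as regex escape sequences.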

This isn't prompt engineering; it's workflow engineering. Each mode is a different system configuration, not a different persona.

Agentic data import via Composio: YouBook connects to the platforms where course materials actually live. Through Composio's 500+ integrations, the system can autonomously search, browse, and pull files from Google Drive, Notion, Canvas, Slack, Gmail, and more. A student says "grab my lecture slides from Drive" or "pull the problem set from Canvas" and the system finds it, downloads it, and incorporates it into the notebook; no manual uploading, no copy-pasting. Teachers can organize materials anywhere they want; YouBook meets students where the content is.

Two autonomous background agents fire after every session, with no student intervention:

  1. Fact-Check Agent: Searches the web via You.com API to verify historical claims, theorem attributions, and dates in the notebook. Writes a structured report that the chat agent uses to flag issues in the next session. This runs automatically; the student never asks for it.
  2. Progress Narrative Agent: Rewrites a "tutor's journal" (progress.tex), a rich, introspective narrative about the student's intellectual journey. Not a checklist, but a living assessment: "The student has developed a remarkably intuitive grasp of compactness but still relies on mechanical procedure for epsilon-delta arguments..." Teachers could read this to understand exactly where a student stands; it's the kind of qualitative feedback that's impossible to scale manually.

These agents aren't features you toggle on. They're part of the system's loop: study, compile, verify, reflect, repeat.

The notebook is real: a 48+ page compiled PDF with table of contents, auto-generated index of definitions, theorem environments, color-coded summary boxes, and university-level Real Analysis content (5 lectures on ordered sets, Dedekind cuts, countability, metric topology, and compactness from Rudin's Principles of Mathematical Analysis).

How we built it

Architecture: OpenWebUI (forked, SvelteKit) as the frontend chat interface, connected via an SSE pipe to a FastAPI + Agno backend that reads, writes, and compiles a per-class LaTeX workspace. The backend owns the notebook state; the chat is just one surface.

The notebook system uses the LaTeX subfiles pattern: each lecture compiles standalone or as part of a master document. The agent has 6 tools: read_file, write_file, list_files, create_lecture, create_session, and compile_notes (pdflatex + makeindex, 3-pass compilation). PDFs are served inline via a /pdf/ endpoint. The notebook isn't a byproduct of conversation; it's the primary artifact. The conversation serves the notebook, not the other way around.
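
As a rough illustration of what a compile_notes tool might run, the sketch below builds one plausible command sequence: a pdflatex pass, makeindex, then two more pdflatex passes. The exact ordering, the main.tex file name, and the function names are assumptions; the writeup only states "pdflatex + makeindex, 3-pass compilation".

```python
import subprocess
from pathlib import Path


def build_commands(main_tex: str) -> list[list[str]]:
    """Return the command sequence for one full build (assumed ordering)."""
    stem = Path(main_tex).stem
    pdflatex = ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", main_tex]
    makeindex = ["makeindex", f"{stem}.idx"]
    # Pass 1 writes the .idx file, makeindex sorts it, passes 2 and 3
    # pull in the index and settle cross-references and the table of contents.
    return [pdflatex, makeindex, pdflatex, pdflatex]


def compile_notes(workdir: str, main_tex: str = "main.tex") -> bool:
    """Run the build inside workdir; stop at the first failing command."""
    for cmd in build_commands(main_tex):
        result = subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True
```

Stopping at the first non-zero exit code keeps a broken pass from being hidden by a later one.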

Composio as the universal data bridge. We built a custom Agno toolkit wrapping the Composio SDK that gives the system agentic access to external platforms. It doesn't just fetch files from a hardcoded source; it uses Composio's tool execution framework to dynamically search, list, and download from whatever platform the course uses. Google Drive today, Canvas tomorrow, Notion next week. The integration consists of three tools (find_file, list_files, download_file) that abstract over Composio's 500+ connectors, so adding a new data source is a configuration change, not a code change. OAuth is managed through Composio's dashboard, so we did not have to build any per-student auth flows.

Opinionated context loading. This is the core design decision. Each mode loads different slices of the notebook into the agent's context via regex parsing of .tex files: /Lec gets the preamble + template + last 2 full lectures; /Rev gets all lecture summaries + glossary; /Work gets the specific assignment + current submission + explainers. The student never manages context. The system decides what the agent needs to see, and that decision is what makes each mode behave differently, not the prompt.
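
The per-mode slicing could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the workspace layout (preamble.tex, template.tex, glossary.tex, lectures/, assignments/<name>/), the summary environment name, and the function names are all hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumes summaries live in a hypothetical "summary" environment.
SUMMARY_RE = re.compile(r"\\begin\{summary\}(.*?)\\end\{summary\}", re.DOTALL)


def extract_summaries(tex: str) -> list[str]:
    """Pull the body of every summary environment out of one .tex file."""
    return [body.strip() for body in SUMMARY_RE.findall(tex)]


def load_context(mode: str, workspace: Path, assignment: Optional[str] = None) -> list[str]:
    """Return the text chunks the agent should see for the given mode."""
    lectures = sorted((workspace / "lectures").glob("*.tex"))
    if mode == "lec":
        # Preamble + template + the last two full lectures.
        files = [workspace / "preamble.tex", workspace / "template.tex", *lectures[-2:]]
        return [f.read_text() for f in files if f.exists()]
    if mode == "rev":
        # Every lecture summary, then the glossary.
        chunks = [s for lec in lectures for s in extract_summaries(lec.read_text())]
        glossary = workspace / "glossary.tex"
        if glossary.exists():
            chunks.append(glossary.read_text())
        return chunks
    if mode == "work":
        # Only the files that belong to one assignment folder.
        if assignment is None:
            raise ValueError("work mode needs an assignment name")
        folder = workspace / "assignments" / assignment
        return [f.read_text() for f in sorted(folder.rglob("*.tex"))]
    raise ValueError(f"unknown mode: {mode}")
```

The prompt can stay nearly identical across modes in a design like this; what changes is which slice of the notebook the function returns.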

Background agents follow a fire-and-forget pattern: each is launched with asyncio.create_task() after the session-ending /Done mode. The fact-check agent is an Agno agent with YouComSearchTools (You.com Search API) and a limit of 30 tool calls. The progress agent is pure LLM generation: no tools, just synthesis. Both write back to the notebook filesystem, feeding into future sessions automatically.

Deployment: Docker image with Python 3.11 + TeX Live (755 MB), deployed on Render via a single render.yaml blueprint. The notebook is baked into the image. The OpenWebUI frontend is deployed separately on the same Render account.

Challenges we ran into

LaTeX compilation debugging was brutal. pdflatex writes errors to stdout, not stderr, so our compile_notes tool was checking the wrong stream and reporting "success" on failed builds. We also hit "Emergency stop / no legal \end found" errors when our progress agent wrapped LaTeX output in markdown code fences or truncated output by hitting the token limit, omitting \end{document}. We added fence-stripping and auto-appending of missing document endings.
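
Both fixes can be sketched in a few lines. This is an illustration with hypothetical function names, not the project's implementation: one helper scans pdflatex's stdout for error lines (TeX prefixes them with "!"), and the other strips markdown fences and restores a missing document ending.

```python
FENCE = "`" * 3  # a markdown code fence, built this way to keep the listing clean


def find_tex_errors(stdout: str) -> list[str]:
    """Return TeX error lines, which pdflatex prints to stdout with a '!' prefix."""
    return [line for line in stdout.splitlines() if line.startswith("!")]


def sanitize_latex(raw: str) -> str:
    """Strip markdown fences and append a missing document ending."""
    lines = raw.strip().splitlines()
    # Drop an opening fence such as a fence followed by "latex".
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    # Drop a closing fence; truncated output may not have one.
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    text = "\n".join(lines).strip()
    if "\\begin{document}" in text and "\\end{document}" not in text:
        text += "\n\\end{document}"
    return text + "\n"
```

Appending the document ending only prevents the "no legal \end found" abort; a truncated file can still contain unclosed environments, so the build log should be checked with find_tex_errors afterwards.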

LLM output parsing is inherently messy. The fact-check agent was instructed to output pure JSON, but models love wrapping their output in markdown json fences. We added fallback regex extraction and validation with Pydantic models. The progress agent had the same issue with LaTeX fences.
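
The fallback extraction might look like the sketch below. To stay dependency-free it validates with a standard-library dataclass rather than Pydantic, and the report schema (findings, claim, verdict, source) is a hypothetical one chosen for illustration.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical schema; the real report format is not given in the writeup.
    claim: str
    verdict: str
    source: str = ""


def extract_json(raw: str):
    """Parse raw as JSON, falling back to the outermost {...} or [...] span."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}|\[.*\]", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object or array found in model output")
    return json.loads(match.group(0))


def parse_report(raw: str) -> list[Finding]:
    """Turn model output into validated findings."""
    data = extract_json(raw)
    items = data["findings"] if isinstance(data, dict) else data
    # Unexpected or missing keys raise TypeError, a crude form of validation.
    return [Finding(**item) for item in items]
```

The greedy regex deliberately grabs from the first opening bracket to the last closing one, which is enough to skip a surrounding fence or a sentence of preamble.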

Wiring Composio's OAuth across platforms. Getting the system to authenticate against Google Drive through Composio's connected accounts required resolving user IDs dynamically: we query connected accounts at startup, find the active Google Drive connection, and pass the user ID through to every tool execution. The abstraction is clean now, but when find_file returned nothing, it was far from obvious that the cause was an inactive OAuth session.
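
One way to make that failure mode loud is to resolve the user ID once at startup and raise if no connection is active. The sketch below works on plain dictionaries and does not call the Composio SDK; the record fields (toolkit, status, user_id) and the "googledrive" and "ACTIVE" values are hypothetical placeholders for whatever the SDK actually returns.

```python
from typing import Iterable, Mapping


class NoActiveConnection(RuntimeError):
    """Raised when no usable connection exists for a toolkit."""


def resolve_user_id(accounts: Iterable[Mapping], toolkit: str = "googledrive") -> str:
    """Return the user ID of the first active connection for the toolkit."""
    matching = [a for a in accounts if a.get("toolkit") == toolkit]
    for account in matching:
        if account.get("status") == "ACTIVE":
            return account["user_id"]
    # Report what was found so an expired session is not mistaken for "no files".
    statuses = sorted({str(a.get("status")) for a in matching})
    raise NoActiveConnection(
        f"no active {toolkit} connection (statuses seen: {statuses or 'none'})"
    )
```

Failing at startup turns a silent empty search result into an explicit error that names the connection state.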

Accomplishments that we're proud of

The system has opinions, and they pay off. YouBook doesn't ask you how you want your notes organized, what format to use, or whether you'd like a quiz. It decides, based on the mode and the state of the notebook. That opinionation is what makes it feel like a workflow instead of a chatbot.

Autonomous fact-checking with You.com. The system independently verifies historical claims like "Hermite proved e is transcendental in 1873" and writes structured reports. It's incremental, only rechecking lectures modified since the last run.
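
The incremental behavior can be sketched with a small state file that records a fingerprint per lecture. The writeup does not say how modifications are detected, so the content hash, the state file, and the function names below are assumptions.

```python
import hashlib
import json
from pathlib import Path


def _fingerprint(path: Path) -> str:
    """Hash the file contents so any edit changes the fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def _load_state(state_file: Path) -> dict:
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def lectures_to_check(lecture_dir: Path, state_file: Path) -> list[Path]:
    """Return lectures that are new or changed since the last recorded run."""
    seen = _load_state(state_file)
    return [
        p for p in sorted(lecture_dir.glob("*.tex"))
        if seen.get(p.name) != _fingerprint(p)
    ]


def mark_checked(paths: list[Path], state_file: Path) -> None:
    """Record the current fingerprint of every lecture that was just checked."""
    seen = _load_state(state_file)
    for p in paths:
        seen[p.name] = _fingerprint(p)
    state_file.write_text(json.dumps(seen, indent=2))
```

Hashing contents instead of comparing modification times keeps the check stable across redeployments, where file timestamps are often reset.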

Platform-agnostic data import. Because we built on Composio rather than hardcoding Google Drive API calls, YouBook can connect to any platform a course uses. A professor who puts slides on Drive, problem sets on Canvas, and reading lists on Notion doesn't need to change anything; the student just asks the system to grab what they need.

The \defn{term} auto-indexing system. Every time the system writes a definition, it automatically appears in the compiled index. The notebook maintains itself.

What we learned

Opinionated systems beat flexible assistants. Every design choice that removed optionality made YouBook better. Fixed LaTeX format instead of "choose your output." Automatic fact-checking instead of "would you like me to verify?" Structured modes instead of freeform chat. Students don't need another open-ended assistant; they need a system that knows what good studying looks like and enforces it.

Background agents need guardrails. Without tool_call_limit=30, the fact-check agent would loop forever searching for increasingly obscure claims. Without markdown-fence stripping, the progress agent would corrupt the LaTeX notebook. Defensive coding around LLM output is essential.

LaTeX is worth the complexity. We started with a markdown-first plan but pivoted to LaTeX-native storage. The compiled PDFs with theorem environments, color-coded boxes, and auto-generated indexes are dramatically more useful than markdown renders. The subfiles pattern makes it manageable.

Integration platforms are a multiplier. Building one Composio toolkit gave us access to 500+ services. The alternative, writing OAuth flows and API wrappers for Google Drive, then Canvas, then Notion, would have eaten our entire hackathon. Composio turned "connect to everything" from a multi-week project into a 45-minute integration.

What's next for YouBook

  • Canvas and Notion connectors: The Composio architecture is ready; we just need to activate more connected accounts. Students could say "sync my Canvas assignments" and the notebook auto-populates with due dates, problem sets, and rubrics.
  • Teacher dashboard: Surface the progress narratives to instructors. A teacher could see at a glance which students are thriving, which are struggling with specific concepts, and where the class as a whole needs more attention. The data already exists; it just needs a surface.
  • Persistent storage: Move from baked-in Docker images to Render persistent disks so student work survives redeployments.
  • Spaced repetition: The progress agent already tracks mastery; next step is scheduling review sessions based on forgetting curves. The system would tell you when to study, not wait for you to ask.
  • Collaborative notebooks: Multiple students contributing to the same class notebook, with the system merging and deduplicating content.
  • Two-way sync: Push compiled study guides back to Google Drive or Notion so students can access their notebook from anywhere.
