Inspiration

Every university student knows the feeling: you're lost in a 300-person lecture, too afraid to raise your hand, and office hours are packed. Professors want to help every student individually, but there simply aren't enough hours in the day. We asked ourselves — what if every student could have a 1-on-1 session with their professor, powered by AI, right inside Zoom?

Google Drive folder with a demo video and additional materials for this project: https://drive.google.com/drive/folders/1erRwy80BjeV3pBOeuafkrem6VlTP2hqX?usp=sharing

What it does

LectureFlow takes a professor's live lecture and turns it into a full, personalized learning experience — all delivered through Zoom, the platform students and professors already use.

The pipeline works like this:

  1. Lecture In — A professor delivers their lecture over Zoom as they normally would. LectureFlow captures the audio and video via Zoom's Real-Time Media Streams (RTMS) API, transcribes it, and uses an LLM to break the content into discrete conceptual scenes.
  2. Animated Explainers Out — For each concept, the system automatically generates a 3Blue1Brown-style Manim animation with voice-cloned narration from the professor. These aren't generic — they're visual explanations tailored to the specific ideas covered in that day's lecture.
  3. Quizzes Linked to Concepts — From those same concepts, the system generates multiple-choice quiz questions. Each question is tied back to the explainer video that covers it. Students type /quiz in Zoom Team Chat and get an adaptive quiz — when they answer wrong, they watch the specific 2-minute animated explanation for that concept before moving on.
  4. AI Professor in Every Breakout Room — Using HeyGen's Interactive Avatar SDK, we clone the professor's face and voice into a lifelike AI avatar deployed via LiveKit streaming. LectureFlow creates Zoom breakout rooms and places a personalized AI tutor in each one. The avatar has full context of that day's lecture material — updated in real-time from RTMS transcripts — uses Socratic questioning, and adapts to each student's pace. Because it looks and sounds like their actual professor, students engage with it naturally rather than treating it like a generic chatbot.
  5. Real-Time Professor Dashboard — The professor monitors everything from an Electron app: live transcripts from the lecture, quiz performance across the class, and which concepts students are struggling with.
  6. Live Student Sentiment — By running expression recognition models on periodically sampled video frames from Zoom's RTMS feed, LectureFlow gives professors a real-time, aggregated view of classroom engagement — surfacing confusion, disengagement, or frustration as it happens, not after the fact.
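
A minimal sketch of how that sentiment aggregation can work, assuming an upstream expression model that labels each sampled frame (the label set and class names here are illustrative, not our exact implementation):

```python
from collections import Counter, deque
from dataclasses import dataclass
import time

# Illustrative label set; the real expression model may expose different classes.
NEGATIVE = {"confused", "frustrated", "bored"}

@dataclass
class FrameObservation:
    student_id: str
    label: str        # output of the expression-recognition model for one frame
    timestamp: float

class EngagementTracker:
    """Rolls per-frame expression labels up into a class-wide engagement view."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        self.observations: deque[FrameObservation] = deque()

    def add(self, obs: FrameObservation) -> None:
        self.observations.append(obs)
        # Drop anything older than the rolling window.
        while self.observations and obs.timestamp - self.observations[0].timestamp > self.window_seconds:
            self.observations.popleft()

    def summary(self) -> dict:
        """Aggregate view the professor dashboard can poll."""
        counts = Counter(o.label for o in self.observations)
        total = sum(counts.values()) or 1
        negative = sum(counts[label] for label in NEGATIVE)
        return {"labels": dict(counts), "negative_ratio": negative / total}

# Each sampled RTMS frame is classified upstream, then fed in here.
tracker = EngagementTracker()
tracker.add(FrameObservation("student-42", "confused", time.time()))
print(tracker.summary())
```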

The key insight is that everything flows from the same source material — that day's lecture. The lecture produces the concepts, the concepts produce the animations, the animations back the quizzes, the quizzes inform the tutoring, and the professor sees it all in one place.
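
Concretely, each concept becomes one record that every downstream artifact points back to. A minimal sketch of that shared structure (field names are illustrative, not our actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class QuizQuestion:
    prompt: str
    choices: list[str]      # the A/B/C/D options surfaced in Zoom Team Chat
    correct_index: int

@dataclass
class LectureConcept:
    """One conceptual scene extracted from the lecture transcript.

    The Manim explainer, the quiz questions, and the avatar's tutoring
    context all hang off this record, so upstream changes propagate cleanly.
    """
    concept_id: str
    title: str
    transcript_span: str                      # the lecture text this concept came from
    explainer_video_path: str | None = None   # rendered Manim clip with cloned narration
    questions: list[QuizQuestion] = field(default_factory=list)

def remediation_video(concept: LectureConcept) -> str | None:
    # When a student misses a question on this concept, the quiz bot replays this clip.
    return concept.explainer_video_path
```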

For demo purposes at TreeHacks, we use YouTube lecture links as input since we can't host a live lecture on the spot — but the pipeline is the same regardless of whether the audio comes from a live Zoom session or a recording.
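
For that demo path, grabbing the audio track of a recorded lecture is a single yt-dlp call before the normal pipeline takes over. A rough sketch using standard yt-dlp options (not necessarily the exact settings we use):

```python
import yt_dlp  # pip install yt-dlp; audio extraction also needs ffmpeg installed

def download_lecture_audio(url: str, out_dir: str = "lectures") -> None:
    """Fetch just the audio stream of a recorded lecture for transcription."""
    options = {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",
        # Convert to a plain audio file that Whisper can consume directly.
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download([url])
```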

How we built it

LectureFlow is four microservices working together:

  • Manim Video Pipeline (Python + asyncio) — Ingests lecture audio, transcribes with Whisper, uses an LLM to plan conceptual scenes, generates Manim animation code for each, renders in parallel, and stitches them with voice-cloned narration (ElevenLabs/PocketTTS). Quiz questions are generated from the same scene plan, each linked back to its explainer video.
  • Python Backend (FastAPI) — The orchestration layer. Manages session lifecycle, creates HeyGen interactive avatar sessions with real-time lecture context, serves adaptive quizzes through a Zoom Team Chat chatbot, processes video frames for student sentiment analysis, and stores everything in a SQLAlchemy database.
  • Zoom RTMS Bridge (Node.js on Render) — A Render-hosted service that receives Zoom webhooks, opens Real-Time Media Stream connections to live meetings, and forwards transcript chunks, chat messages, and video frames to the Python backend over HTTP and WebSocket. It also triggers quiz DMs to students when they first speak in the meeting. Render was essential here — it gave us a publicly reachable endpoint for Zoom webhooks, solving the NAT traversal problem, while relaying events over WebSocket to our local backend (the ingest sketch after this list shows the receiving side).
  • Electron App (React + TypeScript) — The professor's control center. Create sessions, monitor live transcripts in real-time, trigger quizzes, interact with HeyGen avatars, and review class-wide analytics — all in a frosted-glass UI.
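
To make the bridge-to-backend handoff concrete, here is a simplified sketch of the transcript ingest endpoint on the Python side (the route name and in-memory store are illustrative; the real backend persists sessions through SQLAlchemy):

```python
from collections import defaultdict

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Illustrative in-memory store; the actual backend writes to SQLAlchemy models.
transcripts: dict[str, list[str]] = defaultdict(list)

class TranscriptChunk(BaseModel):
    meeting_id: str
    speaker: str
    text: str
    timestamp_ms: int

@app.post("/rtms/transcript")
async def ingest_transcript(chunk: TranscriptChunk) -> dict:
    """Called by the Node.js RTMS bridge for every transcript segment.

    The stored text feeds the professor dashboard and is pushed into the
    HeyGen avatar's context so the tutor keeps up with the live lecture.
    """
    transcripts[chunk.meeting_id].append(f"{chunk.speaker}: {chunk.text}")
    return {"stored": len(transcripts[chunk.meeting_id])}
```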

Challenges we ran into

  • Manim Code Generation — LLMs frequently generate Manim code that doesn't compile. We built a retry system with error feedback loops, parallel rendering with semaphores, and fallback strategies to keep the pipeline robust (see the retry sketch after this list).
  • Zoom RTMS Integration — Getting real-time audio, video, and transcripts out of Zoom required navigating the RTMS WebSocket protocol — handling media handshakes with S2S OAuth signatures, parsing mixed audio/video/transcript/chat message types, and maintaining persistent connections with keep-alive heartbeats every 100ms.
  • HeyGen Avatar Context Injection — Making the HeyGen avatar feel like a real extension of the professor meant continuously injecting lecture context from RTMS transcripts into the avatar's knowledge base mid-session, so it could reference what was just taught moments ago rather than relying on static prompts (see the context-feeding sketch after this list).
  • Unifying the Pipeline — The hardest design problem was making the concept graph the single source of truth. Scene planning, video generation, quiz questions, and tutoring context all needed to reference the same conceptual breakdown of the lecture, so changes upstream propagate cleanly.
  • Interactive Team Chat Quizzes — Building a stateful, per-student quiz experience inside Zoom Team Chat meant managing quiz sessions across webhook-driven interactions — tracking progress, handling button clicks for A/B/C/D answers, generating follow-up questions on wrong answers, and linking back to the exact explainer video for each concept.
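
A stripped-down sketch of that per-student quiz state (webhook parsing and Team Chat message formatting omitted; names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class QuizItem:
    """One question plus the explainer clip that teaches its concept."""
    prompt: str
    choices: list[str]
    correct_index: int
    concept_id: str
    explainer_video_path: str

@dataclass
class QuizSession:
    """Tracks one student's progress through the adaptive quiz."""
    student_id: str
    items: list[QuizItem]
    index: int = 0
    wrong_concepts: list[str] = field(default_factory=list)

sessions: dict[str, QuizSession] = {}

def handle_answer(student_id: str, choice_index: int) -> dict:
    """Invoked when an A/B/C/D button click arrives from the Team Chat webhook."""
    session = sessions[student_id]
    item = session.items[session.index]

    if choice_index == item.correct_index:
        session.index += 1
        done = session.index >= len(session.items)
        return {"action": "finished" if done else "next_question"}

    # Wrong answer: surface the explainer video for this concept, then retry.
    session.wrong_concepts.append(item.concept_id)
    return {"action": "show_video", "video": item.explainer_video_path}
```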
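
The Manim retry loop, condensed (the LLM and render calls are stand-ins for our actual functions):

```python
import asyncio

MAX_PARALLEL_RENDERS = 4
MAX_ATTEMPTS = 3
semaphore = asyncio.Semaphore(MAX_PARALLEL_RENDERS)

async def generate_manim_code(concept: str, previous_error: str | None) -> str:
    """Stand-in for the LLM call; the last attempt's error is fed back into the prompt."""
    ...

async def render_scene(code: str) -> str:
    """Stand-in for running Manim; raises RuntimeError with the render log on failure."""
    ...

async def render_concept(concept: str) -> str | None:
    """Render one concept's explainer, retrying with error feedback."""
    error: str | None = None
    async with semaphore:                       # cap how many renders run in parallel
        for _attempt in range(MAX_ATTEMPTS):
            code = await generate_manim_code(concept, previous_error=error)
            try:
                return await render_scene(code)
            except RuntimeError as exc:         # compile/render failure
                error = str(exc)                # feed the error into the next prompt
    return None                                 # fallback: skip or substitute a static slide

async def render_all(concepts: list[str]) -> list[str | None]:
    return await asyncio.gather(*(render_concept(c) for c in concepts))
```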
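
And the context-feeding pattern for the avatar. Only the buffering and flush cadence are shown; push_context_to_avatar is a placeholder for whatever call actually injects text into the HeyGen session's knowledge base, not a real SDK method:

```python
import asyncio

async def push_context_to_avatar(session_id: str, context: str) -> None:
    """Placeholder for the call that updates the avatar's knowledge mid-session."""
    ...

class AvatarContextFeeder:
    """Buffers live transcript text and periodically flushes it to the avatar."""

    def __init__(self, session_id: str, flush_interval: float = 15.0):
        self.session_id = session_id
        self.flush_interval = flush_interval
        self.buffer: list[str] = []

    def on_transcript_chunk(self, text: str) -> None:
        # Called for every RTMS transcript segment the bridge forwards.
        self.buffer.append(text)

    async def run(self) -> None:
        while True:
            await asyncio.sleep(self.flush_interval)
            if not self.buffer:
                continue
            recent = " ".join(self.buffer)
            self.buffer.clear()
            await push_context_to_avatar(self.session_id, recent)
```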

Accomplishments that we're proud of

  • A single pipeline that takes one lecture and produces animated explainers, targeted quizzes, and personalized AI tutoring — all interconnected
  • Real-time lecture understanding via Zoom RTMS that feeds transcripts directly into HeyGen avatar context, so the AI tutor always knows what was just taught — not a static bot, but one that evolves with the lecture
  • A fully automated system that generates 3Blue1Brown-quality animated videos with voice-cloned narration from any lecture
  • An interactive quiz system inside Zoom Team Chat that plays the exact explainer video for concepts a student gets wrong
  • Live student sentiment analysis from RTMS video frames, giving professors real-time engagement visibility
  • ~11,000 lines of production-quality code across Python, TypeScript, and JavaScript — built in one weekend

What we learned

  • Zoom's API ecosystem is remarkably deep — RTMS for live media, Team Chat for interactive bots, S2S OAuth for server integration — and once you understand how the pieces fit together, it enables workflows that wouldn't be possible on any other platform.
  • HeyGen's Interactive Avatar SDK made something possible that we couldn't have built from scratch in a weekend: a real-time, lifelike video avatar that students actually want to talk to. The ability to inject context mid-session was the key to making it feel like a real tutor, not a canned demo.
  • Render gave us instant public deployment for our RTMS bridge — no infrastructure headaches, just a webhook URL that worked. For a hackathon, that speed matters.
  • LLM-generated code (especially Manim) needs robust validation and retry mechanisms; you can't trust it to compile on the first try.
  • The biggest leverage in an AI education tool isn't any single feature — it's making them share the same conceptual backbone so the experience feels cohesive, not like five disconnected tools.

What's next for LectureFlow

  • RAG-Powered Context Engine — Letting professors upload syllabi, textbooks, and slides so the AI tutor can reference specific course materials during conversations.
  • On-the-Fly Quiz Generation — Generating quizzes from live lecture transcripts in real-time, closing the loop even further.
  • Multi-Language Support — Translating explainer videos and avatar responses for international students.
  • Production Deployment — Dockerized horizontal scaling with PostgreSQL and a polished onboarding flow for professors.

Built With

  • Languages: Python, Node.js, TypeScript, JavaScript
  • Frameworks: FastAPI, Express.js, React, Electron, Tailwind CSS
  • Libraries: Manim, SQLAlchemy, Zustand, Puppeteer, asyncio
  • APIs & AI Services: Zoom REST API, Zoom RTMS, Zoom Meeting SDK, HeyGen Interactive Avatar API, OpenAI, Cerebras (Llama 3.3 70B), Deepgram, PocketTTS, Whisper
  • Platforms & Infrastructure: Render
  • Databases: SQLite, PostgreSQL
  • Tools: yt-dlp, FFmpeg, Chromium (headless), uv