Inspiration
We realized that human thought is recursive: we circle back, edit, and refine our ideas constantly. Yet every transcription tool on the market is linear, producing an endless wall of text. We wanted to build a tool that thinks the way we do: iteratively.
As we refined the idea at the start of the hackathon, we grew more convinced that live, self-updating notes are essential for iterating on ideas efficiently, and that motivated us to turn Latent Loop into reality.
What it does
Latent Loop is a recursive note-taking application that listens to your conversations and intelligently updates a single Markdown file in real-time. Unlike traditional transcription tools that linearly append text, Latent Loop understands context and updates existing sections when you circle back to previous topics.
Here's how it works:
- Continuous Voice Recording: Records audio in 7-second chunks for processing
- Smart Transcription: Uses Groq's Whisper API to transcribe audio to text
- Intelligent Routing: Uses ChromaDB vector search to find the most relevant section in your notes (similarity threshold ≥ 0.61)
- Context-Aware Updates: Employs Google Gemini to synthesize new information into existing sections or create new ones
- Real-Time Sync: Updates are broadcast via Server-Sent Events (SSE) to the frontend for instant visual feedback
- Visual Feedback: Changed sections are highlighted and scroll into view with a typewriter animation effect
The app handles contradictions elegantly: if you say "actually, use X instead of Y," it strikes through the superseded information and records the replacement cleanly.
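The routing step above can be sketched as a simple threshold decision. This is an illustrative Python sketch, not the project's actual code: the `route` helper and the toy 2-D vectors are made up for demonstration (the real app uses FastEmbed embeddings via ChromaDB), but the decision rule mirrors the one described, with 0.61 as the cutoff.

```python
# Illustrative routing decision: attach the transcript to the most similar
# existing section, or create a new section when nothing is similar enough.
import math

SIM_THRESHOLD = 0.61  # empirically chosen cutoff from the writeup

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def route(transcript_vec, section_vecs):
    """Index of the best-matching section, or None to create a new one."""
    if not section_vecs:
        return None
    best = max(range(len(section_vecs)),
               key=lambda i: cosine(transcript_vec, section_vecs[i]))
    if cosine(transcript_vec, section_vecs[best]) >= SIM_THRESHOLD:
        return best
    return None

# Toy 2-D vectors standing in for real sentence embeddings:
sections = [[1.0, 0.0], [0.0, 1.0]]
print(route([0.9, 0.1], sections))    # clearly matches section 0
print(route([0.5, -0.87], sections))  # below threshold -> new section (None)
```

With real embeddings the same rule applies: anything scoring below the threshold against every section becomes a fresh heading in the notes file.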
How we built it
Backend:
- Framework: Flask with Flask-CORS for the web server
- Transcription: Groq's Whisper API (`whisper-large-v3`) for fast, accurate speech-to-text
- Synthesis: Google Gemini Flash 3 for context-aware note updates
- Vector Search: ChromaDB with FastEmbed (`BAAI/bge-small-en-v1.5`) for local embeddings to match transcripts to existing sections
- Architecture: Single-source-of-truth design; all notes live in `projects/{project-name}.md` files
- Processing Queue: FIFO queue worker that processes transcripts sequentially
- Real-time Updates: Server-Sent Events (SSE) for pushing updates to connected clients
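The SSE side of the backend can be summarized in a few lines. This is a hedged sketch, not Latent Loop's actual source: the `format_sse`, `broadcast`, and `event_stream` names are invented for illustration, assuming the common Flask pattern of one queue per connected client and a generator returned with `mimetype='text/event-stream'`.

```python
# Minimal SSE broadcast sketch: each connected client owns a queue; the
# processing pipeline calls broadcast() and every client's generator
# yields the formatted event to the browser.
import json
import queue

clients = []  # one Queue per connected SSE client

def format_sse(data, event=None):
    """Serialize a payload into the text/event-stream wire format."""
    msg = f"data: {json.dumps(data)}\n\n"
    if event:
        msg = f"event: {event}\n{msg}"
    return msg

def broadcast(payload, event="section_updated"):
    """Push one event to every connected client's queue."""
    for q in clients:
        q.put(format_sse(payload, event))

def event_stream(q):
    """Generator a Flask route would return with mimetype='text/event-stream'."""
    while True:
        yield q.get()

# Example: one connected client receives a broadcast update
q = queue.Queue()
clients.append(q)
broadcast({"section": "How we built it"})
print(q.get_nowait())
```

The browser's `EventSource` then receives each `section_updated` event and triggers the highlight-and-scroll behavior described above.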
Frontend:
- Vanilla JavaScript with Tailwind CSS for rapid iteration without build steps
- Marked.js for Markdown rendering
- MediaRecorder API for browser-based audio capture
- SSE Client for real-time updates from the server
- Typewriter Effect: Custom animation that types out new/updated content character-by-character
- Split-pane UI: Left pane shows live transcript stream, right pane displays the rendered Markdown notes
Data Flow:
- Audio → `/api/audio` → Groq Whisper → transcript
- Transcript → Queue → Vector search (ChromaDB) → Find section
- Section + transcript → Gemini → Updated content
- Updated content → Write to notes.md → Sync ChromaDB → Broadcast SSE event
- Frontend receives SSE → Renders update → Scrolls & animates changed section
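The flow above hinges on the FIFO worker: chunks are enqueued as they arrive and processed strictly in order, so a slow chunk can never be overtaken by a later one. This is a minimal sketch under stated assumptions; the `process` function and `processed` list are placeholders for the real pipeline (vector search, Gemini synthesis, file write, SSE broadcast).

```python
# FIFO worker sketch: a single background thread drains the queue in
# arrival order, guaranteeing chronological processing of transcripts.
import queue
import threading

transcript_queue = queue.Queue()
processed = []  # stands in for writing projects/{name}.md + SSE broadcast

def process(transcript):
    # placeholder for: vector search -> Gemini -> write file -> broadcast
    processed.append(transcript.upper())

def worker():
    while True:
        item = transcript_queue.get()
        if item is None:  # sentinel to shut the worker down
            break
        process(item)
        transcript_queue.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
for chunk in ["first chunk", "second chunk", "third chunk"]:
    transcript_queue.put(chunk)
transcript_queue.join()   # wait until all enqueued chunks are processed
transcript_queue.put(None)
print(processed)  # order preserved: ['FIRST CHUNK', 'SECOND CHUNK', 'THIRD CHUNK']
```

A single consumer thread is the simplest way to get the ordering guarantee; it also sidesteps concurrent writes to the notes file.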
Challenges we ran into
We iterated with Gemini to refine the idea of Latent Loop, and relied heavily on Claude Opus 4.5 through GitHub Copilot to build the project. Along the way, we faced several challenges:
- The first implementation of Latent Loop kept creating a new section for every transcribed chunk. The root cause was a similarity threshold set to 0.95 instead of something more reasonable (roughly 0.5-0.7), the result of a conflict between our prompts to Claude and the ideas we had refined with Gemini. We fixed it by overriding the generated code, empirically testing for an accurate threshold, and tightening our prompts to Claude.
- If two recordings were processed at the same time, the program tended to crash, or the later one could overwrite content from the earlier one. We solved this by adding a queue so that transcripts are processed in chronological order.
- We struggled to animate modifications to the live notes. We tested different animation styles for the changed text, brainstormed the best approach, and wrote detailed implementation instructions for the style we chose.
Last but not least, each request to Claude Opus 4.5 through GitHub Copilot took several minutes to complete. To balance thoroughness with speed, we collaborated through Live Share in VS Code and took turns writing code and prompting generation, splitting our time across coding, testing, and documenting the project with user testimonials.
Accomplishments that we're proud of
We are proud that we built a fully functioning MVP of our idea! In particular, getting the transcription, LLM synthesis, section classification, and note-taking pieces to work together seamlessly was a massive win.
What we learned
We learned the importance of proper time management, especially while our underpaid intern (Claude Opus 4.5) works around the clock.
What's next for Latent Loop
In the future, we want to let users edit and update the currently read-only notes. We also plan to improve the UI for better accessibility, fewer distractions, and more relevant animations.

