Inspiration
The primary spark for this project came from a common developer frustration: onboarding friction. I often find myself wanting to contribute to open-source repositories, but I am frequently overwhelmed by dense, text-heavy README files and complex folder structures. I realized that if I could transform a repository into a multimodal experience, I could lower the barrier to entry for everyone. By using Gemini 3 Pro to generate interactive Mermaid.js workflow diagrams and synchronized audio briefings, getting started becomes as easy as watching a short, narrated walkthrough rather than reading a 5,000-word document.
What it does
The Github Repository Auditor is an agentic tool that instantly digests complex codebases to provide a multimodal onboarding experience. It generates interactive Mermaid.js architectural diagrams, high-fidelity audio briefings with synchronized transcripts, and a deep-reasoning security audit that identifies critical logic flaws and provides actionable remediation snippets.
How we built it
The application follows an "Agentic Workflow" designed in Google AI Studio:
- Code Analysis: The system ingests the repository files using Gemini 3's 1-million-token context window.
- Architectural Derendering: Using the model's spatial reasoning, it "derenders" the code structure into a structured JSON schema.
- Visual & Audio Synthesis:
  - Gemini 3 Pro generates the Mermaid.js code for system architecture.
  - Gemini 2.5 Flash-TTS creates the "Audio Briefing" using the professional Kore voice profile.
- Security Auditing: I utilized the Thinking Mode with a thinking_level="high" setting to perform deep logic-trace audits for vulnerabilities.
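To make the "derender to JSON, then draw" step concrete, here is a minimal sketch of turning a structured description into Mermaid.js flowchart text. The schema shape (`modules` and `edges` keys) and the function names are assumptions for illustration only. In the project itself, Gemini 3 Pro generates the Mermaid code.

```python
import re


def _node_id(name: str) -> str:
    """Turn an arbitrary module name into a Mermaid-safe identifier."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def to_mermaid(architecture: dict) -> str:
    """Render a hypothetical {"modules": [...], "edges": [...]} dict as a flowchart."""
    lines = ["flowchart TD"]
    # One node per module; the quoted label keeps the original file name readable.
    for module in architecture["modules"]:
        lines.append(f'    {_node_id(module["name"])}["{module["name"]}"]')
    # One arrow per dependency edge.
    for edge in architecture["edges"]:
        lines.append(f'    {_node_id(edge["from"])} --> {_node_id(edge["to"])}')
    return "\n".join(lines)


example = {
    "modules": [{"name": "app.py"}, {"name": "audit/engine.py"}],
    "edges": [{"from": "app.py", "to": "audit/engine.py"}],
}
print(to_mermaid(example))
```

Keeping the intermediate JSON separate from the diagram text means the same structure could also feed the audio briefing and the audit views.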
Challenges we ran into
The most significant hurdle was API Quota Management. During development, the Gemini 3 Pro "Preview" quota was frequently exhausted by the heavy token load of repo-wide analysis. To solve this, I implemented an Exponential Backoff and Key Rotation strategy. I also optimized token usage by calculating the "Thinking Budget" dynamically. Setting the thinkingBudget was a trade-off: a larger budget deepens the security audit, but it slows the response and consumes more quota.
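The backoff and key rotation idea can be sketched roughly as follows. This is a simplified illustration, not the project's actual code: `QuotaExhausted` is a hypothetical stand-in for whatever rate-limit error the API client raises, and `request` is any callable that takes an API key.

```python
import itertools
import random
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for the client's rate-limit or quota error."""


def call_with_backoff(
    request: Callable[[str], T],
    api_keys: Sequence[str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `request`, rotating keys and doubling the wait after each failure."""
    if not api_keys or max_attempts < 1:
        raise ValueError("need at least one API key and one attempt")
    keys = itertools.cycle(api_keys)  # round-robin over the available keys
    for attempt in range(max_attempts):
        key = next(keys)
        try:
            return request(key)
        except QuotaExhausted:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

Rotating to the next key before waiting means a fresh quota bucket is tried on each retry, while the growing delay gives exhausted keys time to recover.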
Accomplishments that we're proud of
- Deep Reasoning Integration: Successfully implemented Gemini 3’s Thinking Mode to trace cross-file logic, moving beyond simple pattern matching to find deep-seated architectural risks.
- Multimodal Orchestration: Seamlessly synced Gemini 2.5 Flash-TTS audio with a dynamic "single-line" transcript UI for a "hands-free" developer briefing.
- Agentic Reliability: Achieved 100% UI parsing accuracy by utilizing Strict JSON Structured Outputs, ensuring the audit data remains consistent across complex repository structures.
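As an illustration of why strict structured output helps the UI, here is a small sketch of validating a model response before rendering it. The `Finding` fields and the severity values are assumptions made up for this example, not the project's actual schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical audit fields; the real schema may differ.
    file: str
    severity: str
    description: str
    remediation: str


ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


def parse_findings(raw: str) -> list[Finding]:
    """Parse a JSON array of findings, rejecting anything off-schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    findings = []
    for index, item in enumerate(data):
        try:
            finding = Finding(**item)  # fails on missing or unexpected keys
        except TypeError as exc:
            raise ValueError(f"finding {index} does not match the schema") from exc
        values = (finding.file, finding.severity, finding.description, finding.remediation)
        if not all(isinstance(value, str) for value in values):
            raise ValueError(f"finding {index} has a non-string field")
        if finding.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"finding {index} has unknown severity {finding.severity!r}")
        findings.append(finding)
    return findings
```

With a check like this at the boundary, the UI code only ever sees well-formed findings, and a malformed response fails loudly instead of rendering a broken report.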
What we learned
- Context Window Mastery: Learned to leverage the 1M token context window to ingest entire repositories without losing the "thread" of the codebase's logic.
- Thinking Budgets: Discovered the mathematical balance of $T_{thought}$ (Thought Tokens) to optimize for both deep security analysis and API quota efficiency.
- Vibe Coding Workflow: Shifted from manual boilerplate writing to high-level intent description using AI Studio’s Build mode, drastically accelerating our development cycle.
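One way to picture the Thinking Budgets trade-off is a simple clamped heuristic: scale the thought-token budget with the size of the input, keep it within fixed bounds, and never exceed the remaining quota. The ratio, floor, and ceiling below are illustrative assumptions, not the values the project used.

```python
def thinking_budget(
    input_tokens: int,
    remaining_quota: int,
    ratio: float = 0.1,      # assumed share of input size to spend on thinking
    floor: int = 1024,       # assumed minimum useful budget
    ceiling: int = 32768,    # assumed upper bound to keep latency acceptable
) -> int:
    """Return a thought-token budget (T_thought) for one audit request."""
    proposed = int(input_tokens * ratio)
    budget = max(floor, min(ceiling, proposed))
    # Never plan to spend more than the quota left after the input itself.
    headroom = max(0, remaining_quota - input_tokens)
    return min(budget, headroom)


print(thinking_budget(200_000, 1_000_000))  # 20000
print(thinking_budget(900_000, 910_000))    # 10000
```

Larger repositories get more room to reason, but the ceiling bounds response time and the headroom check keeps a single audit from draining the quota.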
What's next for Github Repository Auditor
- Interactive Chat-over-Code: Integrating the Gemini Live API to allow developers to ask verbal questions about the repository while viewing the diagrams.
- Automated PR Fixes: Expanding the audit feature into an autonomous agent that can open GitHub Pull Requests with the suggested security fixes.
- Expanded Modalities: Adding Veo 3 video generation to create "tours" of the UI and frontend components directly from the source code.
Built With
- gemini3
- genai
- github-api
- google-ai-studio
- google-cloud-run
- json
- mermaid.js
- sdk
- streamlit
- streamlitaudio