Inspiration

The primary spark for this project came from a common developer frustration: onboarding friction. I often find myself wanting to contribute to open-source repositories, but I am frequently overwhelmed by dense, text-heavy README files and complex folder structures. I realized that if I could transform a repository into a multimodal experience, I could lower the barrier to entry for everyone. By using Gemini 3 Pro to generate interactive Mermaid.js workflow diagrams and synchronized audio briefings, getting started becomes as easy as watching a short, narrated walkthrough rather than reading a 5,000-word document.

What it does

The Github Repository Auditor is an agentic tool that instantly digests complex codebases to provide a multimodal onboarding experience. It generates interactive Mermaid.js architectural diagrams, high-fidelity audio briefings with synchronized transcripts, and a deep-reasoning security audit that identifies critical logic flaws and provides actionable remediation snippets.

How we built it

The application follows an "Agentic Workflow" designed in Google AI Studio:

  • Code Analysis: The system ingests the repository files using Gemini 3's 1-million-token context window.
  • Architectural Derendering: Using the model's spatial reasoning, it "derenders" the code structure into a structured JSON schema.
  • Visual & Audio Synthesis: Gemini 3 Pro generates the Mermaid.js code for system architecture. Gemini 2.5 Flash-TTS creates the "Audio Briefing" using the professional Kore voice profile.
  • Security Auditing: I utilized the Thinking Mode with a thinking_level="high" setting to perform deep logic-trace audits for vulnerabilities.
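To make the step from structured JSON to a diagram concrete, here is a minimal sketch of turning an architecture description into Mermaid.js flowchart source. In the project itself Gemini 3 Pro writes the Mermaid code; this deterministic version only illustrates the target format. The payload shape (`nodes` with `id` and `label`, `edges` with `from` and `to`) and the function name are assumptions made for illustration, not the project's actual schema.

```python
def architecture_to_mermaid(architecture: dict) -> str:
    """Render an architecture dict (hypothetical shape) as Mermaid flowchart source."""
    lines = ["flowchart TD"]
    for node in architecture["nodes"]:
        # Quote the label so file paths such as "src/app.py" are treated as plain text.
        lines.append(f'    {node["id"]}["{node["label"]}"]')
    for edge in architecture["edges"]:
        # One directed arrow per dependency between two node ids.
        lines.append(f'    {edge["from"]} --> {edge["to"]}')
    return "\n".join(lines)


# Illustrative payload only; the real schema may differ.
example = {
    "nodes": [
        {"id": "ui", "label": "app.py"},
        {"id": "audit", "label": "audit/engine.py"},
    ],
    "edges": [{"from": "ui", "to": "audit"}],
}
print(architecture_to_mermaid(example))
```

The returned string can be handed to any Mermaid renderer, for example an HTML component embedded in the Streamlit page.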

Challenges we ran into

The most significant hurdle was API quota management. During development, the Gemini 3 Pro "Preview" quota was frequently exhausted by the heavy token load of repo-wide analysis. To solve this, I implemented an exponential backoff and key rotation strategy: failed requests are retried after progressively longer waits while cycling through the available API keys. I also reduced token usage by calculating the "Thinking Budget" dynamically. Tuning the thinkingBudget was a trade-off between the depth of the security audit and the speed of the response.
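The retry strategy described above can be sketched roughly as follows. This is not the project's code: `QuotaExceeded` is a hypothetical stand-in for whatever rate-limit error the SDK raises, `request` is any callable that takes an API key, and the delay values are arbitrary defaults.

```python
import itertools
import random
import time


class QuotaExceeded(Exception):
    """Hypothetical stand-in for the SDK's quota or rate-limit error."""


def call_with_backoff(request, api_keys, max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call request(key), rotating keys and backing off exponentially on quota errors."""
    keys = itertools.cycle(api_keys)  # round-robin over the available keys
    for attempt in range(max_attempts):
        key = next(keys)
        try:
            return request(key)
        except QuotaExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Double the wait on each failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Rotating to the next key on every attempt spreads load across keys even when only one of them is throttled.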

Accomplishments that we're proud of

  • Deep Reasoning Integration: Successfully implemented Gemini 3’s Thinking Mode to trace cross-file logic, moving beyond simple pattern matching to find deep-seated architectural risks.
  • Multimodal Orchestration: Seamlessly synced Gemini 2.5 Flash-TTS audio with a dynamic "single-line" transcript UI for a "hands-free" developer briefing.
  • Agentic Reliability: Achieved 100% UI parsing accuracy by utilizing Strict JSON Structured Outputs, ensuring the audit data remains consistent across complex repository structures.
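As an illustration of why strict structured output keeps the UI parsing reliable, the sketch below validates a JSON array of audit findings before it reaches the interface. The field names (`file`, `severity`, `description`, `remediation`) and the severity levels are assumptions chosen for this example; the project's real schema is not shown in this write-up.

```python
import json

# Hypothetical schema: each finding must carry these string fields.
REQUIRED_FIELDS = ("file", "severity", "description", "remediation")
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


def parse_findings(raw: str) -> list:
    """Parse model output and reject anything that does not match the assumed schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"finding {index}: expected an object")
        for field in REQUIRED_FIELDS:
            if not isinstance(item.get(field), str):
                raise ValueError(f"finding {index}: missing or non-string '{field}'")
        if item["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"finding {index}: unknown severity '{item['severity']}'")
    return data
```

With a schema enforced on the model side, a check like this should rarely fail; it mainly guards the UI against unexpected responses.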

What we learned

  • Context Window Mastery: Learned to leverage the 1M token context window to ingest entire repositories without losing the "thread" of the codebase's logic.
  • Thinking Budgets: Discovered the mathematical balance of $T_{thought}$ (Thought Tokens) to optimize for both deep security analysis and API quota efficiency.
  • Vibe Coding Workflow: Shifted from manual boilerplate writing to high-level intent description using AI Studio’s Build mode, drastically accelerating our development cycle.
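The write-up does not give the formula used for $T_{thought}$, so the following is only one plausible way to size a thinking budget dynamically: scale it with the prompt size, clamp it to a fixed range, and never exceed what remains of the quota. The function name, the 10% ratio, and the floor and ceiling values are all assumptions made for illustration.

```python
def thinking_budget(input_tokens: int, remaining_quota: int,
                    ratio: float = 0.1, floor: int = 1024, ceiling: int = 32768) -> int:
    """Pick a thought-token budget proportional to prompt size (illustrative heuristic)."""
    # Larger repositories get more reasoning tokens, within [floor, ceiling].
    proposed = int(input_tokens * ratio)
    budget = max(floor, min(ceiling, proposed))
    # Whatever quota is left after sending the prompt is the hard upper bound.
    available = max(0, remaining_quota - input_tokens)
    return min(budget, available)


print(thinking_budget(200_000, 1_000_000))  # mid-sized repo, plenty of quota left
print(thinking_budget(990_000, 1_000_000))  # near the quota limit, budget shrinks
```

A higher ratio or ceiling buys a deeper audit at the cost of latency and quota; a lower one does the opposite.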

What's next for Github Repository Auditor

  • Interactive Chat-over-Code: Integrating the Gemini Live API to allow developers to ask verbal questions about the repository while viewing the diagrams.
  • Automated PR Fixes: Expanding the audit feature into an autonomous agent that can open GitHub Pull Requests with the suggested security fixes.
  • Expanded Modalities: Adding Veo 3 video generation to create "tours" of the UI and frontend components directly from the source code.

Built With

  • gemini3
  • genai
  • github-api
  • google-ai-studio
  • google-cloud-run
  • json
  • mermaid.js
  • sdk
  • streamlit
  • streamlitaudio