Inspiration

In an era where AI-generated images can be indistinguishable from real photographs, trust in visual content is eroding fast. Deepfakes, synthetic profile pictures, manipulated evidence, and fake news powered by AI imagery are real threats to journalism, justice systems, digital security, and personal reputation.

I was inspired to build Forense AI after seeing how easy it became for anyone to generate hyper-realistic images in seconds using tools like Stable Diffusion, Midjourney, and DALL-E. The question "Can I trust this image?" became critical, yet existing solutions were either closed-source, expensive, too technical for non-experts, or lacked proper API integration for developers.

I wanted to create an open, accessible, and developer-friendly forensic tool that combines classical digital forensics with modern AI analysis to answer that question automatically and at scale.

What it does

Forense AI is a REST API built with FastAPI that performs comprehensive forensic analysis on images to detect whether they were generated or manipulated by Artificial Intelligence.

The system combines multiple digital forensics techniques with AI-powered analysis (Google Gemini) to deliver:

  • Consolidated verdict: REAL, AI-GENERATED, or INCONCLUSIVE
  • Risk score (0.0 - 1.0) indicating likelihood of AI generation
  • Confidence level (very_high, high, medium, low)
  • Annotated images highlighting suspicious regions
  • Plain-language explanations accessible to non-technical users

I also built a clean web interface using React and Vite that lets users upload images, view real-time analysis results, see annotated visualizations highlighting suspicious regions, and understand the verdict through plain-language explanations.

Core Forensic Methods

  1. Noise Analysis (NOISE): Examines natural sensor noise patterns. AI-generated images often have abnormally low or perfectly consistent noise, lacking the organic randomness of real camera sensors.
  2. Fourier Spectrum Analysis (FFT): Analyzes frequency spectrum to detect excessive symmetry, periodic grid artifacts, checkerboard patterns, and unnatural spectral uniformity—all telltale signs of AI generation.
  3. Error Level Analysis (ELA): Recompresses the image and analyzes error differences to detect selective manipulation, splicing, copy-move forgeries, and regions with inconsistent compression history (a minimal sketch follows this list).
  4. Gemini AI Analysis: Interprets the technical results contextually and generates human-readable explanations with key indicators in simple language.
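
To make the ELA step concrete, here is a minimal sketch of the recompress-and-diff idea, assuming Pillow and NumPy; the quality setting and the way the error map is scored are illustrative, not the project's tuned implementation:

```python
import io
import numpy as np
from PIL import Image

def ela_map(path: str, quality: int = 90) -> np.ndarray:
    """Per-pixel error between an image and a freshly recompressed copy."""
    original = Image.open(path).convert("RGB")
    buf = io.BytesIO()
    original.save(buf, format="JPEG", quality=quality)  # recompress once
    buf.seek(0)
    recompressed = Image.open(buf).convert("RGB")
    diff = np.abs(np.asarray(original, dtype=np.int16)
                  - np.asarray(recompressed, dtype=np.int16))
    # A uniform error map suggests one consistent compression history;
    # localized bright regions hint at splicing or selective edits.
    return diff.astype(np.uint8)
```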

How we built it

Backend Architecture

  • FastAPI for a high-performance REST API with automatic OpenAPI documentation (a minimal endpoint sketch follows this list)
  • Python with NumPy, OpenCV, SciPy for image processing and forensic analysis
  • Google Gemini API integration for AI-powered contextual interpretation
  • JWT-based authentication supporting both API keys and anonymous sessions
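
As a rough illustration of the API shape, a minimal FastAPI endpoint might look like the sketch below; the route name, field names, and placeholder values mirror the description above but are assumptions, not the exact schema:

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI(title="Forense AI")

@app.post("/analyze")
async def analyze(image: UploadFile = File(...)) -> dict:
    data = await image.read()  # raw bytes each forensic method would score
    # NOISE, FFT, and ELA run here; Gemini interprets the combined evidence.
    return {
        "verdict": "INCONCLUSIVE",  # REAL | AI-GENERATED | INCONCLUSIVE
        "risk_score": 0.5,          # 0.0 - 1.0 likelihood of AI generation
        "confidence": "medium",     # very_high | high | medium | low
    }
```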

Key Technical Features

  • Flexible authentication: API key for production integration or anonymous JWT tokens for instant testing, with no registration required
  • Dynamic rate limiting: Users providing their own Gemini API key get 6x higher rate limits (20 req/min vs 3 req/min) and larger quotas (200 requests vs 50 per session); see the sketch after this list
  • Budget caps: Automatic cost protection with daily ($5) and monthly ($50) limits when using the server's Gemini key to prevent abuse
  • Anti-abuse protection: IP-based session limits (max 3 new sessions/hour, 10/day, 5 active simultaneously), intelligent rate limiting, and optional reCAPTCHA integration
  • Session system: Tracks requests and quota usage per session, with automatic 7-day expiration and manual cleanup endpoints
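
A minimal sketch of the rate-limit selection and session caps described above; the class and function names are illustrative, but the numbers match the limits listed:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RatePolicy:
    requests_per_minute: int
    session_quota: int

SERVER_KEY = RatePolicy(requests_per_minute=3, session_quota=50)
OWN_KEY = RatePolicy(requests_per_minute=20, session_quota=200)  # 6x higher

def policy_for(has_own_gemini_key: bool) -> RatePolicy:
    # Users who bring their own Gemini key cost the server nothing,
    # so they get the higher limits.
    return OWN_KEY if has_own_gemini_key else SERVER_KEY

def may_create_session(new_last_hour: int, new_today: int, active: int) -> bool:
    # IP-based caps: max 3 new sessions/hour, 10/day, 5 active at once.
    return new_last_hour < 3 and new_today < 10 and active < 5
```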

Frontend

  • React + Vite for fast development and optimal performance
  • Tailwind CSS for responsive, modern UI
  • Real-time image upload and analysis
  • Visual display of risk scores, confidence levels, and individual method results
  • Annotated image viewer showing suspicious regions
  • Plain-language explanations for non-technical users

Challenges we ran into

False positives with compressed images: Heavily compressed JPEGs trigger ELA warnings even when legitimate. I solved this by implementing multi-method consensus scoring; no single method determines the verdict, and all three forensic techniques must align for high-confidence results.
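
A sketch of that consensus rule, with illustrative thresholds rather than the tuned production values:

```python
def consolidate(noise: float, fft: float, ela: float) -> tuple[str, float, str]:
    """Combine three 0.0 (real-looking) .. 1.0 (AI-looking) method scores."""
    scores = [noise, fft, ela]
    risk = sum(scores) / len(scores)
    # High confidence requires all three methods to agree.
    if all(s > 0.7 for s in scores):
        return "AI-GENERATED", risk, "high"
    if all(s < 0.3 for s in scores):
        return "REAL", risk, "high"
    # Mixed signals (e.g., ELA flagging a heavily compressed but real JPEG)
    # never yield a high-confidence verdict on their own.
    return "INCONCLUSIVE", risk, "medium"
```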

Gemini API cost control: Without caps, malicious users could rack up huge bills on my API key. I implemented daily/monthly budget caps with automatic enforcement, cost tracking stored in JSON files, and automatic cleanup of old data (daily records kept for 7 days, monthly records for 3 months).
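
A minimal sketch of the budget check, assuming costs are tracked in a JSON file; the file name and layout are assumptions:

```python
import json
from datetime import date
from pathlib import Path

COSTS_FILE = Path("gemini_costs.json")  # hypothetical tracking file
DAILY_CAP_USD, MONTHLY_CAP_USD = 5.0, 50.0

def within_budget(estimated_cost: float) -> bool:
    """Allow a Gemini call only if both the daily and monthly caps hold."""
    data = json.loads(COSTS_FILE.read_text()) if COSTS_FILE.exists() else {}
    today = date.today().isoformat()  # e.g. "2025-06-01"
    month = today[:7]                 # e.g. "2025-06"
    spent_today = data.get("daily", {}).get(today, 0.0)
    spent_month = data.get("monthly", {}).get(month, 0.0)
    return (spent_today + estimated_cost <= DAILY_CAP_USD
            and spent_month + estimated_cost <= MONTHLY_CAP_USD)
```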

Noise analysis on edited images: Legitimate crops, resizes, and screenshots often look "suspicious" to noise analysis because they lack original EXIF metadata and sensor patterns. I used Gemini's contextual interpretation to weigh evidence holistically rather than relying on rigid thresholds, reducing false positives significantly.

Session abuse: Early versions allowed unlimited session creation per IP, leading to potential abuse. I added hourly (3 sessions), daily (10 sessions), and simultaneous (5 active) IP-based limits with automatic cleanup after 7 days.

Explaining technical results to non-experts: Raw FFT symmetry scores and noise consistency metrics mean nothing to journalists, legal professionals, or everyday users. I integrated Gemini to translate forensic findings into plain language with concrete examples like "The texture of the image is too mathematically perfect, lacking the organic messiness found in real-life photography."
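
A sketch of that translation step using the google-generativeai client; the model choice and prompt wording are illustrative:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model choice

def explain(findings: dict) -> str:
    """Turn raw forensic scores into a short, plain-language explanation."""
    prompt = (
        "You are a digital forensics assistant. Translate these raw "
        "image-forensics results into two or three plain sentences a "
        f"non-technical reader can follow:\n{findings}"
    )
    return model.generate_content(prompt).text
```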

Accomplishments that we're proud of

  • Multi-method forensic engine: Successfully integrated three independent forensic techniques (FFT, NOISE, ELA) with AI interpretation into a single consolidated verdict system
  • Production-ready authentication: Built a dual-auth system that works for both instant demos (anonymous JWT) and serious integrations (API keys) with proper session management
  • Smart cost management: Created a budget cap system that protects against abuse while allowing power users to bring their own Gemini keys for unlimited usage
  • Accessibility without sacrificing depth: Non-technical users get simple verdicts and explanations, while developers get detailed JSON responses with individual method scores, metrics, and annotated images
  • Anti-abuse that works: IP-based rate limiting and session caps prevent abuse without requiring user registration or blocking legitimate use cases
  • Full-stack implementation: Delivered not just an API, but a complete product with a functional web interface ready for demos and real-world testing

What we learned

Technical Growth

  • Digital forensics fundamentals: Deep dive into FFT analysis, noise pattern recognition, JPEG compression artifacts, and how AI generators leave specific "fingerprints"
  • Gemini API integration: Learned how to structure effective prompts for forensic interpretation, handle streaming responses, and manage API costs programmatically
  • Session management at scale: Implemented JWT refresh tokens, session tracking, quota enforcement, and automatic cleanup mechanisms
  • Rate limiting strategies: Built intelligent rate limiting that adapts based on user authentication method and resource usage
  • Cost management: Created budget cap systems with automatic tracking, enforcement, and historical data retention

Product & Design Insights

  • Balancing accessibility and technical depth: Different users need different things—journalists need simple answers, developers need raw data. Good API design serves both.
  • Anti-abuse is critical: Free APIs get abused fast. Proper IP limits, session caps, and cost controls are non-negotiable for any public-facing service.
  • Flexible authentication unlocks adoption: Offering both instant anonymous access (no friction) AND production-ready API keys (reliability) removes barriers for different user types and use cases.
  • Explanation matters as much as detection: A system that says "AI-generated" without explanation is useless. Plain-language reasoning builds trust and helps users understand forensic concepts.

What's next for Forense AI

Model fine-tuning: Train custom detection models on specific AI generators (Stable Diffusion, Midjourney, DALL-E, Flux) to improve accuracy and reduce false negatives

Video support: Extend forensic analysis to detect AI-generated or deepfake videos using temporal consistency analysis and frame-by-frame forensics

Batch processing: Allow bulk analysis of hundreds of images via async job queues for enterprise customers and research institutions

Webhook integrations: Push real-time alerts to Slack, Discord, email, or monitoring platforms when suspicious images are detected in automated workflows

Public dataset: Build and release an open dataset of verified real vs AI images with forensic annotations to help researchers and improve detection models

Enhanced metadata analysis: Extract and analyze EXIF data, camera fingerprints, lens distortion patterns, and geolocation consistency to strengthen authenticity verification
