Inspiration

My dad got scammed by a website that looked completely real. The design was polished, the "trust badges" looked official, and nothing about it screamed scam, until it was too late. That experience made me realize something uncomfortable: for normal users, "does this look legit?" is not a reliable security check anymore.

I built ScamCheck because I wanted a fast, evidence-based way to answer one question before anyone buys: can you actually trust this website? Not with a vibe, but with receipts.

What it does

Paste a URL and ScamCheck investigates the website and returns a trust report you can actually act on:

  • Trust score (0–100) with a risk label (Low Risk / Proceed with Caution / High Risk)
  • AI summary explaining what's suspicious or reassuring
  • Recommendation (safe / caution / avoid)
  • Visual scam check from a full-page screenshot
  • Investigation Log showing what the AI searched for and what it found
  • Contradictions Found across the site's own pages
  • Identity Verdict (verified, unverifiable, suspicious, or confirmed fraud)
  • 25+ explainability items so you can see exactly why the score moved
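
To make the score-to-label step concrete, here is a minimal sketch of how a 0–100 trust score could map onto the three risk labels. The cut-offs (70 and 40) are placeholders chosen for illustration; they are not ScamCheck's actual thresholds.

```python
def risk_label(score: float) -> str:
    """Map a 0-100 trust score to a risk label.

    The 70 / 40 cut-offs are hypothetical, picked only for this example.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "Low Risk"
    if score >= 40:
        return "Proceed with Caution"
    return "High Risk"
```

In practice the label would sit next to the explainability items, so a reader can see which signals pushed the score across a boundary.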

Parameters and signals we check

Security & Infrastructure:

  • HTTPS / TLS certificate validity and issuer
  • Domain age via RDAP lookup
  • Redirect chains and hostname changes
  • robots.txt policies and sitemap presence
  • Hosting platform detection (Shopify, Wix, Squarespace, BigCartel, etc.)
  • 100+ known legitimate domain database
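
As an illustration of the domain-age signal, the sketch below computes an age in days from an already-fetched RDAP response. RDAP responses carry an "events" array whose entries have "eventAction" and "eventDate" fields; the function name and the sample data are assumptions for this example, not ScamCheck's code.

```python
from datetime import datetime, timezone
from typing import Optional


def domain_age_days(rdap: dict, now: Optional[datetime] = None) -> Optional[int]:
    """Return days since the RDAP 'registration' event, or None if unknown."""
    now = now or datetime.now(timezone.utc)
    for event in rdap.get("events", []):
        if event.get("eventAction") != "registration":
            continue
        raw = event.get("eventDate", "")
        try:
            # Older Python versions do not accept a trailing "Z", so rewrite it.
            registered = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        except ValueError:
            return None
        if registered.tzinfo is None:
            # Treat naive timestamps as UTC so the subtraction is well-defined.
            registered = registered.replace(tzinfo=timezone.utc)
        return (now - registered).days
    return None
```

A very young domain is not proof of fraud on its own, which is why it works better as one signal among many than as a verdict.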

Identity & Contact:

  • Company name extraction and cross-referencing
  • Physical address extraction and verification
  • Email addresses and email-domain matching
  • Phone numbers (international format support)
  • Social media links (Facebook, Instagram, X/Twitter, YouTube, TikTok, LinkedIn, Pinterest)
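
The email-domain matching check can be sketched roughly as follows. This is a simplified illustration: the function name, the return labels, and the short free-mail list are all assumptions, and a production check would likely use a public-suffix list rather than plain string comparison.

```python
from urllib.parse import urlsplit

# Illustrative only; a real list of free mail providers would be much longer.
FREE_MAIL = {"gmail.com", "outlook.com", "yahoo.com", "hotmail.com"}


def classify_contact_email(email: str, site_url: str) -> str:
    """Classify a contact email as match / free_provider / mismatch / invalid."""
    if "@" not in email:
        return "invalid"
    domain = email.rsplit("@", 1)[1].lower()
    host = (urlsplit(site_url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Accept exact matches and parent/subdomain relationships in either direction.
    if domain == host or domain.endswith("." + host) or host.endswith("." + domain):
        return "match"
    if domain in FREE_MAIL:
        return "free_provider"
    return "mismatch"
```

A store whose only contact address sits on a free mail provider is not necessarily a scam, but it is the kind of detail worth surfacing in the report.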

Content & Trust Signals:

  • Meta tag completeness (title, description, OG tags, canonical, favicon)
  • Copyright year freshness
  • Language quality scoring
  • Mixed US/UK spelling detection (copy-paste indicator)
  • Cookie consent / GDPR compliance indicators
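
For the mixed US/UK spelling signal, a minimal sketch might look like this. The word pairs are a tiny hypothetical sample; a real analyzer would need a far larger dictionary and some tolerance for quoted text or brand names.

```python
import re

# Hypothetical sample of (US, UK) spelling pairs, for illustration only.
SPELLING_PAIRS = [
    ("color", "colour"),
    ("favorite", "favourite"),
    ("organize", "organise"),
    ("center", "centre"),
    ("jewelry", "jewellery"),
]


def mixed_spelling(text: str) -> dict:
    """Report which US and UK spellings appear, and whether both do."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    us = sorted(us_word for us_word, _ in SPELLING_PAIRS if us_word in words)
    uk = sorted(uk_word for _, uk_word in SPELLING_PAIRS if uk_word in words)
    return {"us": us, "uk": uk, "mixed": bool(us and uk)}
```

Mixed spelling on one site suggests copy assembled from several sources, which is why it is treated as a copy-paste indicator rather than a language-quality score.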

E-commerce Specific:

  • Payment providers (Stripe, PayPal, Square, Apple Pay, Google Pay, Klarna, Afterpay, crypto)
  • Price anomaly detection (extreme 80%+ discounts)
  • Urgency/pressure tactics (countdown timers, "limited stock", "act now", flash sale)
  • Social proof widgets (Trustpilot, BBB, Google Reviews, Judge.me vs. fake badges)
  • Product page and checkout pattern detection
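
Two of the e-commerce checks are simple enough to sketch directly: the 80%+ discount flag and the pressure-phrase scan. The helper names below are hypothetical, and the phrase list only repeats the examples named above.

```python
URGENCY_PHRASES = ("limited stock", "act now", "flash sale")


def discount_percent(original: float, sale: float) -> float:
    """Percentage off the original price."""
    if original <= 0:
        raise ValueError("original price must be positive")
    return 100 * (original - sale) / original


def is_extreme_discount(original: float, sale: float, threshold: float = 80.0) -> bool:
    """True when the discount meets or exceeds the threshold (80% by default)."""
    return discount_percent(original, sale) >= threshold


def urgency_hits(text: str) -> list:
    """Return the pressure phrases found in the text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in URGENCY_PHRASES if phrase in lowered]
```

Neither check is conclusive alone, since legitimate stores run clearance sales too. The signal gets stronger when an extreme discount appears together with several pressure phrases.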

Visual (from screenshot):

  • Fake trust badge detection
  • Urgency banners and countdown timers
  • Layout quality (professional vs. template clone)
  • Popup overlays and aggressive capture elements
  • Stock photo and watermark detection
  • Logo quality and text consistency

Investigation (AI-driven):

  • Identity cross-referencing via Google Search
  • Contradiction hunting across crawled pages
  • Product/price verification against real brands
  • External review scraping (Trustpilot, SiteJabber, ScamAdviser)
  • Outbound link pattern analysis

Gemini 3 tools we used (and how)

ScamCheck is powered by Gemini 3 Flash and uses three Gemini capabilities together.

  1. Multimodal input (Vision) — We pass a Playwright screenshot as an image input. Gemini visually checks for patterns invisible to text analysis: fake badges, countdown timers, template clones, popup overlays.

  2. Thinking (deep reasoning) — The main judge runs with thinking_level="HIGH" to reason across 30+ crawled pages, 35+ structured signals, and the screenshot. It follows a 3-step investigation protocol: cross-reference identifiers, hunt contradictions, verify products.

  3. Google Search grounding (tool use) — The AI cross-references extracted emails, company names, and addresses against live web data to find scam reports, verify business registrations, or detect fraud-linked identities.

Together, this makes ScamCheck more than a scanner. It behaves like an investigator.

How we built it

  1. Frontend — Next.js + React. Paste URL, click Analyze, expand "Show details" for Investigation Log, Contradictions, Identity Verdict, and explainability items.

  2. Backend — FastAPI orchestrating 7 parallel tasks: HTML fetch, TLS inspection, RDAP lookup, robots.txt, Playwright screenshot, spider crawl (~30 pages), and external review scraping.

  3. Crawling — BFS spider with multi-domain support, redirect handling, 70+ ccTLD entries, priority seeding (/about, /contact, /privacy, /terms), JSON-LD preservation. Rust/PyO3 fallback for performance.

  4. Signal extraction — 12 specialized analyzers extract structured data from crawled HTML (see full list above).

  5. Two AI calls — Main Judge (text + signals + screenshot + Google Search investigation) and Visual Analyzer (screenshot only → visual trust score + flags).
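
The parallel fan-out in the backend step can be sketched with asyncio. This is a simplified illustration under stated assumptions: the task names mirror the seven tasks listed above, but `stub_task` stands in for the real network work and the timeout value is arbitrary. The point is that a slow or failing task yields `None` instead of stalling the whole report.

```python
import asyncio

TASK_NAMES = ["html", "tls", "rdap", "robots", "screenshot", "crawl", "reviews"]


async def stub_task(name: str, url: str, delay_s: float) -> dict:
    # Stand-in for real work (HTTP fetch, TLS handshake, browser screenshot...).
    await asyncio.sleep(delay_s)
    return {"task": name, "url": url, "ok": True}


async def with_timeout(name: str, coro, timeout_s: float):
    """Run one task; return (name, result), or (name, None) on timeout or error."""
    try:
        return name, await asyncio.wait_for(coro, timeout=timeout_s)
    except Exception:
        return name, None


async def investigate(url: str, timeout_s: float = 0.05, delays: dict = None) -> dict:
    """Run all tasks concurrently and collect whatever finished in time."""
    delays = delays or {}
    jobs = [
        with_timeout(name, stub_task(name, url, delays.get(name, 0.0)), timeout_s)
        for name in TASK_NAMES
    ]
    return dict(await asyncio.gather(*jobs))
```

Calling `asyncio.run(investigate("https://shop.test", delays={"screenshot": 1.0}))` returns results for six tasks and `None` for the screenshot, which the later stages can treat as a missing signal.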

Challenges we ran into

The biggest challenge was that scam sites are messy on purpose. Redirect chains, partial blocks, inconsistent templates, and deceptive "professional" design all break naive checks.

We also hit a real crawler bug: after a redirect to a different hostname, we were accidentally rejecting all internal links due to a strict domain check. Fixing that required multi-domain discovery and more careful normalization.
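
A minimal sketch of the idea behind the fix, assuming a breadth-first crawl with priority seeding: treat both the requested hostname and the post-redirect hostname as internal, and normalize hosts before comparing. The function names and the injected `get_links` callback are hypothetical, and the sketch skips details such as fragment stripping and ccTLD handling.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit

PRIORITY_PATHS = ["/about", "/contact", "/privacy", "/terms"]


def normalize_host(url: str) -> str:
    """Lowercase the hostname and drop a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def crawl(requested_url: str, final_url: str, get_links, max_pages: int = 30) -> list:
    """BFS over internal links, where 'internal' covers both known hostnames."""
    hosts = {normalize_host(requested_url), normalize_host(final_url)} - {""}
    # Seed the queue with the landing page plus the high-value pages.
    seeds = [final_url] + [urljoin(final_url, path) for path in PRIORITY_PATHS]
    queue = deque(seeds)
    seen = set(seeds)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            absolute = urljoin(url, link)
            if normalize_host(absolute) in hosts and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

With a strict single-host check, every link on the redirected site would fail the comparison against the originally requested hostname, which matches the symptom described above.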

Finally, reliability matters a lot for demos. The full investigation can take longer than a quick scan, so we tuned timeouts end to end and improved structured output handling so the UI doesn't "look empty" if a call takes longer.
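
One way to keep the UI from looking empty is to parse the model's structured output defensively and fall back to defaults. The sketch below illustrates that idea; the field names and default values are assumptions, not ScamCheck's actual schema.

```python
import json

# Hypothetical report fields and fallbacks, for illustration only.
DEFAULT_REPORT = {
    "trust_score": None,
    "risk_label": "Unknown",
    "summary": "Analysis incomplete.",
    "flags": [],
}


def parse_report(raw) -> dict:
    """Parse model output into a report, filling defaults for anything missing."""
    report = dict(DEFAULT_REPORT)
    report["flags"] = []  # fresh list so callers never share the default
    try:
        data = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        data = {}
    if isinstance(data, dict):
        for key in DEFAULT_REPORT:
            if key in data:
                report[key] = data[key]
    return report
```

The frontend can then always render a complete report object and show "Unknown" or "Analysis incomplete." where a call timed out or returned malformed JSON.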

Accomplishments we're proud of

ScamCheck can flag high-quality "looks real" scam sites while still explaining why, in plain language. We're especially proud of the Investigation Log and Contradictions Found sections, because they make the system transparent instead of mysterious.

Also, this is a full-stack build: crawl, screenshot capture, 35+ signals, grounded AI reasoning, and a UI that surfaces it cleanly.

What we learned

Visual analysis matters. Some scam patterns are obvious in the screenshot even if the text is clean.

We also learned that the best results come from combining structured signals, deep reasoning, and grounding. No single technique is enough.

And lastly: redirects and timeouts are not edge cases; they're the normal web.

What's next for ScamCheck

Next we want to ship a browser extension that warns users before checkout, add streaming progress updates in the UI, and introduce caching and "compare over time" so users can see if a site suddenly changes behavior.

We also want stronger identity verification sources and better regional coverage as we test against more real-world scam patterns.
