Truth Trace

Image of the frontend running its analysis on a suspicious url

Inspiration

We come across many websites and articles in daily life that seem manipulative or agenda-driven. Yet we often assume they are factually accurate and neutral in tone, and so we overlook the possibility of fake content, manipulative intent, or text that was not written by a human. We therefore identified three target areas. First, the authenticity of a website, judged from its URL. Second, the originality of the page content: detecting generative AI and asking how human the writing is. Third, the character of the content itself. Articles and information are expected to be presented in a neutral, unbiased manner, but the material is sometimes manipulative, and this layer checks for such manipulation.

What it does

TruthTrace analyzes any URL you submit to determine whether it is trustworthy or trying to deceive you. It runs the link through multiple forensic checks: validating the domain's infrastructure, scanning reputation databases for known threats, extracting the actual content, and using AI to detect manipulative language patterns. It uses several prebuilt APIs, including the WhoIsFreaks API, the text-extractor API on RapidAPI, and the VirusTotal API, to check domain validity, extract text from the webpage, and run other infrastructure checks on the web address. The system then generates a threat score (0-100) and assigns a risk level (Low, Medium, or High) based on what it finds. Finally, it produces a detailed breakdown through the OpenAI API, explaining exactly why the URL is flagged, whether for a suspicious domain, misleading headlines, emotional manipulation, or factual inconsistencies.
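
As a minimal sketch of the last scoring step, the snippet below maps a 0-100 threat score to one of the three risk levels. The cut-offs (34 and 67) and the names RiskLevel and RiskClassifier are assumptions made for illustration. The write-up does not state the actual thresholds.

```java
// Hypothetical sketch: thresholds and names are assumptions, not TruthTrace's real values.
enum RiskLevel { LOW, MEDIUM, HIGH }

final class RiskClassifier {
    static final int MEDIUM_THRESHOLD = 34; // assumed lower bound for MEDIUM
    static final int HIGH_THRESHOLD = 67;   // assumed lower bound for HIGH

    private RiskClassifier() {}

    // Keeps the score inside the documented 0-100 range.
    static int clampScore(int rawScore) {
        return Math.max(0, Math.min(100, rawScore));
    }

    // Maps a threat score to a risk level using the assumed thresholds.
    static RiskLevel classify(int rawScore) {
        int score = clampScore(rawScore);
        if (score >= HIGH_THRESHOLD) {
            return RiskLevel.HIGH;
        }
        if (score >= MEDIUM_THRESHOLD) {
            return RiskLevel.MEDIUM;
        }
        return RiskLevel.LOW;
    }
}
```

Keeping the thresholds as named constants makes them easy to tune once real URLs have been scored.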

How we built it

TruthTrace was built as a layered forensic intelligence engine that combines infrastructure analysis, linguistic heuristics, threat intelligence APIs, and AI-powered contextual reasoning.

Backend - We built the backend using Spring Boot and Java to ensure modularity and scalability. Its main component is the engine, a set of classes that analyze each part of the webpage, starting with the URL. The first layer runs an overall check on the URL: whether it is reachable, whether it uses a valid protocol, whether the domain contains suspicious keywords, and how old the domain is (verified through the WhoIsFreaks API), along with general diagnostics such as the subdomain count and the validity of the subdomains. The second layer works with the content of the webpage. We used the text-extractor API on RapidAPI to extract the page content and normalized the raw text for linguistic analysis. The third layer performs that linguistic analysis, checking for social engineering phrases such as "act now", scam hooks such as "lottery" and "money", emotional manipulation signals, over-punctuation patterns, and a lack of content quoted from people. The fourth layer uses the OpenAI API: it collects the major findings of the analysis, converts them into a prompt, and generates a response that reads like a detective breaking down a crime scene, which gives users a familiar frame and makes the interaction more engaging.
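
The offline parts of the first and third layers can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual classes. The names UrlChecks and LinguisticSignals, the domain keyword list, the subdomain limit, and the regular expressions are all hypothetical. Only the phrases "act now", "lottery", and "money" come from the description above. Reachability and domain-age checks are left out because they need network access.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical first-layer checks; keyword list and subdomain limit are assumptions.
final class UrlChecks {
    private static final List<String> SUSPICIOUS_KEYWORDS = List.of("login", "verify", "secure");
    private static final int MAX_SUBDOMAINS = 2;

    private UrlChecks() {}

    static List<String> inspect(String url) {
        List<String> findings = new ArrayList<>();
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            findings.add("malformed URL");
            return findings;
        }
        String scheme = uri.getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            findings.add("invalid protocol");
        }
        String host = uri.getHost();
        if (host == null) {
            findings.add("missing host");
            return findings;
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String keyword : SUSPICIOUS_KEYWORDS) {
            if (host.contains(keyword)) {
                findings.add("suspicious keyword in domain: " + keyword);
            }
        }
        // Naive count: every label beyond the last two is treated as a subdomain.
        int subdomains = Math.max(0, host.split("\\.").length - 2);
        if (subdomains > MAX_SUBDOMAINS) {
            findings.add("too many subdomains: " + subdomains);
        }
        return findings;
    }
}

// Hypothetical third-layer checks on already-normalized page text.
final class LinguisticSignals {
    private static final List<String> SOCIAL_ENGINEERING = List.of("act now");
    private static final List<String> SCAM_HOOKS = List.of("lottery", "money");
    // Assumed definition of over-punctuation: two or more '!' or '?' in a row.
    private static final Pattern OVER_PUNCTUATION = Pattern.compile("[!?]{2,}");
    // Assumed definition of quoted content: a double-quoted span of 10+ characters.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]{10,}\"");

    private LinguisticSignals() {}

    static List<String> detect(String normalizedText) {
        String text = normalizedText.toLowerCase(Locale.ROOT);
        List<String> findings = new ArrayList<>();
        for (String phrase : SOCIAL_ENGINEERING) {
            if (text.contains(phrase)) {
                findings.add("social-engineering phrase: " + phrase);
            }
        }
        for (String hook : SCAM_HOOKS) {
            if (text.contains(hook)) {
                findings.add("scam hook: " + hook);
            }
        }
        if (OVER_PUNCTUATION.matcher(text).find()) {
            findings.add("over-punctuation");
        }
        if (!QUOTED.matcher(text).find()) {
            findings.add("no quoted content");
        }
        return findings;
    }
}
```

Both methods return plain-text findings rather than a number, so a later step can both score them and pass them on as the explanation. Substring matching is deliberately simple here and would flag harmless words that happen to contain a keyword.
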
Frontend - The frontend was built using a custom dark-mode forensic UI. Instead of dumping results instantly, we designed live threat score fluctuation during analysis, progressive checklist activation, streaming forensic pipeline logs, controlled explanation rendering, and a noir-inspired detective theme.

Challenges we ran into

While building this project, we faced several technical and practical challenges. One early obstacle was difficulty obtaining and configuring API access for Google Gemini, which delayed experimentation and required contingency planning. We also realized that no single API provides a definitive “trustworthiness score” for a website. Instead, we had to combine multiple services (fact-check databases, URL reputation checks, content analysis, and metadata evaluation) into a unified scoring framework. Identifying reliable APIs, understanding their rate limits, and integrating inconsistent response formats were significant challenges in themselves.
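
One way to picture a unified scoring framework is to normalize each service's result to a common 0.0 to 1.0 risk value and combine the values with a weighted average. The sketch below assumes that approach. The Signal and ThreatScorer names, the weights, and the weighted-average formula itself are illustrative assumptions, not the formula TruthTrace uses.

```java
import java.util.List;

// Hypothetical normalized result from one service (for example URL reputation).
final class Signal {
    final String source; // label for the explanation, e.g. "url-reputation"
    final double risk;   // 0.0 (safe) to 1.0 (dangerous), after normalization
    final double weight; // relative importance, assumed to be positive

    Signal(String source, double risk, double weight) {
        if (risk < 0.0 || risk > 1.0) {
            throw new IllegalArgumentException("risk must be in [0, 1]");
        }
        if (weight <= 0.0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        this.source = source;
        this.risk = risk;
        this.weight = weight;
    }
}

final class ThreatScorer {
    private ThreatScorer() {}

    // Weighted average of the signals, scaled to the 0-100 threat score range.
    // Returning 0 for an empty list is an assumption made for this sketch.
    static int score(List<Signal> signals) {
        if (signals.isEmpty()) {
            return 0;
        }
        double weightedRisk = 0.0;
        double totalWeight = 0.0;
        for (Signal signal : signals) {
            weightedRisk += signal.risk * signal.weight;
            totalWeight += signal.weight;
        }
        return (int) Math.round(100.0 * weightedRisk / totalWeight);
    }
}
```

Because every service is first reduced to the same Signal shape, differences in response formats stay inside the per-service adapters and the scoring step does not need to know about them.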

Accomplishments that we're proud of

One accomplishment we are particularly proud of is designing and integrating a multi-signal deception detection system within a limited timeframe. Despite API access challenges, performance constraints, and the absence of a single “truth” endpoint, we built a working pipeline that combines URL reputation checks, content analysis, content sentiment, and metadata evaluation into a unified, explainable risk score. Rather than relying solely on an LLM or a single heuristic, we developed a balanced framework that produces an overall risk score for a given URL. We are especially proud that our system emphasizes transparency by showing why a website is flagged, which makes it more trustworthy and more useful for real-world misinformation detection.

What we learned

Through this project, we learned how important rapid ideation and decisive execution are in a time-constrained environment. We practiced quickly brainstorming multiple ideas, evaluating feasibility, and, once a direction was finalized, moving immediately into implementation rather than over-planning. When APIs failed, keys didn’t work, or certain approaches proved impractical, we learned to pivot quickly by switching technologies, replacing tools, and debugging iteratively instead of getting stuck. We also gained experience in integrating multiple external services, handling real-world data inconsistencies, managing performance tradeoffs, and collaborating efficiently as a team. Overall, this project strengthened our ability to think critically and quickly under pressure, adapt to technical uncertainty, and transform abstract ideas into a functional, end-to-end system. We are really proud of what we have done in this short span of time.

What's next for Truth Trace

Performance Optimization - We're prioritizing significant speed improvements to the frontend experience. By optimizing load times, reducing latency, and streamlining our interface, we'll ensure users can verify content quickly and seamlessly, which is critical for real-time fact-checking scenarios.

Mobile Application Development - We're expanding TruthTrace's accessibility by developing native mobile applications. This will allow users to verify information on the go, directly from their smartphones, making truth verification as convenient as checking social media. Mobile-first access is essential for reaching users where they consume content most.

AI-Generated Media Detection - We're building advanced detection capabilities to identify AI-generated images and videos. As synthetic media becomes increasingly sophisticated and prevalent, this feature will help users distinguish between authentic and artificially created visual content, addressing one of the most pressing challenges in digital media verification today.