InsightForge AI
Multilingual AI Intelligence Engine powered by Amazon Nova

## Inspiration
In today's world, organizations generate massive volumes of documents: reports, Excel sheets, contracts, scanned PDFs, presentations, and multilingual news articles.
Despite advances in AI, extracting executive-level insights still requires manual effort.
We asked ourselves:
What if uploading a document was enough to instantly generate a boardroom-ready dashboard?
That question inspired InsightForge AI, a system that transforms static files into structured intelligence using Amazon Nova.
## What it does
InsightForge AI allows users to upload:
- PDF (digital or scanned)
- Excel / CSV
- Images
- Word documents
- PowerPoint presentations
- Multilingual content (Tamil, Hindi, English, and more)
From a single upload, the system automatically:
- Extracts text (including OCR for scanned PDFs)
- Detects document language
- Generates an Executive Dashboard
- Identifies meaningful KPIs
- Computes derived statistics
- Extracts key dates and timelines
- Detects risks and recommended actions
- Enables document chat with evidence grounding
- Exports a professional PDF report
All insights are generated in the same language as the uploaded document.
## How we built it
InsightForge AI is built using a hybrid AI architecture powered by Amazon Bedrock and Amazon Nova.
### 1️⃣ Multimodal Document Processing

- PDF parsing using pypdf
- OCR fallback using Nova Vision
- Excel parsing using pandas
- Word & PowerPoint extraction
- Image-to-text conversion via Nova multimodal capabilities
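The "parse first, OCR as fallback" flow in the list above might look roughly like the sketch below. It is only an illustration: `extract_with_fallback`, the two callables it receives, and the `min_chars` threshold are hypothetical names and values, and the stand-in functions replace the real pypdf and Nova calls so the example runs on its own.

```python
def extract_with_fallback(path, parse_text, run_ocr, min_chars=40):
    """Try the cheap text parser first; use OCR when it finds too little text."""
    text = (parse_text(path) or "").strip()
    if len(text) >= min_chars:
        return text, "parser"
    return run_ocr(path).strip(), "ocr"


# Stand-in callables (assumptions) so the sketch needs no PDF or model access.
def fake_parser(path):
    if path.endswith("scanned.pdf"):
        return ""  # a scanned PDF has no embedded text layer
    return "Quarterly revenue rose while operating costs fell across all regions."


def fake_ocr(path):
    return "Text recovered from page images."


digital = extract_with_fallback("report.pdf", fake_parser, fake_ocr)
scanned = extract_with_fallback("scanned.pdf", fake_parser, fake_ocr)
```

Passing the parser and OCR step in as arguments keeps the decision logic testable without touching a real document.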
### 2️⃣ Fully AI-Controlled Executive Dashboard
We use Amazon Nova Lite to generate structured dashboard insights in strict JSON format.
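As a rough illustration of consuming strict-JSON output safely, the sketch below parses a model response, checks a minimal schema, and falls back to an empty dashboard when anything is off. The key names (`summary`, `kpis`, `risks`) and both function names are assumptions made for this example, not the project's actual schema.

```python
import json

# Hypothetical schema: these key names are illustrative only.
REQUIRED_FIELDS = {"summary": str, "kpis": list, "risks": list}


def empty_dashboard():
    """Fresh fallback payload so the UI always has something to render."""
    return {"summary": "", "kpis": [], "risks": []}


def parse_dashboard(raw_output):
    """Parse model output into a validated dict, or return the fallback."""
    # Models often wrap JSON in extra prose, so keep only the span
    # from the first '{' to the last '}'.
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end < start:
        return empty_dashboard()
    try:
        data = json.loads(raw_output[start:end + 1])
    except json.JSONDecodeError:
        return empty_dashboard()
    # Minimal schema check: every required key present with the right type.
    for key, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(key), expected_type):
            return empty_dashboard()
    return data


wrapped = 'Here is the dashboard: {"summary": "Q3 review", "kpis": [], "risks": ["late delivery"]} Done.'
good = parse_dashboard(wrapped)
broken = parse_dashboard('{"summary": "Q3 review", "kpis": [')
```

Here `good` keeps the parsed fields, while `broken` (truncated JSON) comes back as the empty fallback instead of raising.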
When multiple numeric signals exist, Nova dynamically computes derived metrics, for example a risk score:

$$ \text{Risk Score} = \frac{\text{Detected Risks}}{\text{Total Signals}} \times 100 $$

This ensures the dashboard is not static; it is dynamically reasoned.
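For a concrete feel of the risk score formula, here is a small deterministic version that could be used to cross-check a value returned by the model. The function name and the choice to return 0.0 when there are no signals are assumptions for this sketch.

```python
def risk_score(detected_risks, total_signals):
    """Share of signals flagged as risks, expressed as a percentage."""
    if total_signals <= 0:
        return 0.0  # assumption: no signals means no measurable risk
    return detected_risks / total_signals * 100


score = risk_score(3, 12)  # 3 risks among 12 signals -> 25.0
```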
### 3️⃣ Retrieval-Augmented Generation (RAG)

Documents are:

- Chunked
- Converted to embeddings
- Indexed using FAISS
- Retrieved contextually for question answering

This enables grounded responses with evidence.

Example code structure:

```python
# Retrieve the top_k chunks most similar to the user's question
hits, scores = rag.search(user_query, k=top_k)
# Answer the question using the retrieved chunks as supporting evidence
answer = ask_with_evidence(user_query, context_chunks)
```
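To make the chunk, embed, index, retrieve sequence tangible, here is a dependency-free sketch. It deliberately swaps in stand-ins: word counts replace real embeddings, and a brute-force in-memory search replaces the FAISS index. `chunk_text`, `embed`, `cosine`, and `ToyIndex` are hypothetical names, not the project's code.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """Brute-force in-memory index (stand-in for a FAISS index)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.vectors = [embed(c) for c in self.chunks]

    def search(self, query, k=3):
        """Return the k most similar chunks and their similarity scores."""
        q = embed(query)
        ranked = sorted(
            ((cosine(q, v), c) for v, c in zip(self.vectors, self.chunks)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        top = ranked[:k]
        return [c for _, c in top], [s for s, _ in top]


document = (
    "Revenue grew 12 percent in the third quarter. "
    "The supplier contract expires in March and poses a renewal risk. "
    "Headcount stayed flat."
)
index = ToyIndex(chunk_text(document, max_words=8))
hits, scores = index.search("When does the supplier contract expire?", k=1)
```

The retrieved chunk in `hits` is what would be passed to the model as evidence, so an answer can point back to the text it came from.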
### 4️⃣ Multilingual Intelligence

We automatically detect document language and instruct Nova to respond strictly in that language.

- Tamil input → Tamil insights
- Hindi input → Hindi insights
- English input → English insights

This significantly improves accessibility.

---

## Challenges we ran into

### 🔹 OCR Noise

Scanned newspaper PDFs sometimes generate duplicated timestamps like: `2019-02-13 10:30:00 2019-02-13 10:30:00`

We implemented text normalization and filtering before AI processing.

---

### 🔹 Excel KPI Noise

Excel files often contain many numeric columns that are not meaningful KPIs. We solved this by:

- Scoring column names semantically
- Filtering numeric density
- Ignoring small or date-like values
- Prioritizing business-relevant signals
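Both clean-up steps described in this section (collapsing duplicated OCR timestamps and filtering noisy Excel columns) can be approximated in plain Python. This is a sketch under assumptions: the keyword list is a crude stand-in for semantic scoring, the thresholds are invented for illustration, and a plain dict of column name to values stands in for a pandas DataFrame.

```python
import re

TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
# Assumed keyword list standing in for semantic scoring of column names.
BUSINESS_KEYWORDS = ("revenue", "sales", "cost", "profit", "margin", "total")


def collapse_repeated_timestamps(text):
    """Replace a timestamp repeated back to back with a single copy."""
    return re.sub(rf"({TIMESTAMP})(?:\s+\1)+", r"\1", text)


def numbers_in(values):
    """Keep only real numeric cells (bool is excluded on purpose)."""
    return [v for v in values if isinstance(v, (int, float)) and not isinstance(v, bool)]


def select_kpi_columns(table, min_density=0.8, min_magnitude=10):
    """Return likely KPI column names, best name match first."""
    scored = []
    for name, values in table.items():
        nums = numbers_in(values)
        if not nums or len(nums) / len(values) < min_density:
            continue  # mostly text or empty cells
        if max(abs(v) for v in nums) < min_magnitude:
            continue  # small values such as row ids or flags
        if all(float(v).is_integer() and 1900 <= v <= 2100 for v in nums):
            continue  # looks like a column of years
        score = sum(keyword in name.lower() for keyword in BUSINESS_KEYWORDS)
        if score > 0:
            scored.append((score, name))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored]


cleaned = collapse_repeated_timestamps("2019-02-13 10:30:00 2019-02-13 10:30:00 Council meets")
table = {
    "Region": ["North", "South", "East"],
    "Year": [2021, 2022, 2023],
    "Row ID": [1, 2, 3],
    "Notes": ["ok", 5, None],
    "Total Revenue": [120000.0, 135500.0, 98000.0],
    "Cost": [80000, 91000, 70500],
}
kpis = select_kpi_columns(table)
```

With this sample table only `Total Revenue` and `Cost` survive: the other columns are dropped for being text, year-like, too small, or too sparse.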
### 🔹 Strict JSON Enforcement

Large Language Models sometimes output malformed JSON. We enforced:

- Structured prompting
- Schema validation
- Safe parsing with fallbacks

---

## Accomplishments that we're proud of

- Fully multimodal ingestion
- AI-driven executive dashboard
- Language-aware insight generation
- Evidence-backed document chat
- Professional PDF report export
- Clean premium UI with animated AI background

Most importantly:

> We transformed document upload into instant intelligence generation.

---

## What we learned

- Hybrid AI systems are more reliable than pure LLM outputs.
- Multilingual prompting significantly improves output quality.
- Structured schema prompts dramatically improve AI reliability.
- Amazon Nova performs exceptionally well when guided with precise context.

---
## What's next for InsightForge AI

We plan to expand InsightForge AI into:

### Enterprise Intelligence Platform

- S3 & SharePoint integrations
- Real-time KPI monitoring
- Risk anomaly alerts
- Predictive analytics

Future formula-driven forecasting example:

$$ \text{Forecast}_{t+1} = \text{Current Value} \times (1 + \text{Growth Rate}) $$
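The forecasting formula is a single-step growth projection. A minimal sketch follows; `forecast_next` mirrors the formula directly, while `forecast_series` is an assumed extension that reapplies it with a constant growth rate over several periods.

```python
def forecast_next(current_value, growth_rate):
    """One-step forecast: current value scaled by (1 + growth rate)."""
    return current_value * (1 + growth_rate)


def forecast_series(current_value, growth_rate, periods):
    """Apply the one-step forecast repeatedly (assumes a constant rate)."""
    values = []
    value = current_value
    for _ in range(periods):
        value = forecast_next(value, growth_rate)
        values.append(value)
    return values


next_quarter = forecast_next(800.0, 0.25)       # 800 * 1.25 -> 1000.0
two_periods = forecast_series(100.0, 0.5, 2)    # [150.0, 225.0]
```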
## Built With

### Languages

- Python

### Framework

- Streamlit

### AI & Cloud

- Amazon Bedrock
- Amazon Nova Lite
- Amazon Nova Vision

### Data & Processing

- Pandas
- NumPy
- FAISS

### Document Tools

- pypdf
- PyMuPDF
- python-docx
- python-pptx
- Pillow

### Report Generation

- ReportLab

---

## Repository

InsightForge AI on GitHub
Built With
- ai
- amazon
- amazon-web-services
- bedrock
- css
- iam
- javascript
- multimodal
- nova
- novalite
- novavision
- python
- python-package-index
- rag
- streamlit