
Document Classification Assistant

An AI-powered assistant for classifying multi-modal documents into Public, Confidential, Highly Sensitive, or Unsafe categories.

Features

  • Multi-modal preprocessing for text, PDFs, and images with optional OCR.
  • Configurable prompt tree library leveraging LangChain/LlamaIndex orchestration patterns.
  • Dual-LLM cross-verification with safety keyword detection (see the sketch after this list).
  • Citation generation and evidence tracking.
  • Human-in-the-loop feedback persistence powered by SQLite and SQLAlchemy.
  • FastAPI backend with Streamlit dashboard for interactive and batch workflows.
  • Automated report generation in JSON and PDF formats.
  • Dockerized deployment, comprehensive tests, and formatting/linting via Makefile.
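
A minimal sketch of how the dual-LLM pass could combine the two model verdicts, assuming hypothetical classifier callables; the real entry points live in src/ and may be named and structured differently:

from typing import Callable

LABELS = ["Public", "Confidential", "Highly Sensitive", "Unsafe"]
SAFETY_KEYWORDS = {"exploit", "malware", "weapon"}  # illustrative list only

def cross_verify(text: str,
                 primary: Callable[[str], str],
                 secondary: Callable[[str], str]) -> str:
    """Run both classifiers; escalate on keyword hits or disagreement."""
    # Safety keywords short-circuit straight to the Unsafe label.
    if any(word in text.lower() for word in SAFETY_KEYWORDS):
        return "Unsafe"
    first, second = primary(text), secondary(text)
    if first == second:
        return first
    # On disagreement, prefer the more restrictive of the two labels.
    return max(first, second, key=LABELS.index)

Escalating to the stricter label on disagreement keeps false "Public" verdicts rare at the cost of some over-classification.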

Getting Started

python -m venv .datathon_env
source .datathon_env/bin/activate
pip install -r requirements.txt

Local LLMs via Ollama

  1. Install Ollama on the host where this backend runs and start the daemon with ollama serve (it listens on http://127.0.0.1:11434 by default).
  2. Pull the models referenced in config/config.yaml, for example:
    ollama pull llama3.1:8b
    ollama pull llama3.1:13b
  3. Optional: verify the service is reachable with curl http://127.0.0.1:11434/api/tags.
  4. Adjust config/config.yaml if you need a different model name, base URL, or generation options. The default template configures two local models:
    ollama:
      base_url: http://127.0.0.1:11434
    models:
      primary:
        name: risk-primary-8b
        model: llama3.1:8b
      secondary:
        name: risk-secondary-13b
        model: llama3.1:13b
  5. Restart uvicorn (or the Docker container) so the new configuration is loaded. The backend will now call the local Ollama HTTP API for both classifier passes.
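
To sanity-check the wiring before starting the backend, you can send a small request to the same endpoint the classifier uses. This sketch targets Ollama's documented /api/generate route with the model tag from the config above; the prompt is illustrative:

import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": "Reply with one word: ready?",
        "stream": False,  # ask for a single JSON object, not a stream
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])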

Using Ollama Cloud

  1. Authenticate the Ollama CLI (one time):
    ollama signin
  2. Register the cloud models you plan to call (examples):
    ollama pull gpt-oss:120b-cloud
    ollama pull deepseek-v3.1:671b-cloud
  3. Copy your Ollama Cloud API key from https://ollama.com.
  4. Export it before starting the backend (or place it in your shell profile):
    export OLLAMA_API_KEY="sk_live_xxx"
    The default config/config.yaml already points to https://api.ollama.ai and reads the key via api_key_env: OLLAMA_API_KEY, so no further edits are required. If you prefer to hardcode the key instead, replace the env entry with api_key: "sk_live_xxx" (only do this for local testing).
  5. Restart the backend (uvicorn src.main:app --reload). The classifier will now send requests to Ollama Cloud using your key while retaining the same dual-model workflow.
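
As a quick credential check, a similar request can be sent to the cloud endpoint from the default config. This sketch assumes the key is exported as OLLAMA_API_KEY and that the service accepts a standard bearer header; consult Ollama's documentation if authentication fails:

import os
import requests

api_key = os.environ["OLLAMA_API_KEY"]  # KeyError here means the export step was missed

resp = requests.post(
    "https://api.ollama.ai/api/generate",  # base URL taken from the default config/config.yaml
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "gpt-oss:120b-cloud", "prompt": "ping", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json().get("response"))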

Running the Backend

uvicorn src.main:app --reload
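
Once uvicorn is up, FastAPI serves interactive API docs at http://127.0.0.1:8000/docs. The snippet below is a hypothetical smoke test: the /classify route and payload shape are assumptions for illustration, so check the generated docs for the real endpoints:

import requests

resp = requests.post(
    "http://127.0.0.1:8000/classify",  # assumed route name, verify in /docs
    json={"text": "Internal salary bands for FY2025"},  # illustrative payload
    timeout=60,
)
print(resp.status_code, resp.json())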

Running the Dashboard

streamlit run ui_dashboard/app.py

Tests and Quality Checks

make test
make lint
make format

Docker

docker build -t doc-assistant .
docker run -p 8000:8000 doc-assistant
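
If the containerized backend should reach Ollama Cloud, pass the key into the container, e.g. docker run -e OLLAMA_API_KEY="sk_live_xxx" -p 8000:8000 doc-assistant. Note that a local Ollama daemon on the host is not reachable at 127.0.0.1 from inside the container; adjust base_url in config/config.yaml accordingly (for example, host.docker.internal on Docker Desktop).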

Demo Notebook

See demo.ipynb for an end-to-end walkthrough of the sample test cases (TC1–TC5).
