Inspiration

Every enterprise has a hidden compliance problem: policy documents that contradict each other.

Your Security Policy says passwords need 16 characters. Your Employee Handbook says 12. Which one is right? Which one are employees actually following?

Finding these contradictions traditionally takes 50+ hours of manual review. We built DocOps Agent to do it in under 2 minutes.

What it does

DocOps Agent is an intelligent document analysis platform that:

  • Detects conflicts between policy documents using semantic analysis
  • Shows side-by-side comparisons with visual diff highlighting
  • Suggests AI-powered fixes with confidence scores (e.g., 85% confidence, 5 min to resolve)
  • Tracks resolution workflow from detection to verification
  • Generates compliance reports exportable as Markdown, Excel, or PDF

In our demo: 25 documents analyzed → 66 conflicts detected → 99.9% time saved (under 2 minutes versus 50+ hours of manual review).

How we built it

Multi-step Agent Reasoning: Unlike simple chatbots, DocOps Agent uses a 6-tool architecture that reasons through document analysis:

  1. search_documents - Hybrid search combining BM25 + dense vectors
  2. analyze_conflicts - Semantic comparison across document sections
  3. get_suggestions - AI-generated remediation recommendations
  4. track_resolution - Alert lifecycle management
  5. generate_report - Automated compliance reporting
  6. check_staleness - Document freshness analysis
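As a rough illustration of how a multi-tool agent like this might chain its steps, here is a minimal sketch of a tool registry with a sequential runner. The tool names come from the list above; the registry, the shared `state` dictionary, the `run_plan` helper, and the stub logic inside each tool are assumptions made for the example, not the actual DocOps Agent code.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping tool names to functions (illustration only).
TOOLS: Dict[str, Callable[[dict], dict]] = {}


def tool(name: str):
    """Register a function under a tool name."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("search_documents")
def search_documents(state: dict) -> dict:
    # Stub: plain substring match stands in for hybrid search here.
    topic = state["topic"]
    state["hits"] = [d for d in state["docs"] if topic in d["text"].lower()]
    return state


@tool("analyze_conflicts")
def analyze_conflicts(state: dict) -> dict:
    # Stub: report a conflict when the retrieved documents state different values.
    values = {d["value"] for d in state["hits"]}
    state["conflict"] = len(values) > 1
    return state


def run_plan(plan: List[str], state: dict) -> dict:
    """Run tools in order, passing the shared state from one step to the next."""
    for name in plan:
        state = TOOLS[name](state)
    return state


docs = [
    {"name": "Security Policy", "text": "Password minimum length is 16 characters", "value": 16},
    {"name": "Employee Handbook", "text": "Password must be at least 12 characters", "value": 12},
]
result = run_plan(
    ["search_documents", "analyze_conflicts"],
    {"docs": docs, "topic": "password"},
)
print(result["conflict"])
```

In a real agent the plan would likely be chosen by the LLM at each step rather than fixed up front; the fixed list here only keeps the example short.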

Elasticsearch Powers Everything:

  • Hybrid search for intelligent document retrieval
  • Aggregations for real-time analytics dashboards
  • Vector embeddings for semantic similarity detection
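To make the three uses above concrete, the sketch below builds a single search request body that combines a BM25 `match` query, a `knn` clause over dense vectors, and a `terms` aggregation. It assumes Elasticsearch 8.x request syntax; the field names (`content`, `embedding`, `doc_type`) and the helper `build_hybrid_request` are hypothetical and not taken from the project's index mapping.

```python
from typing import Any, Dict, List


def build_hybrid_request(text: str, vector: List[float], k: int = 10) -> Dict[str, Any]:
    """Build a search body mixing keyword and vector retrieval plus one aggregation.

    Field names are assumptions for illustration:
      content   - analyzed text field scored with BM25
      embedding - dense_vector field holding section embeddings
      doc_type  - keyword field used for dashboard counts
    """
    return {
        # Lexical side: BM25 scoring on the text field.
        "query": {"match": {"content": {"query": text}}},
        # Semantic side: approximate nearest-neighbour search on embeddings.
        "knn": {
            "field": "embedding",
            "query_vector": vector,
            "k": k,
            "num_candidates": max(10 * k, 100),
        },
        # Analytics side: count matching sections per document type.
        "aggs": {"by_doc_type": {"terms": {"field": "doc_type"}}},
        "size": k,
    }


body = build_hybrid_request("password minimum length", [0.1, 0.2, 0.3], k=5)
print(sorted(body))
```

The body only describes the request. Sending it requires a running cluster and an index whose mapping matches these assumed fields.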

Tech Stack:

  • Streamlit for the interactive UI
  • Elasticsearch for search, indexing, and aggregations
  • Python for agent orchestration
  • LLM integration for reasoning and suggestions

Challenges we faced

  1. Semantic vs. Literal Conflicts: "minimum 16 characters" and "at least 12 characters" are semantically related but lexically different. We solved this with hybrid search combining keyword matching and vector similarity.

  2. Confidence Calibration: How confident should the AI be when suggesting fixes? We implemented authority-based scoring that considers document recency and policy hierarchy.

  3. Alert Fatigue: Initial versions flagged too many non-issues. We added severity classification and topic clustering to surface only actionable conflicts.
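The authority-based scoring from challenge 2 could look something like the sketch below. It blends a recency term with a policy-hierarchy rank and turns the two documents' scores into a confidence for the suggested fix. The hierarchy ranks, the one-year half-life, the equal weighting, and all names in the code are assumptions chosen for illustration; the write-up does not specify the actual formula.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# Hypothetical hierarchy: higher rank means more authoritative.
HIERARCHY_RANK = {"policy": 3, "standard": 2, "handbook": 1}


@dataclass
class Doc:
    name: str
    doc_type: str
    last_updated: date


def authority(doc: Doc, today: date, half_life_days: int = 365) -> float:
    """Score in (0, 1]: average of a recency decay and a normalized hierarchy rank."""
    age_days = (today - doc.last_updated).days
    recency = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
    rank = HIERARCHY_RANK[doc.doc_type] / max(HIERARCHY_RANK.values())
    return 0.5 * recency + 0.5 * rank


def suggest_winner(a: Doc, b: Doc, today: date) -> Tuple[str, float]:
    """Return the more authoritative document and its share of the total score."""
    score_a, score_b = authority(a, today), authority(b, today)
    winner = a if score_a >= score_b else b
    confidence = max(score_a, score_b) / (score_a + score_b)
    return winner.name, confidence


security = Doc("Security Policy", "policy", date(2024, 1, 1))
handbook = Doc("Employee Handbook", "handbook", date(2022, 1, 1))
name, confidence = suggest_winner(security, handbook, today=date(2024, 7, 1))
print(name, round(confidence, 2))
```

With this shape, a confidence near 0.5 means the two documents carry similar authority, which is a natural signal to route the conflict to a human reviewer instead of auto-suggesting a fix.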

What we learned

  • Elasticsearch's aggregation framework is incredibly powerful for real-time document analytics
  • In our experience, multi-step agent reasoning handles complex tasks like conflict detection far better than a single-prompt approach
  • The gap between "finding problems" and "suggesting solutions" is where real value lives

What's next

  • Integration with document management systems (SharePoint, Google Drive)
  • Automated policy update workflows
  • Multi-language support for global compliance
