Autonomous Medical Coding

Landing page
Streamlit interface to upload Excel sheets
Pinecone Vectors
Logs of Excel sheet processing
Docker containers
GKE Workloads

Inspiration

The healthcare industry faces significant challenges with manual medical coding. It's a time-consuming, intricate process requiring specialized knowledge, often leading to bottlenecks in billing cycles and potential revenue loss due to errors. Witnessing this inefficiency, and inspired by the potential of AI to streamline complex workflows (perhaps spurred by Commure's hackathon theme), we decided to build an "Autonomous Medical Coding" system. Our goal was to leverage AI to make finding the correct ICD and CPT codes faster and more accurate, simply by describing the condition or procedure.

What it does

This project provides a foundational system for autonomous medical coding. Users upload standardized Excel files containing medical codes (ICD or CPT) and their descriptions through a simple web interface. The backend service processes each file, extracts the descriptions, generates vector embeddings with Google's Vertex AI text-embedding-004 model, and stores the embeddings together with metadata (code, description, type) in dedicated Pinecone vector database indexes. The result is a searchable knowledge base on which future tools can retrieve medical codes from natural language descriptions.
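The ingestion step described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `build_records` and `batched` helpers, the `ICD-<code>` id scheme, and the batch size are assumptions, and `fake_embed` is an offline placeholder standing in for the Vertex AI embedding call. The record layout mirrors the id/values/metadata shape that Pinecone upserts accept.

```python
from typing import Callable, Iterator

# Hypothetical row shape: one (code, description) pair per spreadsheet row.
Row = tuple[str, str]
Embedder = Callable[[list[str]], list[list[float]]]


def build_records(rows: list[Row], code_type: str, embed: Embedder) -> list[dict]:
    """Turn (code, description) rows into id/values/metadata records."""
    descriptions = [description for _, description in rows]
    vectors = embed(descriptions)  # expected: one vector per description, same order
    if len(vectors) != len(rows):
        raise ValueError("embedder returned a different number of vectors")
    return [
        {
            "id": f"{code_type}-{code}",  # assumed id scheme
            "values": vector,
            "metadata": {"code": code, "description": description, "type": code_type},
        }
        for (code, description), vector in zip(rows, vectors)
    ]


def batched(records: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield fixed-size slices so each upsert request stays small."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def fake_embed(texts: list[str]) -> list[list[float]]:
    # Placeholder so the sketch runs offline; NOT a real embedding model.
    return [[float(len(text)), float(text.count(" "))] for text in texts]


if __name__ == "__main__":
    rows = [
        ("E11.9", "Type 2 diabetes mellitus without complications"),
        ("I10", "Essential (primary) hypertension"),
    ]
    for batch in batched(build_records(rows, "ICD", fake_embed), size=100):
        # In a real service each batch would be passed to the index's upsert call.
        print(len(batch), [record["id"] for record in batch])
```

Passing the embedder in as a callable keeps the record-building logic testable without cloud credentials; the real embedding client would be swapped in at the call site.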

How we built it

We constructed a microservice architecture consisting of:
Frontend: A Streamlit application for user interaction (file upload, code type selection) with basic authentication.
Backend API: A FastAPI application handling file processing, embedding generation via Vertex AI (TextEmbeddingModel), and data storage.
Data Processing: Pandas for reading and cleaning Excel data.
Vector Database: Pinecone for storing and indexing the generated embeddings and metadata, using separate indexes for ICD and CPT codes.
Containerization: Dockerfiles were created for both services.
Deployment: The system was deployed to Google Kubernetes Engine (GKE).
Orchestration & Networking: Kubernetes manifests (Deployment, Service, Secret, ServiceAccount) define the deployment. Istio (Gateway, VirtualService) manages ingress traffic, routing, and TLS termination (using Cert-Manager/Let's Encrypt).
Authentication: Kubernetes Secrets manage credentials, and Google Cloud Workload Identity provides secure access from the backend to Vertex AI.
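Two backend responsibilities from the architecture above, cleaning the uploaded rows and routing each code type to its own index, might look roughly like the following. This is a dependency-free sketch under assumptions: the project used Pandas for cleaning, whereas plain dictionaries stand in here, and the column names (`code`, `description`) and index names (`icd-codes`, `cpt-codes`) are hypothetical.

```python
# Hypothetical index names; the write-up only says ICD and CPT use separate indexes.
INDEX_BY_CODE_TYPE = {"ICD": "icd-codes", "CPT": "cpt-codes"}


def index_for(code_type: str) -> str:
    """Map the code type chosen in the frontend to a vector index name."""
    key = code_type.strip().upper()
    if key not in INDEX_BY_CODE_TYPE:
        raise ValueError(f"unsupported code type: {code_type!r}")
    return INDEX_BY_CODE_TYPE[key]


def clean_rows(raw_rows: list[dict]) -> list[tuple[str, str]]:
    """Strip whitespace, drop incomplete rows, and keep the first row per code."""
    seen: set[str] = set()
    cleaned: list[tuple[str, str]] = []
    for row in raw_rows:
        # Assumed column names; missing or empty cells become empty strings.
        code = str(row.get("code") or "").strip()
        description = str(row.get("description") or "").strip()
        if not code or not description or code in seen:
            continue
        seen.add(code)
        cleaned.append((code, description))
    return cleaned


if __name__ == "__main__":
    raw = [
        {"code": " I10 ", "description": "Essential (primary) hypertension "},
        {"code": "I10", "description": "duplicate row"},
        {"code": None, "description": "row without a code"},
    ]
    print(index_for("icd"), clean_rows(raw))
```

Rejecting unknown code types early keeps a mislabeled upload from landing in the wrong index.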

Challenges we ran into

Building and deploying this system involved overcoming several hurdles:
Environment Setup: Initial Python package installations (NumPy) required troubleshooting C++ build dependencies and Xcode Command Line Tools on macOS. Ensuring Python version compatibility was key.
Cloud SDKs & Libraries: Navigating breaking changes in the Pinecone client API and subtle differences in importing/using Vertex AI models and library versions.
Vertex AI Authentication/Permissions: Debugging "404 Model not found" errors required systematically checking service account permissions, API enablement, and configurations.
Networking (Docker & K8s): Understanding container communication locally (localhost vs. network bridge) and within Kubernetes (ClusterIP Services, DNS) was crucial.
GKE/Istio Configuration: Setting up Istio correctly (Ingress, VirtualService routing, LoadBalancer IP, sidecar injection) required careful configuration and debugging using logs and kubectl.
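The localhost-versus-cluster-DNS distinction from the networking item can be made concrete with a small helper that picks the backend address for the frontend. This is only a sketch: the `BACKEND_*` variable names, the default service name, and the default port are assumptions. `KUBERNETES_SERVICE_HOST` is a variable Kubernetes sets inside pods, and the `<service>.<namespace>.svc.cluster.local` form assumes the default cluster domain.

```python
import os
from typing import Mapping, Optional


def backend_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the backend address for local, Docker, or Kubernetes runs."""
    env = os.environ if env is None else env

    # An explicit override wins, e.g. a Docker network alias such as
    # http://backend:8000 (hypothetical) when both containers share a bridge network.
    explicit = env.get("BACKEND_URL")
    if explicit:
        return explicit.rstrip("/")

    port = env.get("BACKEND_PORT", "8000")  # assumed default port

    # Inside a pod, address the ClusterIP Service by its DNS name, not localhost.
    if "KUBERNETES_SERVICE_HOST" in env:
        service = env.get("BACKEND_SERVICE", "backend")  # hypothetical Service name
        namespace = env.get("BACKEND_NAMESPACE", "default")
        return f"http://{service}.{namespace}.svc.cluster.local:{port}"

    # Plain local development: both processes run on the same machine.
    return f"http://localhost:{port}"


if __name__ == "__main__":
    print(backend_base_url({}))
    print(backend_base_url({"KUBERNETES_SERVICE_HOST": "10.0.0.1"}))
```

Inside a container, `localhost` refers to the container itself, which is why a frontend container cannot reach a separate backend container that way.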

Accomplishments that we're proud of

Successfully building an end-to-end pipeline from file upload to vector storage.
Integrating multiple cutting-edge technologies: Streamlit, FastAPI, Vertex AI, Pinecone, Docker, GKE, Istio, and Cert-Manager.
Implementing a secure and scalable deployment on GKE using Workload Identity and Kubernetes best practices.
Overcoming significant debugging challenges across different layers (code, libraries, cloud services, networking, orchestration).
Creating a functional foundation for a powerful AI-driven medical coding tool.

What we learned

This hackathon was a deep dive into several cutting-edge technologies and concepts:
Retrieval-Augmented Generation (RAG): Understanding the "embedding" part as a foundation for future RAG systems.
Vector Databases & Embeddings: Practical experience using Pinecone and Vertex AI embedding models.
Microservice Architecture: Building and connecting decoupled frontend/backend services.
Containerization & Orchestration: Using Docker and deploying to GKE with Kubernetes manifests.
Service Mesh & Ingress: Implementing Istio for traffic management and security.
Cloud Authentication: Configuring Workload Identity for secure GCP access.
Debugging Complex Systems: Gaining experience troubleshooting issues across distributed components and cloud environments.

What's next for Autonomous Medical Coding

The current system provides the essential data ingestion and embedding pipeline. Future steps could include:
Developing a query interface (API or UI) to search the Pinecone indexes using natural language descriptions or partial codes.
Integrating a generative AI model (like Gemini) to synthesize results or provide conversational interaction based on retrieved codes.
Implementing more sophisticated chunking or data preprocessing strategies for different types of medical documents (beyond simple code lists).
Evaluating and fine-tuning embedding models for optimal accuracy in the medical domain.
Adding features for code validation, mapping between code systems, or identifying potential coding conflicts.
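The proposed query interface comes down to a nearest-neighbor lookup: embed the description, then rank stored vectors by similarity. The sketch below shows that ranking in memory with cosine similarity. It is a stand-in for a vector database query, not the project's code: the `top_k` helper and the two-dimensional toy vectors are assumptions made so the example runs without any service.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector: list[float], records: list[dict], k: int = 3) -> list[tuple[float, dict]]:
    """Return the k records most similar to the query, best match first."""
    scored = [(cosine(query_vector, record["values"]), record["metadata"]) for record in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy vectors for illustration only; real embeddings have hundreds of dimensions.
    records = [
        {"values": [1.0, 0.0], "metadata": {"code": "I10", "type": "ICD"}},
        {"values": [0.0, 1.0], "metadata": {"code": "E11.9", "type": "ICD"}},
        {"values": [1.0, 1.0], "metadata": {"code": "I11.9", "type": "ICD"}},
    ]
    for score, metadata in top_k([1.0, 0.1], records, k=2):
        print(round(score, 3), metadata["code"])
```

A hosted vector index performs the same ranking at scale with approximate search, so this brute-force version is mainly useful for tests and small evaluations.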

Built With

cert-manager
docker
fastapi
gcp
gke
istio
kubernetes
let'sencrypt
pandas
pinecone
python
streamlit
vertexai

Updates

Jay Gala started this project — May 03, 2025 05:56 PM EDT
