Inspiration

Drug discovery is slow and resource-intensive: bringing a single drug to market often takes 10–15 years and billions of dollars. Despite advances in AI-driven molecular generation, most systems remain research-focused and lack usability, reproducibility, and integration into practical research workflows. Many generative models suffer from limited diversity, unstable training, or chemically invalid outputs, so researchers still struggle to rapidly generate diverse, optimized, and biologically meaningful drug candidates. Pindora Shield was built to bridge the gap between academic AI models and real-world drug discovery needs with a biologically grounded, end-to-end pipeline for intelligent molecule generation and evaluation.

What it does

Pindora Shield is an AI-powered drug candidate generation and evaluation system. It identifies disease-relevant protein targets, retrieves known molecules, generates novel candidates with a generative model, evaluates them across multiple pharmacological properties, filters out low-potential candidates, and produces a ranked comparative analysis against existing drugs. The workflow is designed to ensure chemical validity, structural diversity, biological relevance, and multi-property optimization.

How we built it

The system operates in six major stages. First, Disease and Target Identification uses curated biomedical sources such as ChEMBL and Open Targets to ensure biological grounding before any molecule is generated. Second, Molecular Retrieval and SMILES Processing standardizes known molecules into SMILES representation for consistent downstream analysis. Third, De Novo Molecule Generation uses TangGen, a GAN-inspired framework designed to generate diverse, chemically valid alternative SMILES while exploring new regions of chemical space. Fourth, Multi-Model Molecular Evaluation scores each generated SMILES with five independent predictive models covering IC50 prediction, target association score, clinical phase likelihood, target relevance, and exploratory pharmacological profiling. Fifth, Filtering and Shortlisting combines the predicted outputs to remove low-potential molecules and retain pharmacologically relevant candidates. Sixth, Comparative Ranking compares the shortlisted molecules with existing medicines and generates a ranked analysis to support research-level decision-making.

The system is modular: components are isolated, the predictive models are independent, and the backend APIs are stateless. Graceful error handling and frontend validation add reliability and fault tolerance.
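
The filtering and ranking stages can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the project's actual implementation: the `Candidate` fields, the cutoffs in `shortlist`, the equal-weight composite in `score`, and the toy SMILES and numbers, which are placeholders rather than model outputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one generated molecule and its predictions."""
    smiles: str
    ic50_nm: float             # predicted IC50 in nanomolar; lower is more potent
    target_association: float  # assumed to lie in [0, 1]; higher is better
    phase_likelihood: float    # assumed to lie in [0, 1]; higher is better


def shortlist(candidates, max_ic50_nm=1000.0, min_association=0.5):
    """Drop candidates that miss either (hypothetical) cutoff."""
    return [
        c for c in candidates
        if c.ic50_nm <= max_ic50_nm and c.target_association >= min_association
    ]


def score(c):
    """Illustrative composite: equal-weight mean of three terms in [0, 1]."""
    potency = 1.0 / (1.0 + c.ic50_nm / 100.0)  # maps IC50 into (0, 1]; 100 nM gives 0.5
    return (potency + c.target_association + c.phase_likelihood) / 3.0


def rank(candidates):
    """Sort best-first by the composite score."""
    return sorted(candidates, key=score, reverse=True)


# Placeholder inputs, not real predictions.
generated = [
    Candidate("CCO", ic50_nm=50.0, target_association=0.8, phase_likelihood=0.6),
    Candidate("CCN", ic50_nm=5000.0, target_association=0.9, phase_likelihood=0.7),
    Candidate("CCC", ic50_nm=200.0, target_association=0.3, phase_likelihood=0.9),
    Candidate("c1ccccc1O", ic50_nm=100.0, target_association=0.7, phase_likelihood=0.4),
]

ranked = rank(shortlist(generated))
for c in ranked:
    print(f"{c.smiles}\t{score(c):.3f}")
```

Hard cutoffs followed by a single composite score are only one way to combine predictions; the weights and thresholds would need to be tuned per target.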

Challenges we ran into

One major challenge was preventing mode collapse in the generative model while maintaining structural diversity. Another was ensuring that generated SMILES strings were chemically valid. Multi-property optimization introduced conflicts: improving potency could degrade other drug-relevant attributes. Running several independent predictive models without letting one failure cascade into the others required careful system design. Turning research-level AI models into a usable, researcher-friendly system also required significant architectural refinement.
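
One common way to keep a failing predictor from taking down the rest of an evaluation is to call each model inside its own error boundary. The sketch below shows that pattern under stated assumptions: `evaluate`, `fake_ic50`, and `broken_model` are hypothetical names, and the stand-in functions are not the project's trained models.

```python
from typing import Callable, Dict, Optional

Predictor = Callable[[str], float]


def evaluate(smiles: str, predictors: Dict[str, Predictor]) -> Dict[str, Optional[float]]:
    """Run each predictor independently; a failure yields None for that property only."""
    results: Dict[str, Optional[float]] = {}
    for name, predict in predictors.items():
        try:
            results[name] = float(predict(smiles))
        except Exception:
            # Isolate the failure so the remaining predictors still run.
            results[name] = None
    return results


# Hypothetical stand-ins for trained models.
def fake_ic50(smiles: str) -> float:
    return 10.0 * len(smiles)


def broken_model(smiles: str) -> float:
    raise RuntimeError("model unavailable")


report = evaluate("CCO", {"ic50_nm": fake_ic50, "phase_likelihood": broken_model})
print(report)
```

Returning `None` for a failed property lets downstream filtering decide whether a partially scored molecule is kept or dropped.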

Accomplishments that we're proud of

We built a complete end-to-end AI-driven drug discovery pipeline. We implemented TangGen to automate lead optimization by generating structural variants. We designed a modular multi-model evaluation framework for improved interpretability and robustness. We ensured biological grounding through disease–protein interaction analysis before molecule generation. We deployed a live working system with structured failure handling and ranking-based decision support.

What we learned

We learned that drug discovery is a multi-dimensional optimization problem requiring biological context before generative modeling. Independent specialized models often provide better interpretability than monolithic multi-task systems. Chemical validity checks and defensive system design are essential for real-world deployment. Practical AI systems require robust architecture beyond strong model performance.
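
A standard way to reason about conflicting objectives is Pareto dominance: keep every candidate that no other candidate beats on all properties at once. This is a generic illustration of the multi-dimensional trade-off, not a description of how Pindora Shield ranks molecules; the molecule names and score vectors are invented, and every objective is assumed to be oriented so that higher is better.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored):
    """Return the names whose score vectors are not dominated by any other entry."""
    return [
        name
        for name, vec in scored.items()
        if not any(
            dominates(other_vec, vec)
            for other_name, other_vec in scored.items()
            if other_name != name
        )
    ]


# Invented (potency, safety-like) scores for illustration only.
scored = {
    "mol_a": (0.9, 0.2),
    "mol_b": (0.6, 0.6),
    "mol_c": (0.5, 0.5),
    "mol_d": (0.1, 0.9),
}

front = pareto_front(scored)
print(front)
```

Here `mol_c` is dropped because `mol_b` is better on both objectives, while the other three each represent a different trade-off that a researcher would have to choose between.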

Built With

  • azureopenai
  • chembelapi
  • fastapi
  • gan
  • opentargetapi
  • pydantic
  • python
  • pytorch
  • rdkit
  • transformer