Inspiration
This project was inspired by the goal of reducing hallucinations in LLMs, with an eye toward use cases in the medical, legal, and financial industries, where hallucinated information carries a large potential downside.
What it does
This project implements a system that integrates a large language model with a knowledge graph to enable more factual and interpretable natural language generation. The system can take a natural language query, retrieve relevant facts from the knowledge graph, and use those facts to guide the language model to generate an informative response.
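The overall flow can be sketched as follows. This is a minimal illustration, not the actual implementation: the toy triplet list and keyword matching stand in for the real graph database and retriever, and the LLM call is left as a placeholder.

```python
# Sketch of knowledge-graph-grounded generation (illustrative only).

TRIPLETS = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("warfarin", "is_a", "anticoagulant"),
]

def retrieve_facts(query: str, triplets=TRIPLETS):
    """Return triplets whose subject or object appears in the query."""
    words = set(query.lower().split())
    return [t for t in triplets if t[0] in words or t[2] in words]

def build_prompt(query: str, facts) -> str:
    """Ground the LLM with retrieved facts before asking the question."""
    fact_lines = "\n".join(f"- {s} {p} {o}" for s, p, o in facts)
    return (
        "Answer using ONLY the facts below.\n"
        f"Facts:\n{fact_lines}\n"
        f"Question: {query}"
    )

facts = retrieve_facts("What does aspirin interact with?")
prompt = build_prompt("What does aspirin interact with?", facts)
# The prompt would then be sent to the LLM (call not shown).
```

Because the model is instructed to answer only from retrieved facts, the response stays traceable back to specific graph edges, which is what makes the output interpretable.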
How we built it
- Created a knowledge graph using an LLM, containing facts extracted from sources such as research papers
- Built a retriever module that takes a natural language query and retrieves relevant facts from the knowledge graph
- Built a frontend to visualize the graph and compare the performance of different retrieval methods
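The first step, triplet extraction with an LLM, can be sketched roughly as below. The prompt wording and output format are illustrative assumptions, and `call_llm` is a placeholder for the actual model call:

```python
# Hypothetical extraction prompt; the real prompt may differ.
EXTRACTION_PROMPT = (
    "Extract factual (subject, predicate, object) triplets from the text.\n"
    "Output one triplet per line as: subject | predicate | object\n\n"
    "Text: {text}"
)

def parse_triplets(llm_output: str):
    """Parse 'subject | predicate | object' lines into tuples,
    skipping malformed lines."""
    triplets = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triplets.append(tuple(parts))
    return triplets

# Usage (call_llm is a placeholder for the real LLM client):
# triplets = parse_triplets(call_llm(EXTRACTION_PROMPT.format(text=chunk)))
```

Parsing defensively and dropping malformed lines matters here, since LLM output does not always follow the requested format.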
Challenges we ran into
Ingestion:
- Generating knowledge graphs with LLMs takes a long time, and Claude 2's long context window doesn't help extract more triplets, because triplet extraction still requires chunking the input
- We need some middleware, possibly backed by a knowledge graph, for better named entity extraction
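The chunking step can be sketched like this (the chunk size and overlap values are arbitrary choices for illustration):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50):
    """Split text into overlapping chunks so entity mentions that
    straddle a chunk boundary still appear whole in one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# Each chunk is sent to the LLM independently, which is why a longer
# context window does not speed things up: the number of LLM calls
# scales with document length, not with context size.
# all_triplets = [t for c in chunk_text(doc) for t in extract_triplets(c)]
```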
Querying:
- The LLM can't reliably generate Cypher queries to run on the graph database
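One mitigation is to run cheap syntactic checks on the generated query and fall back to vector retrieval when they fail, rather than executing arbitrary LLM output. This is a sketch with hypothetical helper names (`generate_cypher`, `run_cypher`, `vector_fallback` are placeholders, not our actual functions):

```python
def looks_like_valid_cypher(query: str) -> bool:
    """Cheap sanity checks on an LLM-generated query.
    Catches obvious garbage, not semantic errors."""
    q = query.strip().upper()
    has_match = q.startswith("MATCH")
    has_return = "RETURN" in q
    balanced = q.count("(") == q.count(")") and q.count("[") == q.count("]")
    return has_match and has_return and balanced

def safe_graph_query(nl_query, generate_cypher, run_cypher, vector_fallback):
    """Try LLM-generated Cypher; fall back to vector retrieval on failure."""
    cypher = generate_cypher(nl_query)  # placeholder: LLM call
    if looks_like_valid_cypher(cypher):
        try:
            return run_cypher(cypher)   # placeholder: graph DB client
        except Exception:
            pass                        # malformed query rejected by the DB
    return vector_fallback(nl_query)    # placeholder: vector retriever
```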
Retrieval:
- For certain questions, knowledge graph retrieval underperforms traditional vector-based retrieval
Accomplishments that we're proud of
- Built an end-to-end flow: generating a knowledge graph, ingesting it into a graph database, visualizing the graph, and querying it from a frontend
What's next for Data Monkeys
- More in-depth exploration of the different areas around integrating knowledge graphs with LLMs
- Optimizing building and querying speed
Built With
- flask
- javascript
- llamaindex
- nebulagraph
- next.js
- python