What is DevDocs?
DevDocs is an AI-driven platform that delivers real-time, precise answers from company documentation. Unlike other AI tools that often rely on outdated syntax or generic data, DevDocs stays current. Whether you need a code snippet, syntax explanation, or an API endpoint, DevDocs ensures you get what you need efficiently.
Key Features
- AI-Powered Documentation Search: Uses Llama 3.1 and MODUS inference to generate contextually accurate responses.
- Custom Company Integration: Developers can add their own companies and create personalized AI chatbots.
- Instant Accessibility: Simple user interface with seamless navigation.
- Real-Time Results: Eliminates the need for manual browsing of extensive documentation.
For example, if you’re working with the ElevenLabs API, instead of tediously navigating their documentation, you can simply select "ElevenLabs" on our platform and interact with an AI chatbot that has been trained on the extracted data from their docs. DevDocs simplifies the process, making your workflow seamless and efficient.

How DevDocs Works
Frontend Architecture
The frontend of DevDocs is built using React.js and deployed on Vercel for scalability and ease of use. The homepage provides an intuitive interface with a simple layout:
1. Homepage:
- Features a prominent "Get Started with DevDocs" button.
- Clicking this button redirects users to a page listing 10+ companies such as Hypermode, ElevenLabs, CrewAI, and Neo4j.

2. Company Selection:
- Each company has a "Chat with AI" button.
- Clicking the button navigates to a dedicated chat interface for the selected company at a `/companyname` URL.

3. Adding a New Company:
- Users can click the "Add Your Company" button on the /chat page.
- This redirects them to the DevDocs GitHub repository, where they can:
- Follow the instructions to deploy DevDocs locally.
- Add their company details to create a custom AI chatbot for their documentation.
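As a hypothetical sketch of how the frontend could derive the chat route from a selected company name (the `companyChatPath` helper and the lowercase-slug convention are assumptions for illustration, not taken from the DevDocs source):

```typescript
// Hypothetical helper: derive the chat route for a selected company.
// The lowercase-slug convention mirrors the "/companyname" URLs described above.
export function companyChatPath(companyName: string): string {
  const slug = companyName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ""); // drop spaces and punctuation
  return "/" + slug;
}
```

For example, `companyChatPath("ElevenLabs")` yields `/elevenlabs`, which the router can map to that company's chat interface.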
Backend Workflow
The backend architecture is where the true magic of DevDocs happens. It is designed to dynamically fetch, process, and deliver responses in real-time.
Step-by-Step Flow
Welcome Message Generation:
- When the user navigates to a company's chatbot page, the frontend triggers the backend function `generateChatbotWelcome`.
- This function receives the company name as a parameter and initiates the following processes:
a. Neo4j Database Query:
- Searches the Neo4j graph database for the company’s metadata and documentation links.
- Retrieves relevant details such as company overview, key features, and helpful URLs for getting started.
b. Welcome Message Creation:
- Leverages the Llama 3.1 model with MODUS inference to process the retrieved data.
- MODUS inference ensures optimized, context-aware responses by refining Llama's outputs using advanced inference techniques.
- Generates a concise, professional welcome message that:
- Introduces the company.
- Highlights its primary functionalities.
- Provides documentation links for quick access.

c. Response Delivery:
- The generated welcome message is sent back to the frontend for display.
Interactive Chat Interface:
- Once the welcome message is displayed, users can interact with the chatbot.
- Queries entered by users are processed and responses are generated in real-time.
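Modus exposes exported backend functions over a GraphQL endpoint (the endpoints configuration appears later in this post), so the frontend call for the welcome message could look roughly like this sketch; the exact query shape, endpoint URL, and response handling are assumptions:

```typescript
// Sketch: build the GraphQL query the frontend might send to invoke the
// generateChatbotWelcome backend function. The endpoint URL and response
// shape are assumptions for illustration.
export function buildWelcomeQuery(companyName: string): string {
  // Escape embedded quotes so the company name is a valid GraphQL string.
  const safeName = companyName.replace(/"/g, '\\"');
  return `query { generateChatbotWelcome(companyName: "${safeName}") }`;
}

export async function fetchWelcome(companyName: string): Promise<string> {
  const res = await fetch("/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: buildWelcomeQuery(companyName) }),
  });
  const json = await res.json();
  return json.data.generateChatbotWelcome;
}
```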
Example Use Case
1. The user selects "Hypermode" and clicks "Chat with AI."
2. The backend retrieves Hypermode's details from Neo4j and generates the corresponding welcome message.
3. The user can ask the chat assistant anything about Hypermode, and it responds with clear, step-by-step guidance.

How We Built It
Building DevDocs required a combination of cutting-edge AI technologies, robust backend infrastructure, and an intuitive frontend design. This section provides a detailed explanation of how each component was constructed, focusing on three core stages: Data Scraping, MODUS Backend, and Frontend.
1. Scraping Data
Configuring the Environment
The process begins by setting up the environment variables to securely manage sensitive information such as:
- Neo4j database credentials (username and password).
- API keys for accessing external tools.
- Configuration for the embeddings model.
Connecting to Neo4j Database
We use Neo4j, a graph database, to store and manage our data in a structured knowledge graph format. The steps are as follows:
Database Initialization:
- Use Neo4j credentials to establish a connection.
- Erase any existing schema to start with a clean slate.
- Define a new schema designed to store company documentation data and relationships efficiently.
Schema Design:
- The central node represents the company and includes attributes such as:
- Name
- Logo
- Brief description
- Relationships from the company node:
- Owns URLs: Links the company to its documentation URLs.
- URLs Own Documents: Links each URL to the documents it contains.
- Documents Own Scraped Data: Links each document to its respective content chunks (embeddings).
- Documents include metadata such as:
- Document ID
- Timestamp
- Link to the scraped content.
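A rough Cypher sketch of the schema described above (the labels and relationship names follow the ones that appear in the Cypher queries later in this post; the property names and sample values are assumptions):

```cypher
// Central company node with its attributes
CREATE (c:Company {name: "ElevenLabs", logo: "…", description: "…"})

// The company owns its documentation URLs
CREATE (u:URL {url: "…"})
CREATE (c)-[:OWNS]->(u)

// Each URL has documents, and each document contains scraped chunks
CREATE (d:Document {id: "…", timestamp: datetime()})
CREATE (u)-[:HAS_DOCUMENT]->(d)
CREATE (ch:Chunk {content: "…", embedding: "…"})
CREATE (d)-[:CONTAINS]->(ch)
```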
Data Extraction
We use the Crawl4AI GitHub repository to scrape data from company documentation URLs. Crawl4AI is highly efficient for extracting structured data from websites and converting it into Markdown text. The process involves:
- Inputting the list of company documentation URLs (100+ URLs across 10+ companies).
- Extracting data and saving it in Markdown format for consistent processing.
- Storing the extracted data into the Neo4j database as a knowledge graph.
Data Processing: Embeddings
To make the scraped data searchable and AI-ready:
- Chunking:
- The scraped text is divided into chunks of 550 characters for efficient storage and processing.
- Embeddings model:
- Use the Sentence Transformers library from Hugging Face to generate embeddings for each chunk.
- The embeddings represent the semantic meaning of the text and are essential for efficient retrieval.
This processed data is stored in Neo4j, forming the foundation for our real-time query system.
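The chunking step above can be sketched as a simple fixed-size splitter (a minimal sketch; a real pipeline may instead split on sentence or token boundaries):

```typescript
// Split scraped Markdown text into fixed-size chunks for embedding.
// 550 characters matches the chunk size mentioned above; the boundary
// handling here is a simplification.
export function chunkText(text: string, chunkSize: number = 550): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}
```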
2. MODUS Backend
The backend is the core engine of DevDocs, integrating Neo4j with advanced AI capabilities. Here's how it works:
Query Processing: RAG Search
We use RAG (Retrieval-Augmented Generation) Search to handle user queries:
- Query Embeddings:
- When a user submits a query, it is converted into embeddings using the same embeddings model used for text chunk embeddings.
- Similarity Matching:
- The query embeddings are compared against the stored embeddings in the Neo4j database.
- Using cosine similarity, the most relevant chunks are retrieved (the top 10 for company-wide searches, or the single best match for URL-specific searches).
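The cosine similarity used for this ranking (computed inside the database via `gds.similarity.cosine` in the Cypher queries shown later) is equivalent to this small sketch:

```typescript
// Cosine similarity between a query embedding and a chunk embedding:
// dot(a, b) / (|a| * |b|). Returns 0 for mismatched or zero vectors.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```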
MODUS Functions

isURLPresent
This function checks whether a specific URL is present in the Neo4j database.
Code:

```typescript
export function isURLPresent(url: string): boolean {
  if (!url || url.trim().length === 0) {
    return false;
  }
  if (!isDatabaseConfigured()) {
    return false;
  }
  const vars = new neo4j.Variables();
  vars.set("url", url);
  const query = `
    MATCH (u:URL {url: $url})
    RETURN count(u) > 0 as exists
  `;
  const result = neo4j.executeQuery("neo4j", query, vars);
  if (result.Records.length > 0) {
    return result.Records[0].get("exists") === "true";
  }
  return false;
}
```
Explanation
- Validates the input `url` to ensure it is not empty.
- Checks whether the Neo4j database is configured using the helper function `isDatabaseConfigured`.
- Queries the database to check if the `URL` node exists.
- Returns `true` if the URL exists, otherwise `false`.
isCompanyPresent
This function checks whether a specific company is present in the Neo4j database.
Code:

```typescript
export function isCompanyPresent(companyName: string): boolean {
  if (!companyName || companyName.trim().length === 0) {
    return false;
  }
  if (!isDatabaseConfigured()) {
    return false;
  }
  const vars = new neo4j.Variables();
  vars.set("companyName", companyName);
  const query = `
    MATCH (c:Company {name: $companyName})
    RETURN count(c) > 0 as exists
  `;
  const result = neo4j.executeQuery("neo4j", query, vars);
  if (result.Records.length > 0) {
    return result.Records[0].get("exists") === "true";
  }
  return false;
}
```
Explanation
- Checks the input `companyName` for validity.
- Ensures the database is properly configured.
- Executes a query to determine if a `Company` node with the given name exists.
- Returns `true` if the company exists, otherwise `false`.
semanticSearchURL
This function performs a semantic search on a specific URL using a query string.
Code:

```typescript
export function semanticSearchURL(url: string, searchQuery: string): string {
  if (!url || !searchQuery || url.trim().length === 0 || searchQuery.trim().length === 0) {
    return createEmptySearchResult();
  }
  if (!isDatabaseConfigured()) {
    return createEmptySearchResult();
  }
  const queryEmbedding = getStringEmbedding(searchQuery);
  if (queryEmbedding.length === 0) {
    return createEmptySearchResult();
  }
  const embedStr = embeddingToString(queryEmbedding);
  const vars = new neo4j.Variables();
  vars.set("url", url);
  vars.set("embedding", embedStr);
  const cypherQuery = `
    MATCH (u:URL {url: $url})-[:HAS_DOCUMENT]->(d:Document)-[:CONTAINS]->(ch:Chunk)
    WHERE ch.embedding IS NOT NULL
    WITH ch, gds.similarity.cosine(
      ch.embedding,
      [x IN split($embedding, ',') | toFloat(x)]
    ) AS similarity
    ORDER BY similarity DESC
    LIMIT 1
    WITH $url as url, ch.content as content, similarity as score
    RETURN {
      url: url,
      content: content,
      score: toFloat(score)
    } as result
  `;
  const result = neo4j.executeQuery("neo4j", cypherQuery, vars);
  if (result.Records.length > 0) {
    return result.Records[0].get("result");
  }
  return createEmptySearchResult();
}
```
Explanation
- Validates the URL and query string inputs.
- Converts the query into embeddings (via the `getStringEmbedding` helper) using the same embeddings model used for the text chunks.
- Constructs a Cypher query that matches chunks associated with the URL and computes cosine similarity between each chunk's embedding and the query embedding.
- Returns the single most relevant chunk, along with its source URL and similarity score.
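The `embeddingToString` helper used above is not shown in this post; a plausible sketch (an assumption, chosen to match the `split($embedding, ',')` parsing in the Cypher query) is:

```typescript
// Serialize an embedding vector as a comma-separated string so it can be
// passed to Cypher and re-parsed with split($embedding, ','). This helper
// is not shown in the post; the implementation below is an assumption.
export function embeddingToString(embedding: number[]): string {
  return embedding.join(",");
}

// Inverse operation, mirroring the Cypher expression
// [x IN split($embedding, ',') | toFloat(x)]
export function stringToEmbedding(s: string): number[] {
  return s.split(",").map((x) => parseFloat(x));
}
```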
semanticSearchCompany
This function searches for semantically relevant content for a company based on a query.
Code:

```typescript
export function semanticSearchCompany(companyName: string, searchQuery: string): string {
  if (!companyName || !searchQuery || companyName.trim().length === 0 || searchQuery.trim().length === 0) {
    return JSON.stringify<SearchResult[]>([]);
  }
  if (!isDatabaseConfigured()) {
    return JSON.stringify<SearchResult[]>([]);
  }
  const queryEmbedding = getStringEmbedding(searchQuery);
  if (queryEmbedding.length === 0) {
    return JSON.stringify<SearchResult[]>([]);
  }
  const embedStr = embeddingToString(queryEmbedding);
  const vars = new neo4j.Variables();
  vars.set("companyName", companyName);
  vars.set("embedding", embedStr);
  const cypherQuery = `
    MATCH (c:Company {name: $companyName})-[:OWNS]->(u:URL)-[:HAS_DOCUMENT]->(d:Document)-[:CONTAINS]->(ch:Chunk)
    WHERE ch.embedding IS NOT NULL
    WITH u, ch, gds.similarity.cosine(
      ch.embedding,
      [x IN split($embedding, ',') | toFloat(x)]
    ) AS similarity
    ORDER BY similarity DESC
    LIMIT 10
    WITH u.url as url, ch.content as content, similarity as score
    RETURN collect({
      url: url,
      content: content,
      score: toFloat(score)
    }) as results
  `;
  const result = neo4j.executeQuery("neo4j", cypherQuery, vars);
  if (result.Records.length > 0) {
    return result.Records[0].get("results");
  }
  return JSON.stringify<SearchResult[]>([]);
}
```
Explanation
- Similar to `semanticSearchURL`, but searches across all documents associated with a company.
- Validates the company name and query string inputs.
- Converts the query into embeddings (via `getStringEmbedding`) using the same embeddings model used for the text chunks.
- Matches every chunk reachable from the company node and computes cosine similarity against the query embedding.
- Returns the top 10 most relevant chunks, each with its source URL, content, and similarity score.
vectorRagSearchCompany
This function generates a summarized response for a company-specific query using semantic search results.
Code:

```typescript
export function vectorRagSearchCompany(companyName: string, query: string): string {
  if (!companyName || !query || companyName.trim().length === 0 || query.trim().length === 0) {
    return "Please provide both company name and query.";
  }
  const resultsJson = semanticSearchCompany(companyName, query);
  const results = JSON.parse<SearchResult[]>(resultsJson);
  if (!results || results.length === 0) {
    return `No relevant information found for "${query}" in company "${companyName}"'s documents.`;
  }
  let context = "";
  for (let i = 0; i < results.length; i++) {
    context += "\nSource URL: " + results[i].url;
    context += "\nContent: " + results[i].content + "\n";
  }
  const model = models.getModel<OpenAIChatModel>("text-generator");
  const instruction = `You are a technical assistant...`;
  const input = model.createInput([
    new SystemMessage(instruction),
    new UserMessage(query)
  ]);
  const output = model.invoke(input);
  return output.choices[0].message.content.trim();
}
```
Explanation
- Validates that both `companyName` and `query` are non-empty.
- Performs a semantic search over the company's documents using `semanticSearchCompany(companyName, query)`.
- If no relevant results are found, returns an error message.
- Builds a context string from the URLs and contents of the results.
- Uses the `text-generator` model with specific instructions to generate a concise, code-focused response.
- Returns the generated response based on the provided context.
vectorRagSearchURL
This function performs a question-and-answer process using a given URL and query string. It validates inputs, retrieves relevant context from a Neo4j database, and generates a response using a text-generation model.
Code:

```typescript
export function vectorRagSearchURL(url: string, query: string): string {
  if (!url || !query || url.trim().length === 0 || query.trim().length === 0) {
    return "Please provide both URL and query.";
  }
  const resultJson = semanticSearchURL(url, query);
  const result = JSON.parse<SearchResult>(resultJson);
  if (!result || result.url === "") {
    return `No relevant information found for "${query}" in the provided URL.`;
  }
  const context = "\nSource URL: " + result.url +
    "\nContent: " + result.content + "\n";
  const model = models.getModel<OpenAIChatModel>("text-generator");
  const instruction = `You are a helpful assistant that answers questions based on webpage content.
Reply to the user question using ONLY information from the provided CONTEXT.
The response should start with a short and concise sentence, followed by a more detailed explanation if relevant.
If the context doesn't contain enough information to fully answer the question, acknowledge this in your response.

CONTEXT:
"""
${context}
"""`;
  const input = model.createInput([
    new SystemMessage(instruction),
    new UserMessage(query)
  ]);
  input.temperature = 0.5;
  input.maxTokens = 500;
  const output = model.invoke(input);
  return output.choices[0].message.content.trim();
}
```
Explanation
- Validates that both `url` and `query` are non-empty.
- Performs a semantic search on the provided URL using `semanticSearchURL(url, query)`.
- If no relevant information is found, returns an error message.
- Builds a context string from the URL and its content.
- Uses the `text-generator` model to generate a concise answer based only on the provided context.
- Returns the generated response.
generateChatbotWelcome
This function creates a dynamic welcome message for any specific documentation chatbot.
Code:

```typescript
export function generateChatbotWelcome(companyName: string): string {
  if (!companyName || companyName.trim().length === 0) {
    return "Please provide a company name.";
  }
  const companyOverviewResults = semanticSearchCompany(companyName, "company overview description mission main business");
  const overviewResults = JSON.parse<SearchResult[]>(companyOverviewResults);
  const docsUrlResults = semanticSearchCompany(companyName, "documentation guides api reference urls links");
  const urlResults = JSON.parse<SearchResult[]>(docsUrlResults);
  let context: string = "";
  if (overviewResults && overviewResults.length > 0) {
    context += "\nCompany Overview:\n" + overviewResults[0].content;
  }
  if (urlResults && urlResults.length > 0) {
    context += "\nDocumentation URLs:\n";
    for (let i = 0; i < urlResults.length; i++) {
      context += urlResults[i].url + "\n" + urlResults[i].content + "\n";
    }
  }
  const model = models.getModel<OpenAIChatModel>("text-generator");
  const instruction = `Generate a welcoming message for a documentation chatbot...`;
  const input = model.createInput([
    new SystemMessage(instruction),
    new UserMessage("Generate a welcome message")
  ]);
  const output = model.invoke(input);
  return output.choices[0].message.content.trim();
}
```
Explanation
- Validates that `companyName` is provided and non-empty.
- Performs two semantic searches:
- One for the company overview (description, mission, and main business).
- Another for documentation URLs, guides, and references related to the company.
- If overview results are found, adds them to the context.
- If documentation URLs are found, adds them to the context as well.
- Uses the `text-generator` model to create a dynamic welcome message for the chatbot based on the gathered context.
- Returns the generated welcome message.
MODUS Inference with Llama 3.1
The retrieved chunks and user query are then processed by our text-generator:
"models": {
"minilm": {
"sourceModel": "sentence-transformers/all-MiniLM-L6-v2",
"provider": "hugging-face",
"connection": "hypermode"
},
"text-generator": {
"sourceModel": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"provider": "hugging-face",
"connection": "hypermode"
}
}
- Integration with Llama 3.1:
- We use the Llama 3.1 model for natural language understanding and response generation.
- MODUS Inference:
- MODUS inference enhances the capabilities of Llama 3.1 by:
- Optimizing its output for accuracy and context.
- Reducing latency to ensure real-time responses.
- Dynamically adjusting responses based on user input and retrieved chunks.
Generating Responses
- The model generates a precise response by combining:
- The retrieved chunks from Neo4j.
- The context of the user query.
- The final response is sent back to the frontend for display.
This robust backend ensures that every query receives an accurate, contextually relevant, and prompt answer.
Challenges we ran into
How could we have made it better?
1. User Interface: We could have spent more time refining the UI, making it more interactive and user-friendly.
2. API Integration for Text and Code Generation:
One of the main challenges we encountered was integrating various APIs for text and code generation. We initially planned to integrate the Together API into DevDocs, creating separate models for text generation and code generation: the text generator would produce welcome messages for chatbots, while the code generator would handle more complex code generation.
However, we ended up using Llama 3.1, which, while capable, did not particularly excel at generating complex code.
How could we have made it better? We could have integrated separate models for text generation and complex code generation by leveraging the Together API's specialized models for each task, ensuring that each task is handled by the model best suited for it.
API Integration with Together API
To improve the integration, we would have made use of the following configuration to combine the text-generator and code-generator models with the Together API:
```json
{
  "$schema": "https://schema.hypermode.com/modus.json",
  "endpoints": {
    "default": {
      "type": "graphql",
      "path": "/graphql",
      "auth": "bearer-token"
    }
  },
  "connections": {
    "neo4j": {
      "type": "neo4j",
      "dbUri": "bolt://34.238.50.12:7687",
      "username": "{{USERNAME}}",
      "password": "{{PASSWORD}}"
    },
    "together": {
      "type": "http",
      "baseUrl": "https://api.together.xyz/v1/",
      "headers": {
        "Authorization": "Bearer {{API_KEY}}",
        "Content-Type": "application/json"
      }
    }
  },
  "models": {
    "minilm": {
      "sourceModel": "sentence-transformers/all-MiniLM-L6-v2",
      "provider": "hugging-face",
      "connection": "hypermode"
    },
    "text-generator": {
      "sourceModel": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      "connection": "together",
      "path": "chat/completions"
    },
    "code-generator": {
      "sourceModel": "Qwen/QwQ-32B-Preview",
      "connection": "together",
      "path": "chat/completions"
    }
  }
}
```
Why DevDocs Stands Out
In the fast-paced world of software development, time is everything. As developers ourselves, we’ve often faced the frustration of diving into endless documentation pages, sifting through outdated syntax, and hunting for that one elusive code snippet or API detail. This common struggle inspired us to create DevDocs, a platform designed to give developers instant, accurate, and company-specific answers from documentation. We envisioned a tool that not only saves time but also enhances productivity and eliminates the headache of traditional documentation searches.
Built With
- hypermode
- modus
- neo4j
