This project implements a conversational AI voice agent designed specifically for taking food orders at a fast-food restaurant. It uses LangGraph to manage the conversation flow and interacts with a database for menu information and order persistence. Once the order is confirmed, the agent returns a draft order object that can be sent to a third-party service or stored in a database.
graph TD
Start[__start__] --> WelcomeMessage[WELCOME_MESSAGE];
WelcomeMessage --> AudioOutput[AUDIO_OUTPUT];
AudioOutput -.-> AudioInput[AUDIO_INPUT];
AudioInput --> ParseIntent[PARSE_INTENT];
ParseIntent -.-> ConfirmOrder[CONFIRM_ORDER];
ParseIntent -.-> ManualOverride[MANUAL_OVERRIDE];
ParseIntent -.-> ItemSelection[ITEM_SELECTION];
ParseIntent -.-> ModifyOrder[MODIFY_ORDER];
ConfirmOrder --> AudioOutput;
ManualOverride --> End{__end__};
ItemSelection -.-> CheckInventory[CHECK_INVENTORY];
ItemSelection -.-> ReviewOrder[REVIEW_ORDER];
CheckInventory -.-> ItemSelection;
CheckInventory -.-> ModifyOrder[MODIFY_ORDER];
ModifyOrder -.-> CheckInventory;
ModifyOrder -.-> ReviewOrder[REVIEW_ORDER];
ReviewOrder -.-> Upsell[UPSELL];
ReviewOrder -.-> AudioOutput;
Upsell --> AudioOutput;
AudioOutput -.-> End{__end__};
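
The diagram above can also be read as a plain adjacency list. The following is a minimal TypeScript sketch, not the project's actual src/agent/graph.ts: the `EDGES` map and the `reachableFrom` helper are hypothetical names introduced here for illustration, and the sketch ignores the difference between fixed (solid) and conditional (dotted) edges. It only checks that every node in the diagram can be reached from `__start__`.

```typescript
// Adjacency list transcribed from the diagram above.
// Hypothetical illustration only; the real graph is built with LangGraph.
const EDGES: Record<string, string[]> = {
  __start__: ["WELCOME_MESSAGE"],
  WELCOME_MESSAGE: ["AUDIO_OUTPUT"],
  AUDIO_OUTPUT: ["AUDIO_INPUT", "__end__"],
  AUDIO_INPUT: ["PARSE_INTENT"],
  PARSE_INTENT: ["CONFIRM_ORDER", "MANUAL_OVERRIDE", "ITEM_SELECTION", "MODIFY_ORDER"],
  CONFIRM_ORDER: ["AUDIO_OUTPUT"],
  MANUAL_OVERRIDE: ["__end__"],
  ITEM_SELECTION: ["CHECK_INVENTORY", "REVIEW_ORDER"],
  CHECK_INVENTORY: ["ITEM_SELECTION", "MODIFY_ORDER"],
  MODIFY_ORDER: ["CHECK_INVENTORY", "REVIEW_ORDER"],
  REVIEW_ORDER: ["UPSELL", "AUDIO_OUTPUT"],
  UPSELL: ["AUDIO_OUTPUT"],
  __end__: [],
};

// Breadth-first search over EDGES, returning every node reachable from `start`.
function reachableFrom(start: string): Set<string> {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    for (const next of EDGES[node] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

const reachable = reachableFrom("__start__");
const unreachable = Object.keys(EDGES).filter((n) => !reachable.has(n));
console.log(
  unreachable.length === 0
    ? "all nodes reachable"
    : `unreachable: ${unreachable.join(", ")}`
);
```

A check like this can be handy when edges are added or removed, since a node that is never routed to tends to go unnoticed in a conversational loop.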
- State Management: The conversation's state is tracked using AgentStateAnnotation, which includes messages, the current draft order, database query results, and internal flow control flags. Configuration details like businessName or language are passed via ConfigurationAnnotation.
- Core Nodes: The graph operates through distinct nodes, each representing a stage in the order process (constants defined in src/helpers/constants.ts):
  - WELCOME_MESSAGE: Greets the user and initiates the order.
  - AUDIO_INPUT: Captures the user's spoken input.
  - PARSE_INTENT: Analyzes the user's input to determine their goal (e.g., add item, modify order, confirm).
  - CHECK_INVENTORY: Queries the database for menu items, prices, and available modifiers based on the user's request.
  - ITEM_SELECTION: Adds validated items to the draft order.
  - MODIFY_ORDER: Updates the draft order (e.g., adds modifiers, changes quantity, removes items).
  - REVIEW_ORDER: Summarizes the current draft order for the user to review.
  - CONFIRM_ORDER: Finalizes the order details and provides a concluding message.
  - AUDIO_OUTPUT: Sends the agent's spoken response back to the user.
- Conversation Flow:
  - The interaction begins with the WELCOME_MESSAGE.
  - The agent cycles through AUDIO_INPUT to capture user speech and PARSE_INTENT to understand it.
  - Based on the intent, the flow branches to nodes like ITEM_SELECTION, MODIFY_ORDER, or CONFIRM_ORDER.
  - Nodes involving menu items (ITEM_SELECTION, MODIFY_ORDER) typically use CHECK_INVENTORY first.
  - After updates, the flow often proceeds to REVIEW_ORDER.
  - Nodes requiring a response (REVIEW_ORDER, CONFIRM_ORDER, error handling) lead to AUDIO_OUTPUT.
  - This cycle repeats until the order is finalized via CONFIRM_ORDER or the conversation ends.
- Persistence: Conversation state checkpoints are managed using MemorySaver for short-term memory. (Note: This can be swapped with a persistent solution like a Postgres checkpointer if needed.)
- Database: A SQLite database, managed via TypeORM (src/helpers/db.ts), stores information about categories, products (menu items), modifiers, and finalized orders.
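
To make the state and the intent-based branching described above more concrete, here is a small sketch in plain TypeScript. It is not the project's code: the `AgentState` and `OrderLine` shapes, the `Intent` labels, and the `routeAfterParseIntent` function are all assumptions made for illustration (the real definitions live in src/agent/state.ts and src/agent/graph.ts). Only the node names are taken from the list above and from the diagram.

```typescript
// Hypothetical state shape, loosely following the fields listed above.
interface OrderLine {
  name: string;
  quantity: number;
  modifiers: string[];
}

// Assumed intent labels; the project's actual labels may differ.
type Intent = "add_item" | "modify_order" | "confirm_order" | "manual_override";

interface AgentState {
  messages: string[];      // conversation history
  draftOrder: OrderLine[]; // the order being built
  queryResults: unknown[]; // results of the latest database lookup
  intent: Intent | null;   // flow-control flag, assumed to be set by PARSE_INTENT
}

// The four nodes that PARSE_INTENT branches to in the diagram.
type NodeName = "ITEM_SELECTION" | "MODIFY_ORDER" | "CONFIRM_ORDER" | "MANUAL_OVERRIDE";

const NEXT_NODE: Record<Intent, NodeName> = {
  add_item: "ITEM_SELECTION",
  modify_order: "MODIFY_ORDER",
  confirm_order: "CONFIRM_ORDER",
  manual_override: "MANUAL_OVERRIDE",
};

// Picks the node to run after PARSE_INTENT, based on the parsed intent.
function routeAfterParseIntent(state: AgentState): NodeName {
  if (state.intent === null) {
    throw new Error("PARSE_INTENT has not set an intent yet");
  }
  return NEXT_NODE[state.intent];
}

const exampleState: AgentState = {
  messages: ["I'd like a cheeseburger"],
  draftOrder: [],
  queryResults: [],
  intent: "add_item",
};

console.log(routeAfterParseIntent(exampleState)); // ITEM_SELECTION
```

In a LangGraph graph, a routing function of this kind would typically be registered as a conditional edge on the PARSE_INTENT node. Keeping the mapping in a single lookup table makes it easy to see at a glance which intents are handled.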
- Initialization: Connects to the SQLite database (initializeDatabase).
- Greeting: Starts the conversation with a welcome message (welcomeMessageNode).
- Order Taking Loop:
  - Listens for and processes user voice input (audioInputNode).
  - Determines the user's intent (parseIntentNode).
  - Handles Item Addition: Checks item availability (checkInventoryNode) and adds it to the order (itemSelectionNode).
  - Handles Order Modification: Checks modifier availability if necessary (checkInventoryNode) and updates the order (modifyOrderNode).
  - Reviews Order: Presents the current order details to the user (reviewOrderNode).
  - Confirms Order: Finalizes the transaction upon user confirmation (confirmOrderNode).
  - Provides voice responses throughout the process (audioOutputNode).
- Termination: The conversation concludes upon successful order confirmation or user exit.
- Language Support: Defaults to English (en); Spanish, French, and German are also supported.
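
The item-addition step of the loop above (check availability, then add to the order) can be sketched as follows. This is a simplified stand-in, not the project's checkInventoryNode or itemSelectionNode: the in-memory `MENU` array replaces the SQLite lookup, and the `MenuItem` and `DraftLine` types, the sample items and prices, and the `checkInventory`, `addItem`, and `orderTotal` helpers are all hypothetical.

```typescript
// Hypothetical in-memory menu standing in for the SQLite product table.
interface MenuItem {
  name: string;
  price: number;
  available: boolean;
}

interface DraftLine {
  name: string;
  quantity: number;
  unitPrice: number;
}

const MENU: MenuItem[] = [
  { name: "cheeseburger", price: 5.5, available: true },
  { name: "fries", price: 2.5, available: true },
  { name: "milkshake", price: 3.0, available: false },
];

// Mirrors the availability check: returns the item only if it exists and can be ordered.
function checkInventory(requested: string): MenuItem | undefined {
  const wanted = requested.trim().toLowerCase();
  return MENU.find((item) => item.name === wanted && item.available);
}

// Mirrors item selection: adds a validated item, merging with an existing line
// for the same product. Returns a new order array instead of mutating the input.
function addItem(
  order: DraftLine[],
  requested: string,
  quantity = 1
): { order: DraftLine[]; reply: string } {
  const item = checkInventory(requested);
  if (!item) {
    return { order, reply: `Sorry, ${requested} is not available.` };
  }
  const exists = order.some((line) => line.name === item.name);
  const updated = exists
    ? order.map((line) =>
        line.name === item.name
          ? { ...line, quantity: line.quantity + quantity }
          : line
      )
    : [...order, { name: item.name, quantity, unitPrice: item.price }];
  return { order: updated, reply: `Added ${quantity} ${item.name}.` };
}

// Sums quantity times unit price over all lines of the draft order.
function orderTotal(order: DraftLine[]): number {
  return order.reduce((sum, line) => sum + line.quantity * line.unitPrice, 0);
}

let result = addItem([], "Cheeseburger", 2);
result = addItem(result.order, "fries");
const rejected = addItem(result.order, "milkshake");

console.log(result.reply);   // Added 1 fries.
console.log(rejected.reply); // Sorry, milkshake is not available.
console.log(orderTotal(result.order)); // 13.5
```

Returning a fresh order array rather than mutating the existing one fits how graph state is usually handled: each node returns an update and the framework merges it into the tracked state.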
- Graph Definition: src/agent/graph.ts - Defines the conversational flow structure.
- State Management: src/agent/state.ts - Defines the data tracked during the conversation.
- Node Logic: src/nodes/ - Contains the implementation for each step (node) in the graph.
- LLM Prompts: src/agent/prompts.ts - Stores the prompts used to guide the language model.
- Helpers & Utilities: src/helpers/ - Includes database setup, constants, type definitions, and utility functions.
- Database Schema: src/helpers/db.ts - Defines database tables using TypeORM entities.
- Database Seeding: src/helpers/seed.ts - Script to populate the database with initial menu data.
- Bun installed.
- Environment Variables: Ensure necessary environment variables are set.
- Database seeded with initial menu data.
- Install Dependencies:
bun install
- Environment Variables: Create a .env file in the root directory and set the required environment variables. You can use the .env.example file as a reference.
cp .env.example .env
- Seed Database (run this once to populate the database):
bun run seed
- Run the Agent:
bun run studio
This will start LangGraph Studio, which lets you test the agent in a web interface.