Inspiration (October 2024)
Alzheimer’s gradually erodes recognition and conversational continuity, creating emotional distance even when family members are deeply present. We wanted to preserve interaction, not just information, so we designed a system that recreates familiar conversational dynamics while remaining safe, interpretable, and human-in-the-loop.
What It Does
DEAR is a voice-first, multi-agent system that simulates familiar conversations between Alzheimer’s patients and their family members.
Family members upload short voice samples, which are used to generate personalized voice agents. When a call is initiated, DEAR orchestrates a live conversation where:
- A voice agent handles natural speech interaction
- A conversation agent manages dialogue state
- A monitor agent tracks confusion, repetition, and escalation signals
- A nurse agent intervenes when thresholds are crossed
- A RAG agent retrieves only contextually relevant memory fragments
The result is a conversation that feels familiar, adaptive, and emotionally grounded — without automating diagnosis or replacing clinicians.
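To make the monitor-to-nurse handoff concrete, here is a minimal sketch of how repetition and confusion signals could be counted against thresholds. The class name, the threshold values, and the marker phrases are all illustrative assumptions, not the project's actual logic; real signals would need clinical review.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Counts repetition and confusion signals over one call (hypothetical design)."""
    repetition_limit: int = 3   # assumed threshold, not taken from the project
    confusion_limit: int = 2    # assumed threshold, not taken from the project
    seen: dict = field(default_factory=dict)
    confusion_count: int = 0

    # Illustrative phrases only; unannotated, so this is a class constant, not a field.
    CONFUSION_MARKERS = ("who are you", "where am i", "i don't know you")

    def observe(self, utterance: str) -> str:
        """Return 'nurse' when a threshold is crossed, otherwise 'conversation'."""
        text = utterance.strip().lower()
        self.seen[text] = self.seen.get(text, 0) + 1
        if any(marker in text for marker in self.CONFUSION_MARKERS):
            self.confusion_count += 1
        if self.seen[text] >= self.repetition_limit:
            return "nurse"
        if self.confusion_count >= self.confusion_limit:
            return "nurse"
        return "conversation"
```

The return value names which agent should hold control next, so the escalation decision stays a single, inspectable step rather than something buried in a prompt.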
How We Built It (Technical)
System Architecture
At its core, DEAR functions like a doubly-linked conversational OS:
- Each interaction node links backward to prior context and forward to possible next states
- Memory, family identity, and conversational triggers form a knowledge graph traversed as a linked list
- Agents pass control explicitly, not implicitly, enabling safe escalation and rollback
This mirrors OS-level scheduling: no single agent “owns” the conversation.
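One way to picture the doubly-linked design is a chain of turn nodes where every handoff records its owner and the latest node can be dropped to roll back. This is a sketch under our own assumptions: `Turn`, `Conversation`, `hand_off`, and `rollback` are hypothetical names, not identifiers from the DEAR codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through prev/next links
class Turn:
    """One interaction node; `prev` and `next` make the chain doubly linked."""
    owner: str                      # agent that explicitly holds control
    text: str
    prev: Optional["Turn"] = None
    next: Optional["Turn"] = None


class Conversation:
    def __init__(self) -> None:
        self.head: Optional[Turn] = None
        self.tail: Optional[Turn] = None

    def hand_off(self, owner: str, text: str) -> Turn:
        """Append a node, recording which agent took control."""
        node = Turn(owner=owner, text=text, prev=self.tail)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        return node

    def rollback(self) -> Optional[Turn]:
        """Drop the latest node and return the node that regains control."""
        if self.tail is None:
            return None
        removed = self.tail
        self.tail = removed.prev
        if self.tail is None:
            self.head = None
        else:
            self.tail.next = None
        removed.prev = None
        return self.tail
```

Because each node names its owner, a rollback after an unnecessary escalation simply returns control to whichever agent owned the previous node.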
Frontend
- Next.js + TailwindCSS
- Voice recording UI streams audio directly to backend endpoints
- Successful uploads trigger agent-availability notifications and guide users into live calls
Backend
- Flask API layer
Handles:
- Audio ingestion (/handleFamilyAgentCreation)
- Voice embedding generation
- Agent provisioning
- Call lifecycle management (/initiateConversation, /endConversation)
Verified via 200-status responses and logged embeddings (see proof)
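The call lifecycle behind /initiateConversation and /endConversation can be sketched as a small state tracker that route handlers delegate to. Everything here apart from the two endpoint names is an assumption: the `CallRegistry` class, the in-memory storage, the field names, and the status codes for error cases are illustrative, and the block is framework-agnostic plain Python rather than the project's Flask code.

```python
import uuid


class CallRegistry:
    """In-memory call lifecycle tracker (hypothetical; real storage may differ)."""

    def __init__(self) -> None:
        self._calls: dict[str, str] = {}   # call_id -> "active" or "ended"

    def initiate(self, patient_id: str, agent_id: str) -> dict:
        """What a handler for /initiateConversation might do before placing the call."""
        if not patient_id or not agent_id:
            return {"status": 400, "error": "patient_id and agent_id are required"}
        call_id = uuid.uuid4().hex
        self._calls[call_id] = "active"
        return {"status": 200, "call_id": call_id}

    def end(self, call_id: str) -> dict:
        """What a handler for /endConversation might do; ending twice is rejected."""
        state = self._calls.get(call_id)
        if state is None:
            return {"status": 404, "error": "unknown call"}
        if state == "ended":
            return {"status": 409, "error": "call already ended"}
        self._calls[call_id] = "ended"
        return {"status": 200, "call_id": call_id}
```

Keeping the transitions in one place means a duplicate or late end request gets a clear error instead of silently mutating call state.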
Voice & Agents
- Cartesia: voice cloning + embeddings
- VAPI: outbound calls + agent orchestration
- Speech-to-Text → LLM → Text-to-Speech pipeline for reliability under hackathon constraints
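A cascaded Speech-to-Text → LLM → Text-to-Speech turn can be sketched as three swappable callables with a recovery point at the text boundary. The stage functions below are stand-in stubs, and the fallback sentence is our own placeholder; none of this calls the Cartesia or VAPI APIs.

```python
from typing import Callable

# Each stage is a plain callable so real providers can be swapped in later.
SpeechToText = Callable[[bytes], str]
Responder = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


def run_turn(audio: bytes, stt: SpeechToText, llm: Responder, tts: TextToSpeech,
             fallback: str = "I'm here with you.") -> bytes:
    """Run one turn; an empty transcript or an STT/LLM error yields the fallback reply."""
    try:
        transcript = stt(audio)
        reply = llm(transcript) if transcript.strip() else fallback
    except Exception:
        # The text boundary between stages is where a cascaded pipeline can recover.
        reply = fallback
    return tts(reply)


# Stub stages for illustration only (assumptions, not real provider clients).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")
```

The design choice this illustrates is that a text intermediate gives every stage an inspectable input and output, which a direct speech-to-speech model does not.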
Data Model (Proof-Linked)
The database schema forms the backbone of the linked system:
patients ↔ family ↔ memory
Memory nodes store:
- embeddings
- triggers
- timestamps
- confusion flags
This enables selective retrieval rather than global recall, which is crucial for safety and emotional coherence.
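Selective retrieval over such memory nodes might look like the sketch below: filter by trigger and confusion flag first, then rank what remains by embedding similarity. The field names mirror the list above, but the `MemoryNode` class, the `retrieve` function, and the `k` and `min_score` defaults are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryNode:
    text: str
    embedding: list[float]
    triggers: set[str]
    timestamp: float
    confusion_flag: bool = False   # set when this memory previously caused confusion


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(nodes: list[MemoryNode], query: list[float], trigger: str,
             k: int = 2, min_score: float = 0.5) -> list[MemoryNode]:
    """Return at most k unflagged memories that match the trigger and the query."""
    candidates = [
        (cosine(n.embedding, query), n)
        for n in nodes
        if trigger in n.triggers and not n.confusion_flag
    ]
    ranked = sorted((c for c in candidates if c[0] >= min_score),
                    key=lambda c: c[0], reverse=True)
    return [n for _, n in ranked[:k]]
```

Filtering before ranking means a flagged memory can never surface, however similar its embedding is to the query.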
Key Technical Insight
We intentionally modeled conversational memory as a linked structure, not a flat vector store.
This allows:
- Controlled traversal
- Bounded context windows
- Explicit escalation paths
- Interpretability of agent decisions
In other words: conversation as state, not vibes.
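Controlled traversal with a bounded context window can be sketched as a backward walk that stops at whichever limit is reached first. The `Node` class and the `max_nodes` and `max_chars` limits are hypothetical; a real system would more likely count tokens than characters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    text: str
    prev: Optional["Node"] = None


def bounded_context(current: Node, max_nodes: int = 3, max_chars: int = 200) -> list[str]:
    """Walk backward from `current`, stopping at whichever bound is hit first."""
    window: list[str] = []
    used = 0
    node: Optional[Node] = current
    while node is not None and len(window) < max_nodes:
        if used + len(node.text) > max_chars:
            break
        window.append(node.text)
        used += len(node.text)
        node = node.prev
    window.reverse()   # oldest first, as a prompt would expect
    return window
```

Because the walk is explicit, it is always possible to say exactly which prior turns an agent saw when it made a decision.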
Challenges
- Managing multi-agent coordination without race conditions
- Avoiding over-automation in a sensitive clinical context
- Balancing system ambition with a hackathon timeline
We addressed these by reducing the system to explicit links and handoffs: fewer hidden abstractions, more guarantees.
Accomplishments
- End-to-end working prototype: UI → backend → voice → live call
- Verified agent creation and embedding pipelines
- Designed a scalable, safety-aware conversational architecture
- Demonstrated real multi-agent orchestration under time pressure
What We Learned
- Multi-agent systems behave more like operating systems than chatbots
- Linked structures were easier for us to bound and audit than a monolithic context window, which matters in safety-critical domains
- Voice UX forces architectural honesty — latency and state errors are immediately visible
What’s Next
- Transition to real-time Speech-to-Speech
- Expand memory graph traversal logic
- Longitudinal emotional state tracking (with human oversight)
- Clinician-configurable thresholds and escalation policies
Tech Stack
- Next.js, TailwindCSS — frontend
- Flask — backend orchestration layer
- Cartesia — voice cloning & embeddings
- VAPI — agent creation & outbound calling
- LLMs + RAG — conversational reasoning & memory retrieval
Built With
- agent
- api
- backend
- cloning
- cartesia
- engineering
- flask
- frontend
- next.js
- prompt
- tailwindcss
- vapi
- voice