About the project

Inspiration

Aphasia affects millions of stroke survivors and people with neurodegenerative conditions. In many cases, the person’s intelligence and intent are fully there, but the words come out as fragments like “Water… want” or “Drive… store.” That gap between what someone means and what they can produce creates constant friction. Caregivers end up guessing, conversations slow down, and people can feel isolated.

We wanted to build something that makes communication feel fast and natural, and that looks and feels like modern tech rather than a stigmatizing medical device.

What Somatica does

Somatica is an AI-powered communication assistant for people with aphasia.

  • Listen: The user presses a large yellow button and speaks a few keywords
  • Transcribe: Speech is converted to text via Whisper
  • Interpret: A language model expands the fragment into intended meaning using context
  • Suggest: Somatica shows at most three clear sentence options, optimized for accessibility
  • Speak: The user presses a corresponding arcade button to speak the chosen sentence aloud using neural TTS
  • Support caregivers: A companion dashboard manages profiles and settings and exports research logs
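
To make the flow concrete, here is a minimal sketch of that loop in Python. It leans on the open-source Whisper package, the Anthropic SDK, and OpenAI text-to-speech; the helper names and model IDs are placeholders rather than Somatica's actual code.

```python
# Minimal sketch of the Listen -> Transcribe -> Interpret -> Suggest -> Speak loop.
# Assumes the open-source `openai-whisper` package, the `anthropic` SDK, and the
# `openai` SDK (for TTS) are installed and API keys are configured. Helper names
# and the model ID are placeholders, not Somatica's actual code.
import whisper
import anthropic
from openai import OpenAI

stt_model = whisper.load_model("base")   # Transcribe: speech -> text
claude = anthropic.Anthropic()           # Interpret: fragment -> intended meaning
tts = OpenAI()                           # Speak: chosen sentence -> audio


def suggest_sentences(fragment: str, max_options: int = 3) -> list[str]:
    """Expand a fragmented utterance into at most three natural sentences."""
    reply = claude.messages.create(
        model="claude-3-5-sonnet-latest",   # placeholder model name
        max_tokens=300,
        system=(
            "You expand fragmented speech from a person with aphasia into short, "
            f"natural first-person sentences. Return at most {max_options} options, "
            "one per line, grounded only in what the fragment says."
        ),
        messages=[{"role": "user", "content": fragment}],
    )
    lines = [line.strip() for line in reply.content[0].text.splitlines() if line.strip()]
    return lines[:max_options]


def speak(sentence: str, out_path: str = "reply.mp3") -> None:
    """Synthesize the chosen sentence with neural TTS and save it for playback."""
    audio = tts.audio.speech.create(model="tts-1", voice="alloy", input=sentence)
    with open(out_path, "wb") as f:
        f.write(audio.read())


# Listen (yellow button) -> Transcribe -> Interpret -> Suggest -> Speak
fragment = stt_model.transcribe("recording.wav")["text"]   # e.g. "Water... want"
options = suggest_sentences(fragment)
for number, option in enumerate(options, start=1):
    print(number, option)
speak(options[0])   # in Somatica, the red/blue/gray buttons make this choice
```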

Example

  • Input: “Water… want”
  • Suggestions:
    1. “I want some water.”
    2. “Can you bring me a glass of water, please?”
    3. “I am thirsty, can I get water?”

How we built it

We built Somatica as a combined hardware and software system.

Frontend and UI

  • Next.js 16 for the main app and caregiver companion dashboard
  • Tailwind CSS for an aphasia-optimized UI: high contrast, large tap targets, visual cues, and a strict limit of three options

Backend and AI pipeline

  • OpenAI Whisper for low-friction speech-to-text
  • Anthropic Claude as the intent engine that expands fragmented speech into natural sentences
  • ChromaDB as a vector store for retrieval-augmented generation (RAG), so Somatica can pull in relevant profile details and past conversation context
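
As an illustration of the memory layer, the sketch below shows how the caregiver dashboard could write facts into ChromaDB and how a query pulls the most relevant ones before the fragment ever reaches Claude. Collection names and the example facts are illustrative.

```python
# Sketch of the RAG memory layer: ChromaDB holds profile facts and conversation
# snippets, and a query pulls the few most relevant ones to prepend to the
# intent-expansion prompt. Collection names and example facts are illustrative.
import chromadb

chroma = chromadb.PersistentClient(path="somatica_memory")
memory = chroma.get_or_create_collection(name="profile_memory")

# The caregiver dashboard would write entries like these:
memory.upsert(
    ids=["fact-1", "fact-2", "log-1"],
    documents=[
        "The user's wife is named Maria.",
        "The user prefers water with ice.",
        "Yesterday the user asked to call Maria twice.",
    ],
    metadatas=[{"kind": "profile"}, {"kind": "profile"}, {"kind": "log"}],
)


def context_for(fragment: str, n_results: int = 3) -> str:
    """Return the most relevant memories as a text block for the LLM prompt."""
    hits = memory.query(query_texts=[fragment], n_results=n_results)
    return "\n".join(hits["documents"][0])


# "wife... call" surfaces the spouse's name before Claude ever sees the fragment.
print(context_for("wife... call"))
```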

Hardware

  • Raspberry Pi running a lightweight Python controller
  • Arcade buttons for tactile, high-contrast access
    • The yellow button starts recording
    • The red, blue, and gray buttons select options
  • Microphone and speaker integration for a dedicated communication box experience
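
The Pi-side controller boils down to mapping button presses to actions. A bare-bones sketch with gpiozero is below; the pin numbers are arbitrary, and the callbacks just print where the real controller records audio and calls the web backend.

```python
# Bare-bones sketch of the Pi-side button controller using gpiozero.
# Pin numbers are arbitrary examples, and the callbacks just print; the real
# controller records from the microphone and talks to the web backend.
from functools import partial
from signal import pause

from gpiozero import Button

YELLOW, RED, BLUE, GRAY = 17, 27, 22, 23   # BCM pin numbers (example wiring)


def start_recording():
    # Real controller: capture microphone audio and hand the WAV to the backend.
    print("yellow pressed: recording...")


def choose(option_index: int):
    # Real controller: ask the backend to speak suggestion #option_index via TTS.
    print(f"speak suggestion #{option_index}")


record_button = Button(YELLOW)
record_button.when_pressed = start_recording

option_buttons = [Button(pin) for pin in (RED, BLUE, GRAY)]
for index, button in enumerate(option_buttons, start=1):
    button.when_pressed = partial(choose, index)

pause()   # keep the process alive and wait for button presses
```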

Challenges we ran into

  • Audio latency: Time to speak is everything. If output takes too long, it stops feeling like conversation, so we streamlined the pipeline from transcription to generation to TTS.
  • Raspberry Pi audio: Getting reliable microphone and speaker routing on embedded Linux while also running a web server took real systems debugging.
  • Hallucinations: Early versions generated sentences that were fluent but not grounded in what the user said. We reduced this by tightening prompts, retrieving only relevant profile memory, and enforcing strict output constraints (sketched after this list).
  • UX constraints: Aphasia-friendly design is not just big buttons. It is about minimizing cognitive load, avoiding clutter, and keeping the set of choices small while still giving the user control.
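
For the hallucination point, the output constraint amounts to asking for a machine-checkable format and refusing anything that breaks it. A sketch of the idea, with prompt wording and fallback behavior that are illustrative rather than our exact code:

```python
# Sketch of the strict-output-constraint idea: ask for a JSON array of at most
# three sentences, then validate before showing anything to the user. Prompt
# wording and the fallback behavior are illustrative, not Somatica's exact code.
import json

import anthropic

claude = anthropic.Anthropic()


def constrained_suggestions(fragment: str) -> list[str]:
    reply = claude.messages.create(
        model="claude-3-5-sonnet-latest",   # placeholder model name
        max_tokens=200,
        system=(
            "Return ONLY a JSON array of 1 to 3 short sentences expressing what "
            "the user likely means. No commentary, no extra keys."
        ),
        messages=[{"role": "user", "content": fragment}],
    )
    raw = reply.content[0].text.strip()
    try:
        options = json.loads(raw)
    except json.JSONDecodeError:
        return [fragment]                   # fall back to echoing the fragment
    options = [o.strip() for o in options if isinstance(o, str) and o.strip()]
    return options[:3] or [fragment]
```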

Accomplishments that we're proud of

  • A tactile interface that feels empowering and fun, not clinical
  • A complete end-to-end loop: press a button, speak keywords, pick an option, hear it spoken aloud
  • Context awareness that improves disambiguation, for example mapping “wife” to the spouse’s name stored in the profile
  • A caregiver companion dashboard that turns configuration and research export into a simple workflow
  • An analyst-style summary of conversation logs that can surface patterns like recurring pain mentions or higher frustration at certain times

What we learned

  • Context is king: Generic language models struggle with AAC (augmentative and alternative communication). Personalization plus memory makes the same model dramatically more useful.
  • Constraints beat cleverness: Limiting to three grounded options reduces confusion and lowers hallucination risk.
  • Accessibility is product engineering: Latency, contrast, layout, and tactile input matter as much as model quality.
  • Integration is the hard part: Real world assistive tech lives or dies on reliability, not demos.

What's next for Somatica

  • Visual context: Add a camera so the system can incorporate what the user is pointing at, like a cup or TV remote
  • Voice banking: Let users speak with their own pre-injury voice through personalized TTS
  • Clinical pilot: Work with speech-language pathologists to validate usability, safety, and caregiver dashboard usefulness
  • Resilience and privacy: Better offline tolerance, clearer data controls, and configurable retention for logs

Built With

  • ai
  • anthropic-claude
  • chromadb
  • chromadb-for-rag-memory
  • claude
  • css
  • hardware
  • llm
  • next.js
  • openai-text-to-speech
  • openai-whisper
  • pwa
  • python-for-the-pi-button-controller
  • rag
  • raspberry-pi
  • raspberry-pi-hardware
  • tailwind
  • tailwind-css
  • tts
  • whisper