Inspiration

Important decisions inside teams happen live during standups, screen shares, and technical discussions. However, most AI agents operate asynchronously. They read Slack, scan documents, and respond only after something has already been written down.

That creates a gap. The AI misses the moment where context is formed, tradeoffs are debated, and decisions are actually made.

We wanted to close that gap. The problem we’re addressing is simple: AI agents are not present in the live conversations where real work happens and immediate feedback is given.

What it does

Aaron is our AI employee. Until now, he could only react to written messages and updates. He had no awareness of live discussions.

Today, Aaron can join meetings, follow conversations in real time, understand decisions as they happen, and take action immediately. He can detect bugs during customer calls, fix them on the fly, and create a pull request automatically. Instead of waiting for someone to summarize outcomes, he processes context as it forms and contributes directly during the workflow.

He moves from being an async assistant to a real-time participant, which has the potential to significantly improve team productivity.

How we built it

We built new skills inside OpenClaw that allow Aaron to join a Google Meet automatically when provided with a meeting link. Using Playwright extensions, Aaron launches Chrome, opens the meeting URL, and joins the call just like a human participant.

For audio routing, we used BlackHole to capture and stream system audio in real time. This allows Aaron to receive live meeting audio and send audio output back when needed.

For real-time voice understanding and transcription, we integrated Modulate. Modulate processes the streamed audio and converts it into structured text that Aaron can reason over, giving him live context awareness instead of relying on post-meeting summaries.

We also used Gemini Live for real-time visual reasoning.

Finally, we used Flora to design Aaron’s visual identity so he can appear as a recognizable team member when joining meetings, rather than being an invisible background agent.

Challenges we ran into

  • Real-time streaming and model coordination

Handling live audio inside a meeting is complex. Audio has to be captured, streamed to transcription, processed by reasoning models, and then potentially sent back into Google Meet as a response. If different models are used for transcription and reasoning, coordinating them adds latency and system complexity. The full loop needs to happen fast enough to feel natural.
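To make the coordination cost concrete, the loop can be sketched as an asyncio pipeline where capture feeds transcription and transcription feeds reasoning through a queue, so a slow reasoning call never stalls audio intake. The stage functions here are stand-ins, not our actual model calls:

```python
import asyncio

async def pipeline(audio_chunks, transcribe, reason, respond):
    """Run transcription and reasoning concurrently, decoupled by a queue,
    so the slow stage never blocks the fast one."""
    text_q: asyncio.Queue = asyncio.Queue()

    async def transcriber():
        for chunk in audio_chunks:
            await text_q.put(await transcribe(chunk))
        await text_q.put(None)  # sentinel: the audio stream ended

    async def reasoner():
        replies = []
        while (text := await text_q.get()) is not None:
            decision = await reason(text)
            if decision is not None:  # only respond when there is something to say
                replies.append(await respond(decision))
        return replies

    # Both stages run concurrently; gather preserves argument order.
    _, replies = await asyncio.gather(transcriber(), reasoner())
    return replies
```

With a single multimodal model doing both transcription and reasoning, the queue hop disappears; with separate models, this buffering is what keeps the end-to-end loop feeling natural.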

  • Deciding when to act

Not every comment in a meeting should trigger a pull request or task. Aaron has to determine whether something mentioned is actually a bug, a real issue, or just discussion. Designing the logic for when to act versus when to observe was one of the hardest challenges.
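The act-versus-observe gate can start as simply as requiring two independent signals before Aaron does anything: a named problem plus explicit intent. The keyword lists below are illustrative stand-ins for whatever classifier actually runs; the point is the shape of the decision, not the vocabulary.

```python
from dataclasses import dataclass

# Hypothetical signal lists; a real gate would use a learned classifier.
ACTION_TRIGGERS = ("bug", "broken", "error", "crash", "doesn't work")
COMMITMENT_WORDS = ("we should", "can you", "let's fix", "please fix")

@dataclass
class Decision:
    act: bool
    reason: str

def should_act(utterance: str) -> Decision:
    """Act only when an utterance both names a concrete problem and signals
    intent; a problem mentioned in passing is observed, not acted on."""
    text = utterance.lower()
    has_problem = any(t in text for t in ACTION_TRIGGERS)
    has_intent = any(w in text for w in COMMITMENT_WORDS)
    if has_problem and has_intent:
        return Decision(True, "problem named with explicit intent")
    if has_problem:
        return Decision(False, "problem mentioned, waiting for confirmation")
    return Decision(False, "no actionable signal")
```

Requiring both signals biases the agent toward observing, which is the right default: a missed pull request is cheaper than a spurious one.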

  • Maintaining long-term context

To be truly useful, Aaron needs memory across meetings. He should understand what happened yesterday, last week, and how it connects to the roadmap. Each meeting should refine his behavior. Building toward continuous self-improvement, rather than a stateless meeting bot, is an ongoing challenge.
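As a starting point, cross-meeting memory does not need to be sophisticated: a timestamped note store that Aaron queries before each call already beats a stateless bot. A minimal sketch with SQLite (class name, table, and schema are our own placeholders):

```python
import sqlite3
import time

class MeetingMemory:
    """Minimal persistent memory: store notes per meeting, recall by keyword."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS notes (ts REAL, meeting TEXT, note TEXT)"
        )

    def remember(self, meeting: str, note: str) -> None:
        self.db.execute(
            "INSERT INTO notes VALUES (?, ?, ?)", (time.time(), meeting, note)
        )
        self.db.commit()

    def recall(self, keyword: str, limit: int = 5) -> list[str]:
        """Return the most recent notes mentioning keyword, newest first."""
        rows = self.db.execute(
            "SELECT note FROM notes WHERE note LIKE ? ORDER BY ts DESC LIMIT ?",
            (f"%{keyword}%", limit),
        )
        return [r[0] for r in rows]
```

Keyword recall is the naive baseline; swapping it for embedding-based retrieval is the obvious next step toward the continuous self-improvement described above.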

Accomplishments that we're proud of

In just a few hours, we enabled Aaron to join meetings, process conversations in real time, and generate actionable outputs during the call itself. Transitioning from an async assistant to a real-time participant significantly increased his usefulness and changed how we think about AI teammates.

What's next

Next, we want Aaron to develop stronger memory across meetings, smarter decision thresholds, and a deeper understanding of our codebase and workflows. Our long-term goal is to keep evolving Aaron into a reliable AI teammate that improves with every interaction and becomes a natural part of our daily operations.
