Laila

Inspiration

We all know how important it is to have time for meaningful work and to maintain a balanced life. But when we discovered that 40% of workers spend a quarter of their work week on manual, repetitive tasks—like logging emails, updating spreadsheets, and managing schedules—it really caught our attention.

These tasks, while small, add up over time, leading to increased burnout and reduced job satisfaction. We felt there had to be a better way—a way to automate these mundane processes so that people can focus on the work that truly matters to them.

This insight drove us to create a solution that not only enhances productivity but also helps individuals reclaim their time and maintain a healthier work-life balance.

What it does

Introducing LAILA, an intelligent digital assistant designed to make your workday easier. LAILA automates repetitive web tasks in real time, so you can focus on what truly matters and get more done with less effort.

By combining advanced action models, voice technology, and generative AI, LAILA takes your voice commands or input, interprets them, and transforms them into clear, actionable steps. These tasks are then carried out instantly and displayed on your screen, so you can track the progress as it happens—all without lifting a finger.
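One way to picture the "interpret, then transform into actionable steps" stage is as a small, validated plan format. The sketch below is illustrative only: the `Step` fields, the `ALLOWED_ACTIONS` vocabulary, and the JSON layout are assumptions made for this example, not LAILA's actual schema.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary; the write-up does not list LAILA's real one.
ALLOWED_ACTIONS = {"open", "click", "type"}


@dataclass(frozen=True)
class Step:
    action: str      # one of ALLOWED_ACTIONS
    target: str      # URL for "open", element selector otherwise
    value: str = ""  # text to enter, used only by "type"


def parse_plan(raw: str) -> list[Step]:
    """Turn a JSON array (e.g. produced by a language model) into validated steps."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("plan must be a JSON array of steps")
    steps = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"step {index}: expected an object")
        action = item.get("action")
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"step {index}: unsupported action {action!r}")
        target = item.get("target")
        if not isinstance(target, str) or not target:
            raise ValueError(f"step {index}: missing target")
        steps.append(Step(action, target, str(item.get("value", ""))))
    return steps


# Example plan for a command such as "search example.com for hello".
example = (
    '[{"action": "open", "target": "https://example.com"},'
    ' {"action": "type", "target": "#q", "value": "hello"}]'
)
plan = parse_plan(example)
```

Rejecting unknown actions before anything runs keeps a malformed model reply from reaching the browser.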

In essence, LAILA turns tedious tasks into simple, automated actions, helping you reclaim your time and work smarter, not harder.

How we built it

We developed a comprehensive system architecture diagram to illustrate the communication flow and interaction between various components of our solution.

Here's the tech stack:

Front End

Next.js and React: For building a responsive, dynamic user interface.
Tailwind CSS and Shadcn: For efficient and customizable styling.
TypeScript: Ensures type safety and scalability.

Back End

Twilio: Enables seamless calling capabilities.
Deepgram: Handles speech-to-text processing.
SingleStore: A real-time data management system for fast analytics.
Apache Kafka: Manages real-time event streaming.
Perplexity and Gemini: For interpreting tasks and generating instructions using AI.
Phoenix: Evaluates and observes LLM (Large Language Model) performance.
Selenium: Automates browser tasks for efficient web interactions.
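To show how the execution side of this stack might fit together, here is a minimal sketch of a step runner that drives a browser and emits a progress event after each step. Everything in it is hypothetical: the `Browser` interface stands in for a Selenium-backed class, and the `publish` callback stands in for a Kafka producer. How the real system wires these pieces is not described in this write-up.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Step:
    action: str
    target: str
    value: str = ""


class Browser(Protocol):
    """Minimal browser surface; a Selenium-backed class could implement it."""

    def open(self, url: str) -> None: ...
    def click(self, selector: str) -> None: ...
    def type(self, selector: str, text: str) -> None: ...


class RecordingBrowser:
    """Stand-in that records calls instead of driving a real browser."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    def open(self, url: str) -> None:
        self.calls.append(("open", url))

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))

    def type(self, selector: str, text: str) -> None:
        self.calls.append(("type", selector, text))


def run_steps(steps: list[Step], browser: Browser,
              publish: Callable[[dict], None]) -> None:
    """Execute steps in order, publishing a progress event after each one."""
    for index, step in enumerate(steps):
        if step.action == "open":
            browser.open(step.target)
        elif step.action == "click":
            browser.click(step.target)
        elif step.action == "type":
            browser.type(step.target, step.value)
        else:
            publish({"step": index, "action": step.action, "status": "failed"})
            raise ValueError(f"unsupported action {step.action!r}")
        publish({"step": index, "action": step.action, "status": "done"})


# Usage: collect events in a list; a real setup might send them to a topic
# that the front end reads to display progress.
events: list[dict] = []
browser = RecordingBrowser()
run_steps(
    [Step("open", "https://example.com"), Step("click", "#login")],
    browser,
    events.append,
)
```

Keeping the browser behind a small interface lets the runner be exercised without launching a real browser session.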

Challenges we ran into

Integrating multiple APIs into one cohesive system
Getting Gemini to generate clear, accurate, and actionable instructions
Connecting Kafka and SingleStore
Ensuring real-time performance
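A common way to handle unreliable model output, such as the instruction-generation challenge above, is to validate each reply and re-prompt with the rejection reason. The sketch below illustrates that general pattern and is not a record of what we shipped: `generate` is a hypothetical callable wrapping whatever model client is in use, and the prompt wording is an assumption.

```python
import json
from typing import Callable


def plan_with_retries(generate: Callable[[str], str], command: str,
                      max_attempts: int = 3) -> list:
    """Ask the model for a JSON step list, retrying when the reply is unusable."""
    prompt = "Return only a JSON array of steps for this task: " + command
    last_error = ""
    for _ in range(max_attempts):
        # On a retry, tell the model why its previous reply was rejected.
        full_prompt = prompt
        if last_error:
            full_prompt += f"\nYour previous reply was rejected: {last_error}"
        reply = generate(full_prompt)
        try:
            steps = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON ({exc.msg})"
            continue
        if isinstance(steps, list) and steps:
            return steps
        last_error = "expected a non-empty JSON array"
    raise RuntimeError(
        f"no usable plan after {max_attempts} attempts: {last_error}"
    )


# Fake model for demonstration: chatty prose first, valid JSON second.
replies = iter([
    "Sure! Here are the steps...",
    '[{"action": "open", "target": "https://example.com"}]',
])
prompts_seen: list[str] = []


def fake_model(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


steps = plan_with_retries(fake_model, "open the example site")
```

Capping the number of attempts bounds the latency a retry loop can add, which matters when the rest of the pipeline is meant to feel real-time.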

Accomplishments that we're proud of

Developed a working MVP that uses an advanced action model to automate web tasks.
Integrated a diverse range of technologies—like AI, voice recognition, and real-time data handling—into a cohesive, functioning system.

What we learned

How critical it is to thoroughly read and understand API documentation.
The potential of action models for automating more complex web flows and pipelines.
Working with a diverse tech stack and multiple APIs taught us the importance of teamwork and adaptability.