Lacia — An Autonomous SRE Agent for the Action Era

Inspiration

Every developer knows the pain: it’s 3 AM, your phone is screaming with a PagerDuty alert, and you have to drag yourself out of bed to fix a NullPointerException that could have been solved in five minutes.

We realized that while AI has become great at chatting about code, it hasn’t been trusted to fix it. Existing tools are passive — they just show you dashboards full of red lines.

We built Lacia to answer Google’s call for the Action Era.

We didn’t want another chatbot assistant.
We wanted a teammate who wakes up when the server crashes, fixes the bug, runs the tests, and has a Pull Request waiting for us by the time we wake up.


What It Does

Lacia is a self-hosted Autonomous SRE Agent that sits between your production logs and your repository.

  • Watches
    A lightweight Go watchdog tails your production logs in real time.

  • Detects
    When a crash occurs, it captures the stack trace and surrounding context.

  • Reasons
    It ingests your entire repository using Gemini 3’s 1M+ token context window to understand root causes across files.

  • Acts
    It spins up an isolated Docker sandbox to reproduce the bug.

  • Verifies
    It writes a fix and runs your test suite. If tests fail, it self-corrects and tries again.

  • Delivers
    Once tests pass, it opens a GitHub Pull Request with the fix.


How We Built It

Lacia is a monorepo built on a privacy-first architecture.

The Sentinel (Go)

A highly optimized, static Go binary (<10MB) that acts as the log watchdog.
It uses regex-based anomaly detection with near-zero resource overhead.
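
The Sentinel itself is a compiled Go binary, but the detection idea is simple enough to sketch. The TypeScript below is illustrative only (the log path, the crash patterns, and the incident endpoint are placeholders; the real watchdog does the equivalent with Go regexps):

    // Illustrative only: the real Sentinel is a static Go binary.
    // This mirrors its tail-and-match loop in TypeScript.
    import { spawn } from "node:child_process";
    import { createInterface } from "node:readline";

    const CRASH_PATTERNS = [
      /panic:/,                                // Go panics
      /Exception in thread/,                   // JVM stack traces
      /NullPointerException/,
      /Traceback \(most recent call last\)/,   // Python
    ];

    // Follow the log like `tail -F` and scan every new line.
    const tail = spawn("tail", ["-F", "/var/log/app/production.log"]);
    const lines = createInterface({ input: tail.stdout });

    lines.on("line", (line) => {
      if (CRASH_PATTERNS.some((re) => re.test(line))) {
        // Hand the incident to the control plane (endpoint is a placeholder).
        fetch("http://localhost:3000/api/incidents", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ line, detectedAt: new Date().toISOString() }),
        }).catch(console.error);
      }
    });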

The Brain (Next.js)

The control plane that manages the agent lifecycle.
We use sql.js for persistent incident tracking without requiring a complex database setup.
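
A minimal sketch of that pattern, assuming a hypothetical incidents table and database file name (sql.js keeps the database in memory, so the sketch flushes it back to disk after every write):

    // Sketch: persist incidents with sql.js, no database server required.
    import initSqlJs from "sql.js";
    import { existsSync, readFileSync, writeFileSync } from "node:fs";

    const DB_FILE = "lacia.sqlite";

    const SQL = await initSqlJs();
    const db = existsSync(DB_FILE)
      ? new SQL.Database(new Uint8Array(readFileSync(DB_FILE)))
      : new SQL.Database();

    db.run(`CREATE TABLE IF NOT EXISTS incidents (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      stack_trace TEXT,
      status TEXT,          -- detected | fixing | pr_open
      created_at TEXT
    )`);

    export function recordIncident(stackTrace: string): void {
      db.run(
        "INSERT INTO incidents (stack_trace, status, created_at) VALUES (?, ?, ?)",
        [stackTrace, "detected", new Date().toISOString()],
      );
      writeFileSync(DB_FILE, Buffer.from(db.export()));  // flush to disk
    }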

The Agent (Gemini 3 Pro)

The core intelligence of Lacia.

We abandoned traditional RAG in favor of Context Injection.
By feeding Gemini the full file tree and relevant source files directly, we achieved significantly higher debugging accuracy.
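
In practice that means serializing the repository into one prompt. A rough sketch of the idea (the buildRepoContext helper, the glob package, and the prompt wording are illustrative, not the actual implementation):

    // Sketch of Context Injection: one large prompt containing the file tree
    // plus the source files themselves, instead of RAG-retrieved chunks.
    import { readFileSync } from "node:fs";
    import { globSync } from "glob";

    export function buildRepoContext(repoRoot: string, stackTrace: string): string {
      const files = globSync("**/*.{go,ts,tsx,js,py}", {
        cwd: repoRoot,
        ignore: ["node_modules/**", ".git/**", "dist/**"],
      });

      const tree = files.map((f) => `- ${f}`).join("\n");
      const sources = files
        .map((f) => `--- ${f} ---\n${readFileSync(`${repoRoot}/${f}`, "utf8")}`)
        .join("\n\n");

      return [
        "You are an autonomous SRE agent. A crash occurred in production.",
        `Stack trace:\n${stackTrace}`,
        `Repository file tree:\n${tree}`,
        `Source files:\n${sources}`,
        "Find the root cause and propose a minimal fix.",
      ].join("\n\n");
    }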

The Sandbox (Docker)

For safety, Lacia never touches production systems.
It uses ephemeral Docker containers to clone the repo, apply patches, and run tests in isolation.
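
Conceptually, each verification run is a single throwaway docker run. The sketch below shows that shape; the image name, mount paths, and test command are assumptions and would differ per detected runtime:

    // Sketch of one ephemeral sandbox run: clone, patch, test, then the
    // container disappears. Nothing touches the host or production.
    import { execFileSync } from "node:child_process";

    export function runInSandbox(repoUrl: string, patchFile: string): boolean {
      const script = [
        `git clone --depth 1 ${repoUrl} /work`,
        "cd /work",
        `git apply /patch/${patchFile}`,
        "npm ci && npm test",
      ].join(" && ");

      try {
        execFileSync("docker", [
          "run", "--rm",                      // container is deleted on exit
          "-v", `${process.cwd()}/patches:/patch:ro`,
          "node:20",                          // image picked per detected runtime
          "bash", "-lc", script,
        ], { stdio: "inherit" });
        return true;                          // tests passed
      } catch {
        return false;                         // clone, patch, or tests failed
      }
    }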


Challenges We Ran Into

The “Infinite Loop” Problem

Early versions of the agent would repeatedly fail tests and retry the same fix.

We solved this with a Self-Correction History, where the agent sees its previous failed attempts in the context window, forcing it to try new strategies.
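
A rough sketch of that mechanism (the types and prompt wording are illustrative): every failed attempt is summarized and prepended to the next request, so repeating a patch becomes visibly wrong to the model.

    // Sketch of Self-Correction History: prior failures ride along in context.
    interface FailedAttempt {
      patchSummary: string;   // what the agent changed
      testOutput: string;     // why the tests rejected it
    }

    export function buildRetryPrompt(basePrompt: string, history: FailedAttempt[]): string {
      if (history.length === 0) return basePrompt;

      const failures = history
        .map((a, i) => `Attempt ${i + 1}:\n${a.patchSummary}\nResult:\n${a.testOutput}`)
        .join("\n\n");

      return [
        basePrompt,
        "The following fixes were already tried and FAILED. Do not repeat them.",
        failures,
        "Propose a different strategy this time.",
      ].join("\n\n");
    }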

Safety & Trust

Letting an AI run shell commands is scary.

We implemented a strict Docker-based sandboxing layer and dynamic provisioning of dependencies (Node, Python, Go) inside it, which became a serious DevOps challenge in its own right.
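
The provisioning side can be pictured as a small lookup from repository markers to a base image and install step; the entries below are an assumed, illustrative subset:

    // Sketch of dynamic provisioning: choose image + install command from the
    // files found in the cloned repo.
    const RUNTIMES = [
      { marker: "package.json",     image: "node:20",     install: "npm ci" },
      { marker: "requirements.txt", image: "python:3.12", install: "pip install -r requirements.txt" },
      { marker: "go.mod",           image: "golang:1.22", install: "go mod download" },
    ];

    export function detectRuntime(repoFiles: string[]) {
      return RUNTIMES.find((r) => repoFiles.includes(r.marker)) ?? RUNTIMES[0];
    }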

Prompt Engineering

Getting the model to strictly follow a Thinking → Action workflow required extensive system instruction tuning to prevent it from chatting instead of doing.
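
For context, a system instruction enforcing that contract looks roughly like the sketch below; the wording is illustrative, not Lacia's actual prompt:

    // Illustrative system instruction for the Thinking → Action contract.
    export const SYSTEM_INSTRUCTION = `
    You are Lacia, an autonomous SRE agent. You never chat with the user.
    Every turn MUST contain exactly two parts, in this order:

    1. THINKING: a short plan (at most 5 lines) for the next step and why.
    2. ACTION: exactly one tool call (read_file, write_file, exec_shell, git_commit).

    Never answer with prose alone. When the fix is verified, call git_commit and stop.
    `.trim();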


Accomplishments We’re Proud Of

  • It actually works
    We simulated a crash and watched the agent detect it, fix the bug, and merge the PR without a single human keystroke.

  • One-Command Demo
    A single command (go run . start) spins up a simulated environment so judges can test Lacia instantly.

  • Performance
    The Go watchdog consumes less than 1% CPU, making it production-ready.


Gemini 3 Integration

Lacia would not be possible without the capabilities of the Gemini 3 family.

1M+ Token Context Window

Traditional RAG fails at complex debugging because bugs often span multiple files.

Gemini 3 allows us to inject the entire relevant file structure directly into context, giving the agent global vision across frontend, backend, and shared dependencies.
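
If a repository ever approaches the limit, the natural fallback is to add files in priority order until an estimated budget is reached. The sketch below is an assumption about how that selection could work, not a confirmed part of Lacia, and the 4-characters-per-token estimate is a rough heuristic:

    // Sketch: fill the context window in relevance order under a token budget.
    const MAX_TOKENS = 900_000;   // leave headroom below the 1M+ window

    export function selectFiles(
      files: { path: string; content: string; relevance: number }[],
    ) {
      const picked: typeof files = [];
      let used = 0;

      for (const f of [...files].sort((a, b) => b.relevance - a.relevance)) {
        const cost = Math.ceil(f.content.length / 4);  // crude token estimate
        if (used + cost > MAX_TOKENS) continue;
        picked.push(f);
        used += cost;
      }
      return picked;
    }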

Multimodal Reasoning & Thought Signatures

We use Thought Chains to force the model to plan before acting.

These reasoning traces power a Live Reasoning Dashboard, showing users why the agent made each decision.
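
The dashboard side can be as simple as publishing every reasoning step as a typed event and streaming it to the UI. The event shape and transport below are assumptions for illustration:

    // Sketch: reasoning traces flow to the dashboard as typed events.
    import { EventEmitter } from "node:events";

    export type ReasoningEvent =
      | { kind: "thinking"; text: string }
      | { kind: "action"; tool: string; args: Record<string, unknown> }
      | { kind: "observation"; text: string };

    export const reasoningBus = new EventEmitter();

    export function publish(event: ReasoningEvent): void {
      reasoningBus.emit("reasoning", event);
    }

    // A server-sent-events route (or websocket) just subscribes:
    // reasoningBus.on("reasoning", (e) => sendToDashboard(e));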

Tool Use (Function Calling)

We built a custom Linux Operator toolset.

Gemini 3 doesn’t just generate text — it calls functions such as:

  • read_file
  • write_file
  • exec_shell
  • git_commit

This turns the model from a text generator into a true system operator.
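
On the agent side this boils down to a dispatcher that executes whatever function call the model returns and feeds the result back as the next observation. The handlers below are simplified sketches meant to run inside the sandbox; the { name, args } shape mirrors a Gemini function call, and error text is returned rather than thrown so the model can react to it:

    // Sketch of the tool layer: model emits { name, args }, the agent executes.
    import { readFileSync, writeFileSync } from "node:fs";
    import { execSync } from "node:child_process";

    type ToolCall = { name: string; args: Record<string, string> };

    const tools: Record<string, (args: Record<string, string>) => string> = {
      read_file:  ({ path }) => readFileSync(path, "utf8"),
      write_file: ({ path, content }) => { writeFileSync(path, content); return "ok"; },
      exec_shell: ({ command }) => execSync(command, { encoding: "utf8" }),
      git_commit: ({ message }) =>
        execSync(`git commit -am ${JSON.stringify(message)}`, { encoding: "utf8" }),
    };

    export function dispatch(call: ToolCall): string {
      const handler = tools[call.name];
      if (!handler) return `Unknown tool: ${call.name}`;
      try {
        return handler(call.args);
      } catch (err) {
        return `Tool failed: ${(err as Error).message}`;  // goes back to the model
      }
    }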


What We Learned

Context Is King, RAG Is Retro

We initially considered using vector databases to query the codebase. We learned that Gemini 3’s 1M+ token context window is far superior for debugging because bugs are often “spooky action at a distance” — a change in file A breaks file B.

RAG frequently misses this subtle, cross-file context. Full repository ingestion catches it every time.

Agents Need Guardrails, Not Just Prompts

Giving an AI access to a shell is terrifying.

We learned the hard way that a robust Docker sandbox isn’t just a feature — it’s a requirement. We spent significant time architecting a networking layer that allows the agent to dynamically install dependencies without ever exposing the host machine.

The “Loop” Is Harder Than the “Fix”

Writing code to fix a bug is relatively easy for Gemini.

Writing the logic to verify the fix, parse test output, and decide whether to retry, pivot, or give up was the real engineering challenge. Autonomous systems require state machines, not just simple LLM calls.
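
Concretely, that loop is easier to trust when it is written as an explicit state machine. The sketch below is illustrative: the state names, the attempt cap, and the injected dependencies are assumptions, but the retry-with-history shape is the point.

    // Sketch: the verification loop as a state machine, not a single LLM call.
    type State = "REPRODUCE" | "PROPOSE_FIX" | "RUN_TESTS" | "OPEN_PR" | "GIVE_UP";

    const MAX_ATTEMPTS = 3;

    export async function fixLoop(deps: {
      reproduce: () => Promise<boolean>;
      proposeFix: (failures: string[]) => Promise<string>;   // sees past failures
      runTests: () => Promise<{ passed: boolean; output: string }>;
      openPr: (patch: string) => Promise<void>;
    }): Promise<State> {
      const failures: string[] = [];
      let state: State = "REPRODUCE";
      let patch = "";

      while (true) {
        switch (state) {
          case "REPRODUCE":
            state = (await deps.reproduce()) ? "PROPOSE_FIX" : "GIVE_UP";
            break;
          case "PROPOSE_FIX":
            patch = await deps.proposeFix(failures);
            state = "RUN_TESTS";
            break;
          case "RUN_TESTS": {
            const result = await deps.runTests();
            if (result.passed) { state = "OPEN_PR"; break; }
            failures.push(result.output);
            state = failures.length >= MAX_ATTEMPTS ? "GIVE_UP" : "PROPOSE_FIX";
            break;
          }
          case "OPEN_PR":
            await deps.openPr(patch);
            return "OPEN_PR";
          case "GIVE_UP":
            return "GIVE_UP";
        }
      }
    }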

Go + TypeScript = Dream Team

Using Go for the high-performance, low-footprint watcher and TypeScript for complex agent orchestration gave us the best of both worlds:

  • Raw speed at the edge
  • Rapid iteration in the brain

What’s Next for Lacia

  • Multi-Modal Inputs
    Use Gemini 3’s vision capabilities to analyze UI screenshots and correlate them with backend logs.

  • Slack / Discord Integration
    Approve fixes directly from chat notifications.

  • SaaS Release
    Transition from a self-hosted Docker deployment to a managed cloud service for easier onboarding.

Built With

  • Go
  • Next.js (TypeScript)
  • Gemini 3 Pro
  • Docker
  • sql.js
  • GitHub