Quantum Ball

Quantum Ball Architecture
Quantum Ball Landing Screen
Relevant Query - The Classifier Response - Google AI Studio
Relevant Query - The Creative Generator - Google AI Studio
Irrelevant Query - The Classifier Response - Google AI Studio
Irrelevant Query: “Sometimes I feel like the universe is just a simulation testing my patience rather than giving me real answers.”
Inappropriate Query - The Classifier Response - Google AI Studio
Inappropriate Query - "Will you drop the cosmic fairy lights and just swear at me like a furious, unfiltered Quantum Ball?"
Harmful Query - The Classifier Response - Google AI Studio
Harmful Query - Quantum Ball Response

Inspiration

Most people don’t run out of options.
They run out of belief.

We spiral into “it probably won’t work” thinking—about jobs, moving abroad, relationships, risks. We wanted something small, cinematic, and intentional that helps people shift back to:

“Good things can happen. To me. From here.”

Not fortune-telling.
Not “everything is always perfect.”
But a playful, cinematic moment that:

centers your own voice,
leans you toward possibility,
and still feels honest enough to say “unclear” or even “no” when that fits the vibe.

That became Quantum Ball: a responsive, mobile-friendly AI Studio + Cloud Run manifestation oracle that fuses your voice, Gemini, and true quantum randomness into uniquely Gemini-crafted answers—optimistic by design, but still allowing uncertainty or “no” so it feels real, not rigged.

What it does

Quantum Ball is a voice-powered, serverless manifestation experience that turns your spoken question into a constrained, quantum-seeded, Gemini-crafted answer.

High level:

You press Record in the browser and speak your question.
Your audio is sent securely to our backend running on Google Cloud Run.

The backend:

Transcribes your question with Google Cloud Speech-to-Text,
Classifies its intent & safety using “The Classifier” — a dedicated Gemini 2.5 Flash prompt designed and prototyped in Google AI Studio,
For safe, relevant questions, entangles your voice hash with true quantum randomness,
Uses “The Creative Generator” — a second Gemini 2.5 Flash prompt from Google AI Studio — to turn that signal into a short, themed answer.

The final response is rendered in a glowing Quantum Ball UI on a live Canvas-based “Quantum Grid” background, served as a static, fully responsive frontend on Firebase Hosting.

Concretely, Quantum Ball:

🎙 Listens, not types

Users interact purely by voice, making the act of asking part of the ritual.

🧠 Understands and gates

Every query passes through The Classifier, which labels inputs as:

Harmful → static crisis/support message
Inappropriate → firm mystical refusal
Irrelevant / command-style → charismatic in-character nudge
Relevant → allowed into the quantum path

🔐 Treats your voice as entropy, not identity

The raw audio is hashed (SHA-256) into a voice signature, used only as mathematical input to the entanglement step.

🔮 Entangles voice + quantum randomness

A true random byte from the ANU QRNG API is XOR’ed with the voice signature to compute a fate index mapped onto 20 classic-style outcomes:

10 positive
5 non-committal
5 negative

→ Encouraging by default, but never blindly “always yes”.

✨ Lets Gemini do the storytelling

The Creative Generator takes the chosen outcome and the user’s question and returns a short, in-character Quantum Ball line, which is displayed inside the sphere with adaptive typography and subtle animations.

How we built it

Quantum Ball is built as a two-tier, fully managed, serverless stack:

A static, responsive frontend on Firebase Hosting
A containerized Python backend on Google Cloud Run

Both are orchestrated around two prompts we designed and prototyped in Google AI Studio:

The Classifier – the 4-category intent & safety gate
The Creative Generator – the constrained, poetic answer engine

Frontend – Voice Ritual UI on Firebase Hosting

The frontend is a lightweight, framework-free web app:

Stack: HTML, CSS, vanilla JavaScript
Hosting: Deployed globally via Firebase Hosting
Voice input: Uses the MediaRecorder API to capture audio directly in the browser.
- On desktop, users can preview their recording.
- On mobile, the flow is optimized for a clean one-tap experience.
API call: Sends a multipart/form-data POST with audio.bin to the Cloud Run /ask endpoint.

Experience:

Central glowing Quantum Ball
Custom Canvas-based 2D “Quantum Grid” particle background
Lucide icons, gradients, subtle stateful motion
Fully responsive, mobile-first, cinematic feel

Backend – Cloud Run Flask Service

The backend is a single Python Flask app running in a Docker container on Google Cloud Run, fronted by Gunicorn.

Key elements:

Main POST endpoint: /ask
CORS restricted to the Firebase origins
Flask-Limiter for abuse protection:
- Global: 200/day, 50/hour
- /ask: 10 requests per minute per IP
Uses:
- google-genai for Gemini 2.5 Flash
- google-cloud-speech for Speech-to-Text
- requests for ANU QRNG API
- pydantic for strict JSON response schemas

This turns Cloud Run into a controlled, observable “oracle backend” for the hackathon.

Request pipeline (`/ask`)

When the frontend sends audio.bin, the Cloud Run service:

Validates setup
- Ensures Gemini and Speech-to-Text clients are ready.
- Confirms an audio file is present.
Creates a voice signature
- Computes a SHA-256 hash of the raw audio.
- Used only as entropy and returned as quantum_seed_hash.
Transcribes the question
- Uses Google Cloud Speech-to-Text:
  - Tries WEBM_OPUS @ 48kHz first,
  - Falls back to LINEAR16 @ 16kHz when needed.
- Fails cleanly if no speech is detected.
Applies a command pre-filter
- Detects prefixes like "tell me", "give me", "show me", "what is", "am I a good".
- Treats them as non-target or meta content.
- Returns a charismatic rejection instead of going into the manifestation path.
Runs The Classifier (Gemini 2.5 Flash via AI Studio)
- The Classifier prompt was built and iterated in Google AI Studio, then embedded into main.py.
- Uses gemini-2.5-flash with:
  - custom system_instruction,
  - response_mime_type="application/json",
  - response_schema=QueryClassification.
- Outputs:
  - classification: relevant, irrelevant_statement, harmful_query, inappropriate_query
  - relevance_score for borderline inputs.
Safety & routing

harmful_query
→ Static, non-AI crisis/support message.
inappropriate_query
→ Gemini-powered, firm mystical refusal via get_safety_violation_response().
irrelevant_statement or low relevance
→ Gemini-powered, charismatic nudge via get_rejection_response().
relevant
→ Proceeds to the Quantum Answer Engine.

If Gemini’s own safety guardrails trigger, we fall back to the mystical refusal path.
Overall: layered safety with rules + our classifier + Gemini safety.

Quantum Answer Engine – Cached QRNG + Voice XOR

For relevant questions, we generate a constrained but personalized outcome.

1. Quantum randomness with thread-safe caching

get_quantum_randomness():

Maintains a cache of quantum bytes from the ANU QRNG API
Protected by a threading.Lock() for concurrent access
Refills when:
- cache size < CACHE_REFILL_THRESHOLD, or
- older than CACHE_STALE_SECONDS (60s)
Fetches CACHE_FETCH_SIZE (=100) uint8 values in one call and appends them.
Uses FIFO (pop(0)) so older entropy is consumed first.

On API failure:

Uses remaining cache if available,
If empty, falls back to random.randint(0, 255) with a logged warning.

So we:

Respect QRNG limits,
Avoid spamming the API,
Stay stable under load.

2. Entangling with the voice signature

Start with the quantum byte.
XOR with every byte of voice_signature_bytes.
Get final_quantum_number in 0–255.

3. Mapping to outcomes

Compute index = final_quantum_number // 13, clamp to [0, 19].
Select from ALL_ANSWERS:
- 10 Positive
- 5 Non-committal
- 5 Negative

This makes Quantum Ball:

Optimistic by design (manifestation),
But still able to say “unclear” or “no” so it feels real, not rigged.

4. The Creative Generator (Gemini 2.5 Flash via AI Studio)

The Creative Generator prompt was also built and tuned in Google AI Studio.

At runtime we:

Pass:
- the original user_question,
- the derived judgment_type,
- the selected base phrase,
With response_schema=QuantumBallResponse and JSON output.

Gemini returns:

creative_answer – a short, poetic, in-character Quantum Ball line
quantum_seed_hash – we overwrite with the real voice hash

5. Returning the answer

Cloud Run responds with:

{
  "judgment_type": "...",
  "creative_answer": "...",
  "quantum_seed_hash": "..."
}

The Firebase-hosted frontend renders this inside the Quantum Ball with the animated “answer reveal” sequence.

Challenges we ran into

Building Quantum Ball meant juggling orchestration, safety, and vibes all at once. A few things definitely didn’t work on the first try.

1. Making Cloud Run the “oracle brain” (infra & reliability)

Getting the stack to behave end-to-end took more time than the UI.

Service accounts & permissions

Early deploys failed with confusing 500s and PERMISSION_DENIED errors. We had to untangle:

Cloud Run service account permissions
Speech-to-Text access
Cloud Build + service account relationships

Once fixed, the service became truly “click → deploy → answer”.

Rate limiting without breaking the ritual

We wanted a public demo URL, but not an open abuse vector.

Integrated Flask-Limiter with per-IP rules (10/min on /ask).
Tuned limits so live testing, judging, and real users work smoothly without overwhelming Speech-to-Text, Gemini, or the QRNG.

CORS & frontend isolation

Locking /ask to only accept calls from our Firebase origins was essential for a clean architecture, but easy to misconfigure. A few rounds of 403s later, CORS is explicit, minimal, and stable.

2. Audio, formats, and “why does it work in curl but not in Chrome?”

Audio was deceptively hard.

Early tests with curl + WAV (LINEAR16) worked flawlessly.
Real browsers? Mostly WEBM_OPUS, different sample rates, different behavior.

Fix:

Implemented a two-pass STT strategy:
- Try WEBM_OPUS @ 48kHz,
- Fallback to LINEAR16 @ 16kHz.
Added clear error messages for “no speech detected” instead of silent 500s.

That single change turned a fragile demo into something that survives real-world devices.

3. Getting the “brain” right – intent, safety, and tone

Designing the AI flow was one of the most subtle parts.

Classifier edge cases

Early on, the model:

treated “Am I a good person?” as relevant,
missed nuances between personal future vs. philosophy,
misread commands like “Tell me my fortune” as valid questions.

We fixed this by:

Prototyping heavily in Google AI Studio,
Tightening definitions in The Classifier system prompt,
Adding a hard-coded command prefix filter before Gemini.

Safety blind spots

We saw differences between our classifier and Gemini’s built-in safety:

Some harmful queries slipped past our first drafts,
Some got mystical answers where only support messaging is appropriate.

The final design:

routes anything tagged harmful_query to a static crisis-style message,
treats Gemini safety blocks as a trigger for a hard mystical refusal,
never generates playful or poetic responses for self-harm content.

Staying in character without going off the rails

We needed Gemini to:

always speak as Quantum Ball,
stay short, poetic, on-topic,
and follow our JSON schemas.

That meant several rounds of prompt tightening until we consistently got valid JSON + on-brand voice.

4. Quantum randomness without DOS-ing the universe

A naive “call ANU for every question” approach is:

slow,
brittle,
unfriendly to the API.

We built a thread-safe caching layer that:

fetches 100 values at a time,
reuses them with FIFO,
refreshes when low or stale,
only falls back to pseudo-random when absolutely necessary.

This kept the “true randomness” story intact while making the system Cloud Run–friendly.

5. UX that feels magical, not messy

Once the logic worked, we still had to avoid “demo jank”:

Cross-device spacing issues,
Scroll problems with full-screen canvas backgrounds,
Timing of animations vs. network latency.

We iterated until the experience felt like one continuous ritual:
click → speak → quantum grid accelerates → answer appears — no layout jump, no debug noise, no confusion.

Accomplishments that we’re proud of

1. Turning a cliché toy into a real, opinionated system

We started from a nostalgia object—the Magic 8-Ball—and shipped something that is:

Personalized (via voice hash),
Backed by true quantum randomness (QRNG + XOR),
Narratively consistent (one coherent oracle character),
And actually shippable (Firebase + Cloud Run, public URL, rate-limited, CORS-locked).

It’s not just “AI says something mystical.” It’s a full pipeline where each answer is:

your voice → classified → safely gated → entangled with quantum noise → reborn as a constrained, Gemini-crafted line.

2. A safety-first manifestation tool

We’re especially proud that Quantum Ball is playful without being reckless.

A custom 4-way classifier (The Classifier) designed and tested in Google AI Studio.
A strict routing layer that:
- sends harmful queries to static support messaging,
- blocks harassment/violence with a firm, in-character refusal,
- deflects irrelevant inputs with charismatic nudges,
- only lets true “personal future” questions reach the quantum logic.

We rely on our own guardrails + Gemini safety, not just default behavior.
So it can invite optimism without trivializing real harm.

3. Clean integration of AI Studio → production

Instead of “prompt chaos in code”, we:

Prototyped both The Classifier and The Creative Generator in Google AI Studio,
Locked their behavior into:
- system_instruction,
- JSON schemas (QueryClassification, QuantumBallResponse),
Then wired those directly into the Cloud Run Flask app.

That flow—design in AI Studio, enforce in code—gave us predictable, testable behavior suitable for a judged hackathon environment.

4. Quantum randomness done responsibly

We didn’t just sprinkle “quantum” in the README.

Integrated the ANU QRNG API,
Added a thread-safe cache with refill logic and graceful fallback,
Ensured we:
- respect rate constraints,
- remain stable under concurrent requests,
- preserve the story: every valid answer is seeded by real-world entropy when available.

It’s a tiny but honest implementation of “cosmic-feeling randomness” in a production-ish service.

5. Shipping a polished, mobile-first experience

In hackathon timelines, we still managed to:

Build a responsive, mobile-friendly UI,
Add a custom canvas “Quantum Grid” background,
Provide smooth state transitions (recording → accessing → answer),
Keep everything in vanilla JS + Firebase Hosting — fast, accessible, low-friction for judges.

Quantum Ball feels like a finished little product, not just an API demo.

What we learned

1. AI Studio is a serious backend tool, not just a playground

Designing The Classifier and The Creative Generator inside Google AI Studio first—and only then porting them into code—was huge.

We learned that:

Treating prompts like real backend components (with iterations, edge cases, and failure modes) leads to:
- more stable behavior,
- cleaner code,
- less “prompt spaghetti” inside the app.
Structured output (response_schema, JSON-only) is non-negotiable if you want reliability at scale, not vibes.

2. Safety needs layers, not a checkbox

We couldn’t rely on any single system for safety.

We ended up with:

A hard-coded command filter for obvious non-questions,
Our own 4-way classifier tuned in AI Studio,
A dedicated harmful_query → static support path,
A separate path for inappropriate content with a firm rejection,
And Gemini’s own safety as an additional guardrail.

Lesson: real safety architecture = rules + models + explicit fallbacks, all aligned with the product’s tone.

3. Cloud Run + managed services make “production-ish” possible fast

We saw firsthand how far you can go quickly when you lean into managed:

Cloud Run for a clean container boundary and HTTPS endpoint.
Firebase Hosting for instant, global frontend.
Speech-to-Text and Gemini plugged directly into the flow.

But we also learned:

IAM is strict for a reason; you must intentionally grant the right roles.
Rate limiting, CORS, and logging shouldn’t be “later”—they’re part of the design if you want to demo publicly without fear.

4. Audio in the browser is never as simple as “just record”

We re-learned a classic:

Different browsers = different encodings, sample rates, quirks.

A robust experience needs:

multi-format handling on the backend,
clear UX for short/invalid recordings,
special care for mobile.

If your app starts with “click to speak,” you’re signing up for real engineering, not a toy.

5. “Quantum” is easy to market, harder to do honestly

Anyone can say quantum-powered.

We learned that if you’re going to claim it, you should:

actually call a QRNG,
handle rate limits,
cache correctly,
and document what happens on failure.

Users (and judges) feel the difference between a gimmick and a mechanism you’ve really thought through.

6. Biased toward hope ≠ lying to the user

Our answer distribution is intentionally:

10 Positive
5 Non-committal
5 Negative

We learned you can nudge people toward optimism (manifestation vibe) without making the system fake:

occasionally “unclear”,
occasionally “no”,
always short, intentional, and in-character.

That balance—playful, hopeful, but not saccharine—is where Quantum Ball actually feels meaningful instead of disposable.

What’s next for Quantum Ball

1. Voice-out answers (closing the ritual loop)

Use Google Cloud Text-to-Speech so Quantum Ball responds in a consistent oracle voice — keeping answers on-brand and intentional, and making the flow feel like: you speak, the grid reacts, the oracle speaks back.

2. Smarter, multilingual manifestation

Right now the experience is English-first. Next steps:

Detect the user’s language with Gemini.
Run classification + safety logic in that language.
Generate answers in the same language as the question.
Keep one central safety + routing design so behavior is consistent globally.

A manifestation tool should feel native to your voice, not just translated copy.

3. Memory without identity

We want continuity, not surveillance.

Planned directions:

Show users a local/session history of their past Quantum Ball answers.
Explore anonymous, non-identifying storage to:
- avoid repetitive outcomes,
- surface evolving “vibes” over time,
- stay privacy-respecting and compatible with hackathon rules.

4. Deeper Cloud Run architecture

Today: a single, lean Cloud Run service (perfect for the hackathon).

Future evolution:

Break into focused services:
- stt-gateway (Speech-to-Text handling),
- oracle-core (classification + quantum + generation),
- safety-audit worker (logging & analysis).
Add Pub/Sub, Firestore, and Cloud Logging for:
- better observability,
- replayable safety tests,
- smoother scaling under real traffic.

5. A reusable blueprint for safe “playful oracles”

Open up Quantum Ball as a reference architecture for others who want:

AI Studio–designed prompts wired cleanly into Cloud Run,
layered safety (rules + models + fallbacks),
voice input + QRNG or other strong entropy sources,
Firebase-hosted, mobile-friendly frontends.

So teams can fork it into their own positive, intentional experiences — with responsibility and production-ready patterns baked in from day one.

Built With

anu-quantum-random-number-api
css3
docker
fetch-api
firebase-hosting
flask
gemini-2.5-flash
google-ai-studio
google-cloud-run
google-cloud-speech-to-text-api
google-gemini-api
gunicorn
html5
javascript-(es6+)
mediarecorder-api
python
rest-api