Inspiration
Traditional input methods create barriers for people with motor disabilities, healthcare workers in sterile environments, and anyone seeking more natural human-computer interaction. We wanted to make computing accessible to everyone, regardless of physical limitations.
What it does
OpsGhost is an AI-powered gesture control system that lets you operate your computer using hand gestures and voice commands:
- Gesture Control: Move the cursor by pointing, click with a pinch gesture, and type on a mid-air virtual keyboard
- AI Vision Assistant: Google Gemini 2.0 sees your screen in real time and executes your voice commands
- Real Desktop Automation: a C++ Windows API controller drives the actual desktop for near-instant response; nothing is simulated
How we built it
Tech Stack:
- Computer Vision: MediaPipe Hand Landmarker (GPU-accelerated, 60 FPS; setup sketched after this list)
- AI: Google Gemini 2.0 Live API (Multimodal)
- Desktop Control: C++ Windows SendInput API
- UI: Electron (transparent overlay) + React + TypeScript + Vite
- Backend: Node.js + Express bridge server
- IPC: Custom protocol between Electron and C++ controller
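A minimal sketch of the hand-tracking setup, assuming the WASM build of MediaPipe's HandLandmarker from @mediapipe/tasks-vision running in the Electron renderer (the model path and CDN URL are illustrative):

```ts
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

// Load the WASM runtime and create a GPU-delegated landmarker for two hands.
const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
);
const landmarker = await HandLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: "hand_landmarker.task", delegate: "GPU" },
  runningMode: "VIDEO",
  numHands: 2, // both hands, for two-hand typing
});

// Run detection once per camera frame.
function onFrame(video: HTMLVideoElement) {
  const result = landmarker.detectForVideo(video, performance.now());
  for (const hand of result.landmarks) {
    // hand[i] is a normalized {x, y, z} landmark; indices follow the
    // MediaPipe hand model (0 = wrist, 4 = thumb tip, 8 = index tip).
  }
  requestAnimationFrame(() => onFrame(video));
}
```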
Architecture: a multi-layer system in which MediaPipe hand tracking runs at 60 FPS, the Electron overlay renders UI feedback, the Node.js bridge server relays commands, and the C++ executable performs real-time desktop control (1-5 ms response time).
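The exact wire format between the bridge and the C++ controller is internal to the project; purely as an illustration, here is a minimal sketch assuming a line-delimited text protocol over the controller's stdin (the `controller.exe` name, the `move`/`click` verbs, the payload shape, and the port are all hypothetical):

```ts
import { spawn } from "node:child_process";
import express from "express";

// Keep one long-lived controller process; spawning per command would
// reintroduce the latency we eliminated by dropping PowerShell.
const controller = spawn("controller.exe", [], {
  stdio: ["pipe", "inherit", "inherit"],
});

function send(cmd: string) {
  controller.stdin?.write(cmd + "\n"); // e.g. "move 960 540" or "click left"
}

const app = express();
app.use(express.json());

// The Electron overlay POSTs gesture events here; we forward them verbatim.
app.post("/command", (req, res) => {
  const { action, x, y } = req.body; // hypothetical payload shape
  send(action === "move" ? `move ${x} ${y}` : action);
  res.sendStatus(204);
});

app.listen(4820); // port is illustrative
```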
Challenges we ran into
Cursor Drift During Gestures: our initial approach tracked the fingertip, so the cursor drifted whenever the user pinched to click. Solution: track the index-finger knuckle for cursor position and use the fingertip only for click detection (sketched below).
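In code, the fix looks roughly like this: the cursor anchors to the index-finger MCP joint (landmark 5 in MediaPipe's hand model), while the pinch is detected from the thumb-tip/index-tip distance (landmarks 4 and 8). The threshold below is illustrative, not our tuned value:

```ts
type Landmark = { x: number; y: number; z: number };

// MediaPipe hand-model indices (fixed by the model, not by us).
const THUMB_TIP = 4;
const INDEX_MCP = 5; // the knuckle at the base of the index finger
const INDEX_TIP = 8;

// Cursor position comes from the knuckle: it stays nearly still while
// the fingertip curls toward the thumb, so clicking no longer drags it.
function cursorAnchor(hand: Landmark[]): Landmark {
  return hand[INDEX_MCP];
}

// Pinch = thumb tip close to index tip, in normalized image units.
function isPinching(hand: Landmark[], threshold = 0.05): boolean {
  const a = hand[THUMB_TIP];
  const b = hand[INDEX_TIP];
  return Math.hypot(a.x - b.x, a.y - b.y) < threshold;
}
```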
Performance Bottleneck: driving input through PowerShell was too slow (100-500 ms per command). Solution: a custom C++ executable calling the Windows SendInput API brought response time down to 1-5 ms.
Multiple Click Detection: gesture flicker between frames caused repeated clicks. Solution: a state machine with debouncing (250 ms for keyboard, 300 ms for mouse), sketched below.
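A sketch of that debouncing, using the 250 ms/300 ms windows mentioned above: fire only on the rising edge of a gesture, and only if the debounce window since the last fire has elapsed (the class and method names are ours for illustration):

```ts
class DebouncedTrigger {
  private engaged = false;
  private lastFired = 0;

  constructor(private readonly debounceMs: number) {}

  // Returns true exactly once per real gesture: on the rising edge,
  // and only after the debounce window since the last fire has passed.
  update(active: boolean, now = performance.now()): boolean {
    const rising = active && !this.engaged;
    this.engaged = active;
    if (rising && now - this.lastFired >= this.debounceMs) {
      this.lastFired = now;
      return true;
    }
    return false;
  }
}

const mouseClick = new DebouncedTrigger(300); // 300 ms for mouse
const keyPress = new DebouncedTrigger(250);   // 250 ms for keyboard

// Per frame: if (mouseClick.update(isPinching(hand))) send("click left");
```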
UI Click Issues: the transparent overlay was fully click-through, so its own controls were unreachable. Solution: toggle mouse-event handling dynamically based on hover zones.
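Electron's BrowserWindow.setIgnoreMouseEvents with the forward option is the key API here; this sketch assumes the renderer reports hover-zone enter/leave over an IPC channel whose name we chose for illustration:

```ts
import { app, BrowserWindow, ipcMain } from "electron";
import path from "node:path";

app.whenReady().then(() => {
  const win = new BrowserWindow({
    transparent: true,
    frame: false,
    alwaysOnTop: true,
    webPreferences: { preload: path.join(__dirname, "preload.js") },
  });

  // Default: click-through, but keep forwarding mouse-move events so the
  // renderer can still detect when the pointer enters an interactive zone.
  win.setIgnoreMouseEvents(true, { forward: true });

  // The renderer flips this when the pointer hovers an overlay control.
  ipcMain.on("set-clickable", (_event, clickable: boolean) => {
    win.setIgnoreMouseEvents(!clickable, { forward: true });
  });

  win.loadFile("overlay.html");
});
```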
Accomplishments that we're proud of
- Zero cursor drift during gesture clicks (novel knuckle-tracking approach)
- 10-50x performance improvement over scripting alternatives
- Two-hand simultaneous typing, as on a physical keyboard
- Multimodal AI integration combining vision + voice + automation
- 95%+ gesture accuracy under good lighting
What we learned
- Computer vision requires careful coordinate-system management (see the mapping sketch after this list)
- Performance optimization is critical for natural interaction
- State machines prevent race conditions in real-time systems
- User experience details (cursor stability) make or break adoption
- Multimodal AI opens entirely new interaction paradigms
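As a concrete instance of the coordinate lesson, a sketch of camera-to-screen mapping: MediaPipe landmarks arrive normalized to [0, 1] in a mirrored (selfie-view) camera frame, and raw positions jitter, so we mirror, scale, and smooth before moving the cursor; the smoothing factor here is illustrative:

```ts
// Map a normalized camera-space point to screen pixels.
// x is flipped because the user sees a mirrored selfie view.
function toScreen(p: { x: number; y: number }, screenW: number, screenH: number) {
  return { x: (1 - p.x) * screenW, y: p.y * screenH };
}

// Exponential smoothing: trades a little latency for a stable cursor.
function makeSmoother(alpha = 0.35) {
  let prev: { x: number; y: number } | null = null;
  return (p: { x: number; y: number }) => {
    prev = prev
      ? { x: prev.x + alpha * (p.x - prev.x), y: prev.y + alpha * (p.y - prev.y) }
      : p;
    return prev;
  };
}
```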
What's next for OpsGhost
- Mobile Support: Port to Android/iOS with camera control
- More Gestures: Swipe, rotate, pinch-zoom functionality
- Gesture Macros: Record and replay gesture sequences
- Eye Tracking: Combine with eye gaze for faster control
- Cloud AI: Offload processing for lower-end devices
- Accessibility Profiles: Customizable for different disabilities