Inspiration

We all love video games, but we often feel guilty about sitting still for hours. We asked ourselves: What if the game is the workout? Instead of building a boring "fitness app," we wanted to transform the games we already love (platformers, endless runners, and adventure games) into physically engaging experiences. XRcise was born from the desire to make movement fun, accessible, and seamlessly integrated into our digital lives.

What it does

XRcise is a universal motion controller that uses your webcam to translate real-world body movements into keyboard inputs:

Jump in real life -> your character jumps (Up Arrow / Space).
Squat down -> your character ducks or slides (Down Arrow).
Lean left/right -> your character steers or strafes.
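
At its core, that mapping is a small lookup table. A minimal sketch (the gesture names and key strings here are illustrative, not our exact constants):

```python
# Illustrative gesture-to-key mapping mirroring the description above.
GESTURE_KEYMAP = {
    "jump": "space",        # or "up" for the Up Arrow
    "squat": "down",        # duck / slide
    "lean_left": "left",    # steer / strafe left
    "lean_right": "right",  # steer / strafe right
}
```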

How we built it

We built a high-performance backend using Python and OpenCV, leveraging Google's MediaPipe for state-of-the-art pose estimation.
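
For context, a minimal version of that capture-and-estimate loop looks like this (a sketch assuming the classic mp.solutions.pose API and the default webcam at index 0):

```python
import cv2
import mediapipe as mp

pose = mp.solutions.pose.Pose(model_complexity=0)  # lightest model, CPU-friendly
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV captures BGR.
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:
        landmarks = results.pose_landmarks.landmark  # 33 normalized (x, y, z) points
        # ...feed the landmarks into the gesture detectors...
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

cap.release()
```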

Backend: The core engine analyzes 33 skeletal landmarks in real time. We implemented vector math over those landmarks to detect gestures (jumping, squatting) robustly; a sketch follows below.

Frontend: A modern React dashboard connects via WebSockets to visualize the tracking, manage calibration, and configure sensitivity settings.

Input Simulation: To keep response times low, we wrote custom low-level input drivers using pynput (macOS) and ctypes/SendInput (Windows) to mimic hardware signals directly.
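
To make the landmark math concrete, here is a hedged sketch of squat and lean detection. The landmark indices are MediaPipe's fixed ones; the hip/knee heuristic and the thresholds are illustrative, not our tuned logic:

```python
# MediaPipe pose landmark indices (fixed by the library).
LEFT_HIP, RIGHT_HIP = 23, 24
LEFT_KNEE, RIGHT_KNEE = 25, 26

def is_squatting(landmarks, threshold=0.12):
    """Squat when the hips drop close to knee height (normalized y grows downward)."""
    hip_y = (landmarks[LEFT_HIP].y + landmarks[RIGHT_HIP].y) / 2
    knee_y = (landmarks[LEFT_KNEE].y + landmarks[RIGHT_KNEE].y) / 2
    return (knee_y - hip_y) < threshold

def lean_direction(landmarks, center_x=0.5, dead_zone=0.08):
    """Lean when the hip midpoint leaves a dead zone around the calibrated center."""
    hip_x = (landmarks[LEFT_HIP].x + landmarks[RIGHT_HIP].x) / 2
    offset = hip_x - center_x
    if abs(offset) < dead_zone:
        return None
    return "right" if offset > 0 else "left"
```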

Challenges we ran into

Latency Battle: Early versions felt "laggy." We discovered that standard camera reading blocks the main thread. We engineered a Threaded Camera Class that runs in the background, aggressively grabbing the latest frame so the game engine is never waiting on the webcam.
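
The idea in a minimal sketch: a daemon thread keeps overwriting the stored frame with the newest capture, so readers grab the latest frame instantly instead of blocking on cap.read():

```python
import threading
import cv2

class ThreadedCamera:
    def __init__(self, src=0):
        self.cap = cv2.VideoCapture(src)
        self.lock = threading.Lock()
        self.frame = None
        self.running = True
        threading.Thread(target=self._update, daemon=True).start()

    def _update(self):
        while self.running:
            ok, frame = self.cap.read()  # blocks here, in the background thread
            if ok:
                with self.lock:
                    self.frame = frame  # always keep only the newest frame

    def read(self):
        with self.lock:
            return None if self.frame is None else self.frame.copy()

    def stop(self):
        self.running = False
        self.cap.release()
```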

Cross-Platform Nightmares: Windows games often ignore synthetic key presses from standard Python libraries, because DirectInput-based games read hardware scan codes rather than virtual-key messages (and anti-cheat systems filter the rest). We had to dive deep into the Windows API documentation to implement a DirectInput Injection system that "tricks" games into seeing real hardware events.
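
The key to that workaround is calling SendInput with the KEYEVENTF_SCANCODE flag, so the injected event carries a hardware scan code instead of a virtual-key code. A minimal sketch (struct layout per the Win32 API; the helper name is ours):

```python
import ctypes
from ctypes import wintypes

user32 = ctypes.WinDLL("user32", use_last_error=True)

INPUT_KEYBOARD = 1
KEYEVENTF_SCANCODE = 0x0008
KEYEVENTF_KEYUP = 0x0002

ULONG_PTR = ctypes.c_size_t  # pointer-sized unsigned integer

class KEYBDINPUT(ctypes.Structure):
    _fields_ = (("wVk", wintypes.WORD), ("wScan", wintypes.WORD),
                ("dwFlags", wintypes.DWORD), ("time", wintypes.DWORD),
                ("dwExtraInfo", ULONG_PTR))

class MOUSEINPUT(ctypes.Structure):  # included only so INPUT has the correct size
    _fields_ = (("dx", wintypes.LONG), ("dy", wintypes.LONG),
                ("mouseData", wintypes.DWORD), ("dwFlags", wintypes.DWORD),
                ("time", wintypes.DWORD), ("dwExtraInfo", ULONG_PTR))

class _INPUTUNION(ctypes.Union):
    _fields_ = (("mi", MOUSEINPUT), ("ki", KEYBDINPUT))

class INPUT(ctypes.Structure):
    _fields_ = (("type", wintypes.DWORD), ("union", _INPUTUNION))

def send_scancode(scan, key_up=False):
    """Inject a raw scan code so DirectInput games see a 'real' hardware event."""
    flags = KEYEVENTF_SCANCODE | (KEYEVENTF_KEYUP if key_up else 0)
    ki = KEYBDINPUT(wVk=0, wScan=scan, dwFlags=flags, time=0, dwExtraInfo=0)
    inp = INPUT(type=INPUT_KEYBOARD, union=_INPUTUNION(ki=ki))
    user32.SendInput(1, ctypes.byref(inp), ctypes.sizeof(INPUT))

SPACE_SCANCODE = 0x39  # "jump" on a standard keyboard layout
send_scancode(SPACE_SCANCODE)               # key down
send_scancode(SPACE_SCANCODE, key_up=True)  # key up
```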

The "Drift" Problem: Players naturally shift around the room while playing. Initially, users would drift out of the "center" zone. We solved this with an Aggressive Ratchet Algorithm that dynamically updates the center point based on player intent, making the controls feel "telepathic" rather than rigid.

Accomplishments that we're proud of

It actually feels good to play: Most webcam games feel clumsy. By implementing "Instant Reset," we achieved a control scheme that feels tight enough for precision platformers.

Running on the Edge: The entire pipeline runs locally on CPU with a minimal footprint, respecting user privacy (no video leaves the device).

Visual Feedback: The real-time skeleton overlay with dynamic "safe zones" makes debugging and playing incredibly intuitive.

What we learned

User Intent != Raw Data: Raw motion data is noisy. We learned to separate "jitter" from "intent" (like holding a squat versus just fidgeting); a sketch of one such filter is below.
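
A common shape for that kind of filter (a sketch, not our exact implementation): only fire a gesture once the raw signal has persisted for a full window of frames:

```python
from collections import deque

class DebouncedGesture:
    def __init__(self, hold_frames=5):
        self.history = deque(maxlen=hold_frames)

    def update(self, active):
        """Return True only once the raw signal has been steady for the window."""
        self.history.append(bool(active))
        return len(self.history) == self.history.maxlen and all(self.history)
```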

Threading is Hard: Managing a GUI (OpenCV), a WebSocket server, and a camera feed on separate threads taught us hard lessons about race conditions and event loops (especially the differences in how macOS and Windows handle them!).

What's next for XRcise

Gesture Expansion: Adding punch/kick detection for fighting games.

Profile System: Cloud-saved configurations optimized for specific games (e.g., "Minecraft Profile", "Mario Profile").

Voice Commands: "Pause Game" or "Recalibrate" via voice.

VR Integration: Bringing full-body tracking to simple VR setups without expensive hardware.

Built With

Python, OpenCV, MediaPipe, React, WebSockets, pynput, ctypes (Win32 SendInput)