Inspiration

Our inspiration came from studying airline catering operations, specifically the Pick & Pack process at gategroup facilities. We discovered that workers manually verify hundreds of items across trolleys, drawers, and boxes for each flight—a process that's error-prone, time-consuming, and physically demanding. With over 50% of items often returning unused from flights and expiration dates checked manually, we saw an opportunity to transform this critical workflow using cutting-edge AR and AI technology.

We wanted to empower warehouse workers with tools that make their jobs faster, more accurate, and more engaging while reducing waste and improving sustainability in airline catering operations.

What it does

Opsight is an AR-powered trolley scanning and validation system for airline catering that addresses multiple dimensions of the Smart Execution pillar:

Real-Time Error Detection: Using ARKit and computer vision, Opsight scans trolley carts in 3D space, overlays a virtual model of the expected configuration, and instantly detects missing, misplaced, or incorrect items before the cart leaves the warehouse.
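To make the validation step concrete, here's a minimal sketch of the expected-vs-scanned diff. The types and names are hypothetical, not the app's actual data model:

```swift
import Foundation

// Hypothetical types for illustration; the app's real data model differs.
struct PackedItem {
    let sku: String
    let slot: Int   // compartment index where the scan detected the item
}

enum ValidationIssue {
    case missing(sku: String, expectedSlot: Int)
    case misplaced(sku: String, expectedSlot: Int, foundSlot: Int)
    case unexpected(sku: String, foundSlot: Int)
}

/// Compare the expected trolley configuration against what the scan found.
func validate(expected: [String: Int], scanned: [PackedItem]) -> [ValidationIssue] {
    var issues: [ValidationIssue] = []
    let found = Dictionary(scanned.map { ($0.sku, $0.slot) },
                           uniquingKeysWith: { first, _ in first })
    for (sku, slot) in expected {
        switch found[sku] {
        case nil:
            issues.append(.missing(sku: sku, expectedSlot: slot))
        case let actual? where actual != slot:
            issues.append(.misplaced(sku: sku, expectedSlot: slot, foundSlot: actual))
        default:
            break   // present and correctly placed
        }
    }
    for item in scanned where expected[item.sku] == nil {
        issues.append(.unexpected(sku: item.sku, foundSlot: item.slot))
    }
    return issues
}
```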

Expiration Date Management: The app uses on-device machine learning (MLX) to perform OCR on product labels, automatically extracting and validating expiration dates against flight departure times, eliminating hours of manual checking.
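A sketch of the date-validation step after OCR; the accepted formats and zero safety margin are assumptions, since real labels vary widely:

```swift
import Foundation

/// Check an OCR'd label date against the flight's departure time.
/// The formats and margin here are assumptions, not the app's tuned rules.
func expiryIsValid(labelText: String, departure: Date) -> Bool {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    for format in ["yyyy-MM-dd", "dd/MM/yyyy", "MM/dd/yyyy"] {
        formatter.dateFormat = format
        if let expiry = formatter.date(from: labelText) {
            return expiry >= departure   // item must still be valid at departure
        }
    }
    return false   // unparseable date: flag the item for manual review
}
```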

Employee Efficiency: Through AR guidance, workers receive real-time visual feedback with haptic and audio cues, making the packing process faster and more intuitive. The app tracks loading sessions and provides performance metrics to help workers improve over time.
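The cues themselves are plain UIKit calls; a minimal sketch (the system sound ID is an assumption):

```swift
import UIKit
import AudioToolbox

/// Illustrative feedback helper for scan results.
enum ScanFeedback {
    static func itemCorrect() {
        UINotificationFeedbackGenerator().notificationOccurred(.success)
    }
    static func itemError() {
        UINotificationFeedbackGenerator().notificationOccurred(.error)
        AudioServicesPlaySystemSound(1053)   // stock system tone; the ID is an assumption
    }
}
```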

The workflow is simple: workers select a flight, scan the trolley cart using an iPhone or iPad, and the app guides them through validation with visual overlays showing exactly where each item should be placed, highlighting errors in real time.

How we built it

We built Opsight using a modern iOS tech stack optimized for performance and accessibility:

SwiftUI: For a native, responsive user interface that adapts to different devices and accessibility needs.

ARKit + RealityKit: For plane detection, 3D trolley visualization, and spatial tracking. We detect horizontal surfaces, allow users to place virtual trolley models, and track items in 3D space.
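A minimal sketch of that setup; the function name and placeholder geometry are ours, not the app's:

```swift
import ARKit
import RealityKit

/// Run world tracking with horizontal plane detection, then anchor a
/// placeholder trolley model to the first detected horizontal plane.
func startScanning(in arView: ARView) {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal]   // trolleys sit on the warehouse floor
    arView.session.run(config)

    let anchor = AnchorEntity(plane: .horizontal)
    // Stand-in geometry; the real app loads the expected trolley configuration model.
    anchor.addChild(ModelEntity(mesh: .generateBox(size: 0.3)))
    arView.scene.addAnchor(anchor)
}
```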

Vision Framework: For rectangle detection and initial item recognition from camera frames.

MLX (Apple's ML framework): For on-device text recognition (OCR) to extract expiration dates and batch numbers from product labels without sending data to the cloud.
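A sketch of the rectangle-detection step; the thresholds are illustrative, not our tuned values:

```swift
import Vision
import CoreVideo

/// Detect rectangular regions (drawer faces, labels) in a single camera frame.
func detectRectangles(in pixelBuffer: CVPixelBuffer,
                      completion: @escaping ([VNRectangleObservation]) -> Void) {
    let request = VNDetectRectanglesRequest { request, _ in
        completion(request.results as? [VNRectangleObservation] ?? [])
    }
    request.minimumConfidence = 0.8   // illustrative threshold
    request.maximumObservations = 10

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```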

Miscellaneous: A React (Vite) + JavaScript web application backed by Node.js for operational monitoring, providing a UI for metrics, expiry alerts, and inventory movements, plus an integrated Gemini LLM agent that answers business-related questions.

Challenges we ran into

  1. Real-time AR Performance: Processing camera frames at 60fps while running ML models was computationally intensive. We implemented frame skipping (processing every 3rd frame) and background queue processing to maintain smooth AR performance.

  2. 3D Spatial Mapping: Accurately mapping 2D bounding boxes from camera frames to 3D positions on the trolley required complex coordinate transformations between Vision's normalized coordinates and ARKit's world space.

  3. MLX Integration: Integrating Apple's MLX framework for on-device OCR required careful model optimization and batching strategies to extract text from various label formats, angles, and lighting conditions without sacrificing accuracy.

  4. Compartment Detection Logic: Determining which trolley compartment an item belongs to based on its position required implementing spatial heuristics that account for camera angle, distance, and the trolley's orientation (sketched after this list).

  5. State Management Complexity: Managing the AR session lifecycle, plane detection, item scanning, and validation simultaneously required a robust state machine with proper error handling and recovery mechanisms (sketched after this list).
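For challenge 4, a sketch of the compartment heuristic: express the item's world position in the trolley's local frame, then bucket it into a grid. The grid shape and dimensions are assumptions, not gategroup's real layout:

```swift
import simd

/// Illustrative compartment lookup; the 2-column, 6-row trolley and its
/// dimensions are placeholders.
struct TrolleyGrid {
    let rows = 6, columns = 2
    let width: Float = 0.8    // metres
    let height: Float = 1.0

    func compartment(for worldPosition: SIMD3<Float>,
                     trolleyTransform: simd_float4x4) -> (row: Int, column: Int) {
        // Move the point into the trolley's local coordinate space.
        let local = trolleyTransform.inverse * SIMD4<Float>(worldPosition, 1)
        // Normalize across the trolley face, then bucket into the grid.
        let col = Int((local.x / width + 0.5) * Float(columns))
        let row = Int((local.y / height) * Float(rows))
        return (row: min(max(row, 0), rows - 1),
                column: min(max(col, 0), columns - 1))
    }
}
```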
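For challenge 5, a sketch of the state machine's shape; the states and allowed transitions are illustrative, not the app's exact set:

```swift
/// Illustrative scan-session states; the real app's set differs.
enum ScanState {
    case idle, detectingPlanes, placingTrolley, scanningItems, validating, complete
    case failed(reason: String)
}

final class ScanSessionController {
    private(set) var state: ScanState = .idle

    func transition(to newState: ScanState) {
        switch (state, newState) {
        case (.idle, .detectingPlanes),
             (.detectingPlanes, .placingTrolley),
             (.placingTrolley, .scanningItems),
             (.scanningItems, .validating),
             (.validating, .complete):
            state = newState
        case (_, .failed):
            state = newState   // any state may fail; kick off recovery / UI reset here
        default:
            assertionFailure("Invalid transition: \(state) -> \(newState)")
        }
    }
}
```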

Accomplishments that we're proud of

✅ Built a fully functional AR scanning system that can detect surfaces, place virtual trolley models, and track items in real-time

✅ Achieved real-time item validation with visual feedback through color-coded overlays (green for correct, red for errors, yellow for warnings)

✅ Implemented on-device ML using MLX for privacy-preserving expiration date extraction—no cloud processing required

What we learned

AR coordinate systems are complex: Mapping between 2D screen space, Vision's normalized coordinates, and ARKit's 3D world space requires careful transformation matrices and understanding of camera intrinsics.
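One concrete slice of that pipeline, simplified for illustration; a production version also applies the frame's display transform and camera intrinsics:

```swift
import ARKit
import RealityKit
import Vision

/// Vision boxes are normalized with a bottom-left origin; UIKit is top-left.
/// Convert a detection's center to view coordinates, then raycast into world space.
func worldPosition(of observation: VNRectangleObservation,
                   in arView: ARView) -> SIMD3<Float>? {
    let box = observation.boundingBox
    let center = CGPoint(x: box.midX * arView.bounds.width,
                         y: (1 - box.midY) * arView.bounds.height)   // flip Y for UIKit
    guard let hit = arView.raycast(from: center,
                                   allowing: .estimatedPlane,
                                   alignment: .any).first else { return nil }
    let t = hit.worldTransform.columns.3
    return SIMD3<Float>(t.x, t.y, t.z)
}
```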

On-device ML is powerful: MLX enables sophisticated computer vision tasks (OCR, object detection) directly on iPhone, without the latency or privacy concerns of cloud APIs.
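Our OCR runs on an MLX model, which is project-specific; as a stand-in, the same on-device extraction step written against Vision's built-in text recognizer looks like this:

```swift
import Vision
import CoreVideo

/// Stand-in for the MLX OCR step, using Vision's on-device text recognition.
func recognizeLabelText(in pixelBuffer: CVPixelBuffer) -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = false   // dates and batch codes aren't dictionary words

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
    try? handler.perform([request])

    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}
```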

Performance optimization matters: Real-time AR requires aggressive optimization (frame skipping, background processing, efficient rendering) to maintain 60fps.
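The frame-skipping pattern, sketched; the class and queue names are ours:

```swift
import ARKit

/// Process every 3rd ARKit frame off the main thread, and never enqueue a new
/// frame while one is still being analyzed.
final class FrameProcessor: NSObject, ARSessionDelegate {
    private let visionQueue = DispatchQueue(label: "opsight.vision", qos: .userInitiated)
    private var frameCount = 0
    private var isProcessing = false

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        frameCount += 1
        guard frameCount % 3 == 0, !isProcessing else { return }
        isProcessing = true
        let pixelBuffer = frame.capturedImage   // keep the buffer, not the ARFrame
        visionQueue.async { [weak self] in
            // ... run rectangle detection / OCR on pixelBuffer here ...
            _ = pixelBuffer
            DispatchQueue.main.async { self?.isProcessing = false }
        }
    }
}
```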

What's next for Opsight

🚀 Advanced ML Models: Train custom CoreML models specifically for airline catering products to improve detection accuracy beyond generic object recognition.

🍾 Alcohol Bottle Handling: Implement fill-level detection using depth sensors to automatically determine if bottles should be reused, refilled, or discarded based on airline-specific rules.

✈️ Consumption Prediction: Integrate historical flight data to predict item usage patterns and recommend optimal packing quantities, reducing waste and fuel costs.

🌐 Multi-trolley Sessions: Support scanning multiple trolleys in sequence with session persistence and batch validation.

Built With

arkit, gemini, javascript, mlx, node.js, react, realitykit, swift, swiftui, vision, vite
