Inspiration
Public speaking is often cited as one of the most common fears, yet it's a critical skill for success in almost every field. While there are many tools for "writing" better presentations, there are very few for "delivering" them. We wanted to build an AI-powered coach that provides the kind of objective, granular feedback that usually requires a human coach.
What it does
Peak Performance (Presentation Grader) is an iPad-optimized application that serves as a personal communication lab. It records your presentation and uses a sophisticated multi-modal AI pipeline to grade your performance.
- Voice Analysis: Measures clarity, pace (WPM), volume consistency, and detects filler words ("um," "uh," "like").
- Video Analysis: Uses computer vision to track eye contact, posture, and hand gestures.
- Hybrid Scoring: Combines these metrics into an overall "A-F" grade with actionable, personalized feedback for improvement.
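To make the hybrid scoring concrete, here is a minimal sketch of how voice and video metrics might be blended into a letter grade. The metric names, weights, WPM band, and grade boundaries are illustrative assumptions, not the app's actual values.

```swift
import Foundation

// Illustrative inputs from the two pipelines (assumed shape, not the app's real model).
struct DeliveryMetrics {
    let wordsPerMinute: Double    // voice: pace
    let fillerWordCount: Int      // voice: "um", "uh", "like"
    let eyeContactRatio: Double   // video: 0...1
    let postureStability: Double  // video: 0...1
}

// Blend voice and video signals into a 0...1 score, then bucket into A-F.
func letterGrade(for m: DeliveryMetrics, durationMinutes: Double) -> String {
    // Full pace credit inside an assumed conversational 110-160 WPM band.
    let paceScore = (110.0...160.0).contains(m.wordsPerMinute) ? 1.0 : 0.6
    // Penalize filler words per minute of speech (assumed decay rate).
    let fillersPerMinute = Double(m.fillerWordCount) / max(durationMinutes, 0.1)
    let fillerScore = max(0.0, 1.0 - fillersPerMinute / 6.0)
    // Illustrative weights; the app's real blend differs.
    let total = 0.3 * paceScore + 0.2 * fillerScore +
                0.3 * m.eyeContactRatio + 0.2 * m.postureStability
    switch total {
    case 0.9...:    return "A"
    case 0.8..<0.9: return "B"
    case 0.7..<0.8: return "C"
    case 0.6..<0.7: return "D"
    default:        return "F"
    }
}
```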
How we built it
We built the app using Cursor, Swift, and Xcode, leveraging a "Hybrid AI" architecture to balance speed and accuracy:
- The Voice Pipeline: We combined Apple's SFSpeechRecognizer for real-time feedback, WhisperKit for high-accuracy on-device grading, and the AssemblyAI API for deep cloud-based transcription (the real-time leg is sketched after this list).
- The Video Pipeline: We utilized Apple's Vision framework and AVFoundation to perform real-time face detection, pose estimation, and gesture recognition (see the pose sketch below).
- The Orchestrator: A robust PresentationViewModel managed the complex asynchronous flow between local hardware and cloud APIs using Swift's async/await and actors (a minimal sketch follows the pipeline examples below).
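For the real-time leg of the voice pipeline, a minimal SFSpeechRecognizer setup looks roughly like this. Error handling, the WhisperKit pass, and the AssemblyAI upload are omitted, and `handlePartial` is a hypothetical callback:

```swift
import Speech
import AVFoundation

// Streams microphone audio into a live recognition request and surfaces
// partial transcripts. Assumes speech/mic permission was already granted.
final class LiveTranscriber {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func start(handlePartial: @escaping (String) -> Void) throws {
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true   // live feedback while speaking
        self.request = request

        // Tap the mic and feed buffers into the recognition request.
        let input = audioEngine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer?.recognitionTask(with: request) { result, _ in
            if let text = result?.bestTranscription.formattedString {
                handlePartial(text)   // e.g. update a live WPM / filler counter
            }
        }
    }

    func stop() {
        audioEngine.inputNode.removeTap(onBus: 0)
        audioEngine.stop()
        request?.endAudio()
        task?.cancel()
    }
}
```

On the video side, the Vision framework's body-pose request is the core building block. This sketch runs `VNDetectHumanBodyPoseRequest` on a single camera frame and reads the shoulder joints; how those feed the app's posture and gesture metrics is simplified here:

```swift
import Vision

// Run body-pose detection on one camera frame.
func detectPose(in pixelBuffer: CVPixelBuffer) throws -> VNHumanBodyPoseObservation? {
    let request = VNDetectHumanBodyPoseRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
    try handler.perform([request])
    return request.results?.first
}

// Read the shoulder joints that a posture metric could consume (assumed usage).
func shoulderPoints(from observation: VNHumanBodyPoseObservation) throws -> (CGPoint, CGPoint)? {
    let left = try observation.recognizedPoint(.leftShoulder)
    let right = try observation.recognizedPoint(.rightShoulder)
    // Ignore low-confidence detections; 0.3 is an illustrative cutoff.
    guard left.confidence > 0.3, right.confidence > 0.3 else { return nil }
    return (left.location, right.location)   // normalized image coordinates
}
```

And a minimal sketch of the orchestration pattern: a main-actor view model fanning out to both pipelines concurrently with `async let`. The analyzer types here are hypothetical stubs, not the app's real pipeline code:

```swift
import Foundation
import Combine

// Hypothetical stand-ins for the voice and video pipelines.
struct VoiceAnalyzer { func analyze(_ url: URL) async -> Double { 0.8 } }
struct VideoAnalyzer { func analyze(_ url: URL) async -> Double { 0.7 } }

@MainActor
final class PresentationViewModel: ObservableObject {
    @Published var overallScore: Double?

    func analyze(recording: URL) async {
        // `async let` runs both pipelines in parallel; @MainActor isolation
        // guarantees the published property is only mutated on the main thread.
        async let voice = VoiceAnalyzer().analyze(recording)
        async let video = VideoAnalyzer().analyze(recording)
        let (v, b) = await (voice, video)
        overallScore = (v + b) / 2
    }
}
```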
Challenges we ran into
- Concurrency & Performance: Running heavy ML models (WhisperKit) alongside real-time video processing (Vision) on an iPad was resource-intensive. We had to implement strict memory management and actor isolation to prevent UI lag.
- Model Cold Starts: WhisperKit models are large. Designing an intuitive "AnalysisView" that keeps the user engaged while models download and process was a UX challenge.
- Permission Spiral: Handling simultaneous access to the camera, microphone, and speech recognition APIs required a robust PermissionsManager to ensure a smooth onboarding flow (sketched after this list).
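A minimal sketch of the sequential permission flow, assuming the iOS 15+ async APIs; the real PermissionsManager also has to handle denial states and deep links to Settings, which are omitted here:

```swift
import AVFoundation
import Speech

// Requests camera, microphone, and speech-recognition access in sequence.
// Assumes the matching usage-description keys are present in Info.plist.
enum PermissionsManager {
    static func requestAll() async -> Bool {
        let camera = await AVCaptureDevice.requestAccess(for: .video)
        let mic = await AVCaptureDevice.requestAccess(for: .audio)
        // SFSpeechRecognizer only offers a callback API, so bridge it.
        let speech = await withCheckedContinuation { continuation in
            SFSpeechRecognizer.requestAuthorization { status in
                continuation.resume(returning: status == .authorized)
            }
        }
        return camera && mic && speech
    }
}
```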
Accomplishments that we're proud of
- Hybrid Transcription: Successfully syncing three different speech technologies to work as a unified grading engine.
- On-Device Privacy: Processing significant portions of the body language and voice data directly on the iPad, keeping user data secure.
- Intuitive UI: Creating a "ResultsView" that turns complex data (like pose estimation coordinates) into easy-to-understand metrics for the user.
What we learned
- Computer Vision Nuance: We learned that "confidence" in a presentation can actually be quantified through metrics like shoulder stability and hand-gesture frequency (a toy stability metric is sketched after this list).
- Swift Package Management: Integrating complex external dependencies like WhisperKit taught us a lot about modern iOS build configurations.
- ML Accuracy vs. Latency: We learned when to use on-device models for speed and when to offload to the cloud (AssemblyAI) for maximum linguistic accuracy.
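As an illustration of the shoulder-stability idea, this toy metric treats lower frame-to-frame drift of the shoulder midpoint (in Vision's normalized coordinates) as steadier posture; the scaling constant is an assumption:

```swift
import CoreGraphics

// Map a series of per-frame shoulder midpoints to a 0...1 stability score:
// the less the midpoint drifts between frames, the higher the score.
func shoulderStability(midpoints: [CGPoint]) -> Double {
    guard midpoints.count > 1 else { return 1.0 }
    // Mean frame-to-frame displacement of the shoulder midpoint.
    var total = 0.0
    for (a, b) in zip(midpoints, midpoints.dropFirst()) {
        total += Double(hypot(b.x - a.x, b.y - a.y))
    }
    let meanDrift = total / Double(midpoints.count - 1)
    // Scale drift into a 0...1 score; the factor of 20 is illustrative.
    return max(0, 1.0 - meanDrift * 20)
}
```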
What's next for Peak Performance
- Historical Tracking: A dashboard to show how a user’s "Filler Word" count decreases over months of practice.
- Live Nudges: Implementing haptic feedback or visual cues that subtly alert the presenter in real time if they are speaking too fast or looking away from the camera (a minimal sketch below).
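A minimal sketch of what a pace nudge could look like, assuming a rolling WPM estimate is already available; the 170 WPM threshold and rate limit are illustrative, and haptic hardware support varies across iPad models:

```swift
import UIKit

// Fires a gentle haptic tap when the speaker's rolling pace is too fast,
// rate-limited so the nudge stays subtle rather than nagging.
final class PaceNudger {
    private let generator = UIImpactFeedbackGenerator(style: .light)
    private var lastNudge = Date.distantPast

    func check(currentWPM: Double) {
        guard currentWPM > 170,
              Date().timeIntervalSince(lastNudge) > 10 else { return }
        generator.impactOccurred()
        lastNudge = Date()
    }
}
```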