Data Generation Pipeline
For each collected exoskeleton demonstration, we execute the following data processing pipeline:
# Data Generation Pipeline for DexUMI
for demo_video in exoskeleton_demonstrations:
    # Step 0: Synchronize multi-modal data streams
    sync_data_sources(wrist_camera, encoder_readings, wrist_pose,
                      tactile_sensor)
    # Step 1: Record robot hand video by replaying actions
    robot_video = replay_on_robot(encoder_readings)
    # Step 2: Resize exoskeleton & robot hand videos
    # Step 3: Segment hands from both video streams
    exo_mask = segment_exoskeleton(demo_video)
    robot_mask = segment_robot_hand(robot_video)
    # Step 4: Remove exoskeleton and inpaint background
    clean_background = inpaint_video(demo_video, exo_mask)
    # Step 5: Composite final high-fidelity manipulation video
    final_video = composite_videos(clean_background, robot_video,
                                   robot_mask, exo_mask)
    save_training_data(final_video, action_labels)
This pipeline transforms raw exoskeleton demonstrations into high-quality training data by removing the human operator and exoskeleton hardware while preserving the natural occlusion between the hand and the object.
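The Step 5 compositing is essentially a per-frame, mask-based blend: robot-hand pixels are pasted over the inpainted background wherever the robot mask is active. A minimal numpy sketch (function name and array layout are assumptions, not the DexUMI implementation; the real pipeline also consults `exo_mask` so that object occlusions recorded in the demonstration carry over):

```python
import numpy as np

def composite_frame(clean_background, robot_frame, robot_mask):
    """Paste the robot hand onto the inpainted background.

    clean_background, robot_frame: (H, W, 3) uint8 images
    robot_mask: (H, W) boolean mask of the robot hand
    """
    # Where the robot-hand mask is true, take robot pixels;
    # elsewhere keep the inpainted background.
    return np.where(robot_mask[..., None], robot_frame, clean_background)

# Toy example: a 2x2 image where only the top-left pixel is "hand".
bg = np.zeros((2, 2, 3), dtype=np.uint8)
hand = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[True, False], [False, False]])
out = composite_frame(bg, hand, mask)
```

Running this over every synchronized frame pair yields the final manipulation video.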
cd DexUMI/real_script/data_generation_pipeline
# Sync the different data sources and replay the exoskeleton actions on the robot hand.
# This covers Step 0 and Step 1 of the data generation pipeline pseudocode.
# Modify the DATA_DIR, TARGET_DIR and REFERENCE_DIR before running
./process.sh
# Run data processing pipeline
# This covers Steps 1, 2, 3, 4, and 5 of the data generation pipeline pseudocode.
# Modify the config/render/render_all_dataset.yaml before running
python render_all_dataset.py
# Generate the final training data
python 6_generate_dataset.py -d path/to/data_replay -t path/to/final_dataset --force-process total --force-adjust
Segmentation Setup (complete this before actual data collection/generation)
To achieve automatic segmentation of the exoskeleton and robot hand, you need to configure prompt
points before starting data collection and processing. This is a one-time setup process.
Follow these steps to complete the setup:
- Collect Reference Episode: Wear the exoskeleton and collect one initial
episode. Ensure your hand and exoskeleton are clearly visible in the first few frames,
with the hand in a fully open and comfortable pose. This episode will serve as your
reference for all future data collection.
- Generate Robot Replay: Replay the collected episode on the robot hand
to create the corresponding robot hand video.
- Create Segmentation Prompts: Set up prompt points for both the
exoskeleton and robot hand segmentation. Save these prompt points to the reference
episode for consistent use across all future collections.
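Prompt points are typically stored as pixel coordinates plus a positive/negative label (the convention SAM-style predictors use: 1 = include, 0 = exclude). The steps above can be sketched as a simple save/load round trip attached to the reference episode; the file name `prompt_points.json` and the JSON layout here are assumptions for illustration, not the repository's actual format:

```python
import json
import tempfile
from pathlib import Path

def save_prompt_points(reference_dir, exo_points, robot_points):
    """Save prompt points for exoskeleton and robot-hand segmentation.

    Each entry is ((x, y), label) with label 1 = positive, 0 = negative.
    """
    payload = {
        "exoskeleton": [{"xy": list(xy), "label": lbl} for xy, lbl in exo_points],
        "robot_hand": [{"xy": list(xy), "label": lbl} for xy, lbl in robot_points],
    }
    path = Path(reference_dir) / "prompt_points.json"
    path.write_text(json.dumps(payload, indent=2))
    return path

def load_prompt_points(reference_dir):
    """Load the prompt points saved alongside the reference episode."""
    return json.loads((Path(reference_dir) / "prompt_points.json").read_text())

# Round-trip demo in a temporary directory standing in for the reference episode.
demo_dir = tempfile.mkdtemp()
save_prompt_points(demo_dir, [((120, 80), 1), ((40, 200), 0)], [((64, 64), 1)])
loaded = load_prompt_points(demo_dir)
```

Because the points live with the reference episode, every later collection reuses the same prompts without re-clicking them.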
Check the video below for detailed instructions on setting up prompt points and configuring the reference episode.
Tips for Better Segmentation Results:
- Color consistency: Wear gloves that match the exoskeleton color to
improve detection accuracy
- Prompt point optimization: Experiment with different positive and
negative prompt points, as results can vary significantly based on placement
- Sparse prompting: Use fewer, well-placed prompt points rather than
dense coverage for better results
- Background exclusion: Place negative prompt points on background
regions to prevent SAM2 from including unwanted areas
- Region-based segmentation: Divide the exoskeleton/robot hand into
separate regions (thumb, fingers, pinky) with dedicated prompt points for each, then
combine masks later
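The region-based tip above amounts to a logical OR over the per-region masks. A small numpy sketch (the per-region masks would each come from their own SAM2 prompt-point set):

```python
import numpy as np

def combine_region_masks(masks):
    """OR together per-region masks (thumb, fingers, pinky, ...) into one."""
    combined = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        combined |= m.astype(bool)
    return combined

# Toy 1x4 example: two regions covering different pixels.
thumb = np.array([[1, 0, 0, 0]])
fingers = np.array([[0, 1, 1, 0]])
full = combine_region_masks([thumb, fingers])
```

Segmenting small regions independently keeps each prompt-point set simple, while the OR recovers the full hand mask.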
Data Collection Guide
We visualize the prompt points on the wrist camera image. Make sure to adjust your hand and exoskeleton so that they fully cover the prompt points. We also visualize the current encoder reading on the image; the text turns red when the encoder reading is not aligned with the prompt points.
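The red/green feedback can be reproduced with a simple per-joint tolerance check against the reference pose; this is a sketch only, and the tolerance value and (R, G, B) colors are assumptions, not the values `record_exoskeleton.py` actually uses:

```python
def encoder_text_color(current, reference, tol=0.05):
    """Return an (R, G, B) text color: green when every joint's encoder
    reading is within `tol` of the reference pose, red otherwise."""
    aligned = all(abs(c - r) <= tol for c, r in zip(current, reference))
    return (0, 255, 0) if aligned else (255, 0, 0)

reference_pose = [0.10, 0.20, 0.30]
color_ok = encoder_text_color([0.11, 0.19, 0.30], reference_pose)
color_bad = encoder_text_color([0.50, 0.20, 0.30], reference_pose)
```

The returned color would then be passed to the text-drawing call when rendering the encoder reading onto the wrist camera preview.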
cd DexUMI/real_script/data_collection/
# If you do not have a force sensor installed, simply omit the -ef flag.
# Create REFERENCE_DIR before running
python record_exoskeleton.py -et -ef --fps 45 --reference-dir /path/to/reference_folder --hand_type xhand/inspire --data-dir /path/to/data