Episode Augmentation Tutorial¶
Learn how to apply VisionPack augmentations to specific episodes with camera selection and synchronized views.
Overview¶
Episode augmentation allows you to:
- Target specific episodes by name or index
- Select camera streams for multi-camera setups
- Apply synchronized augmentations across cameras
- Backup and restore original data safely
- Maintain dataset format for seamless integration
Quick Start¶
Basic Episode Augmentation¶
# List available episodes
dataphy augment dataset --dataset-path ./dataset --list-episodes
# Augment first episode, specific camera
dataphy augment dataset \
--dataset-path ./dataset \
--config aug.yaml \
--episode 0 \
--cameras observation.images.webcam
# Augment all cameras (synchronized)
dataphy augment dataset \
--dataset-path ./dataset \
--config aug.yaml \
--episode 0
Configuration¶
Basic Configuration¶
Create aug.yaml:
version: 1
pipeline:
  # Synchronize augmentations across all cameras
  sync_views: true
  steps:
    - name: random_crop_pad
      keep_ratio_min: 0.88 # Preserve at least 88% of image area
    - name: color_jitter
      magnitude: 0.15 # 15% color variation
    - name: cutout
      holes: 1
      size_range: [8, 16] # Small occlusion patches
background:
  adapter: none
seed: 42
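To make the `cutout` parameters concrete, here is a minimal pure-Python sketch of what a step with `holes` and `size_range` might do to a single frame. It is an illustration under assumptions (square patches, zero fill, sizes in pixels, inclusive bounds), not VisionPack's implementation, and `apply_cutout` is a hypothetical helper.

```python
import random


def apply_cutout(image, holes, size_range, seed=None):
    """Return a copy of `image` (rows of pixel values) with square patches zeroed."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input frame untouched
    for _ in range(holes):
        # Assumed semantics: size_range is [min, max] in pixels, both inclusive.
        size = rng.randint(size_range[0], size_range[1])
        size = min(size, height, width)
        top = rng.randint(0, height - size)
        left = rng.randint(0, width - size)
        for y in range(top, top + size):
            for x in range(left, left + size):
                out[y][x] = 0
    return out


# A 64x48 all-white frame with one occlusion patch, as in aug.yaml above.
frame = [[255] * 64 for _ in range(48)]
occluded = apply_cutout(frame, holes=1, size_range=(8, 16), seed=42)
masked = sum(value == 0 for row in occluded for value in row)
print(f"{masked} of {48 * 64} pixels occluded")
```

With one hole and `size_range: [8, 16]`, between 64 and 256 pixels are masked, which is a small fraction of even a low-resolution frame.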
Available Transforms¶
| Transform | Purpose | Key Parameters |
|---|---|---|
| random_crop_pad | Spatial variation | keep_ratio_min (0.0-1.0) |
| random_translate | Position shifts | px (pixels) |
| color_jitter | Lighting changes | magnitude (0.0-1.0) |
| random_conv | Texture effects | kernel_variance (0.0-1.0) |
| cutout | Occlusion simulation | holes, size_range |
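The parameter ranges in the table above can be checked before running an augmentation. The sketch below is a hypothetical pre-flight check, not part of dataphy: `check_steps` and `UNIT_RANGE_PARAMS` are illustrative names, and the rule that `size_range` is ordered as `[min, max]` is an assumption based on the examples in this tutorial.

```python
# Parameters documented above as lying in 0.0-1.0.
UNIT_RANGE_PARAMS = {
    "random_crop_pad": "keep_ratio_min",
    "color_jitter": "magnitude",
    "random_conv": "kernel_variance",
}


def check_steps(steps):
    """Return a list of human-readable problems found in pipeline steps."""
    problems = []
    for index, step in enumerate(steps):
        name = step.get("name")
        param = UNIT_RANGE_PARAMS.get(name)
        if param is not None and param in step:
            value = step[param]
            if not 0.0 <= value <= 1.0:
                problems.append(
                    f"step {index} ({name}): {param}={value} is outside 0.0-1.0"
                )
        if name == "cutout" and "size_range" in step:
            low, high = step["size_range"]
            if low > high:  # assumed ordering: [min, max]
                problems.append(
                    f"step {index} (cutout): size_range [{low}, {high}] is reversed"
                )
    return problems


steps = [
    {"name": "random_crop_pad", "keep_ratio_min": 0.88},
    {"name": "color_jitter", "magnitude": 1.5},  # deliberately out of range
]
problems = check_steps(steps)
print(problems)
```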
Domain Randomization Transforms¶
For advanced realism, use domain randomization transforms:
| Transform | Purpose | Key Parameters |
|---|---|---|
| lighting_rand | Realistic lighting simulation | ambient_tint, directional_intensity |
| camera_intrinsics_jitter | Camera calibration variations | fx_jitter, cx_jitter_px |
| camera_extrinsics_jitter | Camera mounting uncertainties | rot_deg, transl_px |
| rgb_sensor_noise | Realistic sensor noise | shot_k, read_sigma, iso_range |
Advanced Usage¶
Episode Selection¶
# By index (0-based)
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode 0
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode 5
# By name
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode episode_000000
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode episode_000005
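The two selection styles can be pictured as a single lookup. The sketch below is a hypothetical helper (`resolve_episode` is not a dataphy function) that assumes an index refers to the position of an episode in the listed order, which matches the examples above where index 5 corresponds to episode_000005.

```python
def resolve_episode(episode, available):
    """Map a 0-based index or an episode name to a name from `available`."""
    if isinstance(episode, int) or str(episode).isdigit():
        index = int(episode)
        if not 0 <= index < len(available):
            raise ValueError(
                f"episode index {index} out of range (0-{len(available) - 1})"
            )
        return available[index]
    if episode not in available:
        raise ValueError(f"episode {episode!r} not found")
    return episode


# Illustrative episode list following the naming used in this tutorial.
episodes = [f"episode_{i:06d}" for i in range(10)]
print(resolve_episode(5, episodes))
print(resolve_episode("episode_000005", episodes))
```

Both calls print the same episode name. Raising an error for an unknown index or name, rather than silently falling back, makes the failure easy to spot.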
Camera Selection¶
# Single camera
dataphy augment dataset \
--dataset-path ./dataset \
--config aug.yaml \
--episode 0 \
--cameras observation.images.webcam
# Multiple cameras
dataphy augment dataset \
--dataset-path ./dataset \
--config aug.yaml \
--episode 0 \
--cameras "observation.images.webcam,observation.images.laptop"
# All cameras (default)
dataphy augment dataset \
--dataset-path ./dataset \
--config aug.yaml \
--episode 0
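As a rough illustration of how a comma-separated `--cameras` value could be interpreted, the sketch below splits the string and checks each name against the cameras an episode provides. `parse_cameras` is a hypothetical helper, and treating an empty value as "all cameras" mirrors the default described above rather than any confirmed internal behavior.

```python
def parse_cameras(value, available):
    """Split a comma-separated camera list and validate it against `available`."""
    if not value:
        return list(available)  # no selection means every camera
    requested = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"unknown cameras: {', '.join(unknown)}")
    return requested


available = ["observation.images.webcam", "observation.images.laptop"]
print(parse_cameras("observation.images.webcam,observation.images.laptop", available))
print(parse_cameras(None, available))
```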
Synchronized Views¶
When sync_views: true, all cameras receive identical augmentation parameters:
pipeline:
  sync_views: true # Same crop, color, etc. across cameras
  steps:
    - name: random_crop_pad
      keep_ratio_min: 0.85
Benefits:
- Maintains spatial relationships between cameras
- Preserves multi-view geometry for stereo/3D tasks
- Consistent lighting changes across viewpoints
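The difference between synchronized and independent views comes down to how often the random parameters are drawn. The sketch below illustrates the idea for a crop step. It is not VisionPack's sampling code: `sample_crop` and `params_per_camera` are hypothetical names, and the parameter layout (kept fraction plus normalized offsets) is an assumption made for the example.

```python
import random


def sample_crop(rng, keep_ratio_min):
    """Draw one set of crop parameters: kept fraction plus a normalized offset."""
    keep = rng.uniform(keep_ratio_min, 1.0)
    return {
        "keep_ratio": keep,
        "offset_x": rng.uniform(0.0, 1.0 - keep),
        "offset_y": rng.uniform(0.0, 1.0 - keep),
    }


def params_per_camera(cameras, keep_ratio_min, sync_views, seed):
    """Return crop parameters for each camera, shared or independent."""
    rng = random.Random(seed)
    if sync_views:
        shared = sample_crop(rng, keep_ratio_min)  # drawn once, reused everywhere
        return {camera: dict(shared) for camera in cameras}
    return {camera: sample_crop(rng, keep_ratio_min) for camera in cameras}


cameras = ["observation.images.webcam", "observation.images.laptop"]
synced = params_per_camera(cameras, 0.85, sync_views=True, seed=42)
independent = params_per_camera(cameras, 0.85, sync_views=False, seed=42)
print("synced identical:", synced[cameras[0]] == synced[cameras[1]])
print("independent identical:", independent[cameras[0]] == independent[cameras[1]])
```

With synchronized views every camera sees the same crop window, so a point visible in both views stays in a consistent relative position.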
Configuration Examples¶
Gentle Augmentation (Recommended)¶
For initial training and fine-tuning:
version: 1
pipeline:
  sync_views: true
  steps:
    - name: random_crop_pad
      keep_ratio_min: 0.95 # Minimal cropping
    - name: color_jitter
      magnitude: 0.05 # Subtle color changes
    - name: cutout
      holes: 1
      size_range: [4, 8] # Small patches
background:
  adapter: none
seed: 42
Domain Randomization (Advanced)¶
For maximum realism and robustness:
version: 1
pipeline:
  sync_views: true
  steps:
    # Standard transforms
    - name: random_crop_pad
      keep_ratio_min: 0.88
    - name: color_jitter
      magnitude: 0.15
    # Domain randomization
    - name: lighting_rand
      p: 0.6
      ambient_tint: [0.95, 1.05]
      directional_intensity: [0.0, 0.4]
      preserve_robot_color: true
    - name: camera_intrinsics_jitter
      p: 0.5
      fx_jitter: [0.98, 1.02]
      cx_jitter_px: [-4, 4]
      update_intrinsics: true
    - name: rgb_sensor_noise
      p: 0.4
      shot_k: [0.5, 1.5]
      read_sigma: [0.002, 0.01]
      iso_range: [100, 800]
background:
  adapter: none
seed: 42
Moderate Augmentation (Balanced)¶
For standard training:
version: 1
pipeline:
  sync_views: true
  steps:
    - name: random_crop_pad
      keep_ratio_min: 0.88
    - name: random_translate
      px: 8
    - name: color_jitter
      magnitude: 0.15
    - name: random_conv
      kernel_variance: 0.035
    - name: cutout
      holes: 1
      size_range: [8, 16]
background:
  adapter: none
seed: 42
Aggressive Augmentation (High Diversity)¶
For robust training or limited data:
version: 1
pipeline:
  sync_views: true
  steps:
    - name: random_crop_pad
      keep_ratio_min: 0.75 # Significant cropping
    - name: random_translate
      px: 15 # Large shifts
    - name: color_jitter
      magnitude: 0.30 # Strong color changes
    - name: random_conv
      kernel_variance: 0.08 # Noticeable texture
    - name: cutout
      holes: 2
      size_range: [20, 40] # Large patches
background:
  adapter: none
seed: 42
Backup and Restore¶
Automatic Backups¶
By default, original episodes are backed up:
# Creates backup automatically
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode 0
# Skip backup (not recommended)
dataphy augment dataset --dataset-path ./dataset --config aug.yaml --episode 0 --no-backup
Restore Operations¶
# Restore specific episode
dataphy augment dataset --dataset-path ./dataset --restore episode_000000
# Check backup status
dataphy augment dataset --dataset-path ./dataset --list-episodes
Backup Structure:
dataset/
├── videos/                 # Current (augmented) videos
├── data/                   # Current data
└── backups/                # Backup directory
    └── episode_000000/     # Backed up episode
        ├── videos/         # Original videos
        └── data/           # Original data
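Given the backups/ layout shown above, a few lines of standard-library Python can report which episodes have a backup and how much space each one uses. This is a sketch that only assumes the directory structure from this section. `backup_sizes` is a hypothetical helper, not a dataphy function.

```python
from pathlib import Path


def backup_sizes(dataset_root):
    """Return {episode_name: size_in_bytes} for each directory under backups/."""
    backups_dir = Path(dataset_root) / "backups"
    if not backups_dir.is_dir():
        return {}  # nothing has been backed up yet
    sizes = {}
    for episode_dir in sorted(backups_dir.iterdir()):
        if episode_dir.is_dir():
            sizes[episode_dir.name] = sum(
                path.stat().st_size
                for path in episode_dir.rglob("*")
                if path.is_file()
            )
    return sizes


for name, size in backup_sizes("./dataset").items():
    print(f"{name}: {size / 1e6:.1f} MB")
```

Running this before a large batch gives a quick view of how much disk space the existing backups already occupy.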
Programmatic Usage¶
Python API¶
from dataphy.dataset.registry import create_dataset_loader, DatasetFormat
from dataphy.dataset.episode_augmentor import EpisodeAugmentor
# Setup
loader = create_dataset_loader("./dataset", DatasetFormat.LEROBOT)
augmentor = EpisodeAugmentor(loader)
# List episodes and cameras
episodes = augmentor.list_episodes()
cameras = augmentor.get_available_cameras("episode_000000")
# Augment episode
augmentor.augment_episode(
    episode_id=0,  # Can use index or name
    config_file="aug.yaml",
    camera_streams=["observation.images.webcam"],
    preserve_original=True
)
# Restore if needed
augmentor.restore_episode("episode_000000")
Batch Processing¶
# Augment multiple episodes
episodes_to_augment = [0, 1, 2, 5, 10]
for episode_idx in episodes_to_augment:
    print(f"Augmenting episode {episode_idx}")
    augmentor.augment_episode(
        episode_id=episode_idx,
        config_file="aug.yaml",
        preserve_original=True
    )
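The loop above stops at the first episode that raises an error. One possible variation, sketched here with a hypothetical `augment_many` helper, keeps going and collects failures for later inspection. The augmentation call is passed in as a function so that the sketch runs without a dataset; `fake_augment` is a stand-in, not a dataphy API.

```python
def augment_many(augment_fn, episode_ids):
    """Call `augment_fn(episode_id)` for each id; return (succeeded, failed)."""
    succeeded, failed = [], {}
    for episode_id in episode_ids:
        try:
            augment_fn(episode_id)
        except Exception as exc:  # one bad episode should not stop the batch
            failed[episode_id] = str(exc)
        else:
            succeeded.append(episode_id)
    return succeeded, failed


# Stand-in for the real call so the sketch runs on its own.
def fake_augment(episode_id):
    if episode_id == 5:
        raise ValueError("episode not found")


done, errors = augment_many(fake_augment, [0, 1, 2, 5, 10])
print(done, errors)  # [0, 1, 2, 10] {5: 'episode not found'}
```

With the `augmentor` from the Python API section, the function argument would be something like `lambda ep: augmentor.augment_episode(episode_id=ep, config_file="aug.yaml", preserve_original=True)`.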
Best Practices¶
Parameter Tuning¶
- Start conservative: Begin with gentle parameters
- Test visually: Use dataphy dataset visualize to check results
- Preserve key features: Ensure robot/object visibility
- Consider task requirements: Navigation vs manipulation have different needs
Multi-Camera Considerations¶
- Always use sync_views: true for multi-camera setups
- Test all camera angles to ensure quality
- Consider stereo geometry when using spatial transforms
- Validate depth perception if using depth cameras
Performance Tips¶
- Augment selectively: Not all episodes need augmentation
- Batch operations: Process multiple episodes in sequence
- Monitor disk space: Each backup keeps a full copy of the original episode, roughly doubling its storage footprint
- Use appropriate seeds: Different seeds for different augmentation passes
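One way to follow the seed advice is to derive a config variant per augmentation pass from a single base file. The sketch below rewrites the `seed:` line of a config held as text. `with_seed` is a hypothetical helper, and it assumes `seed` is a top-level key written on its own line without a trailing comment.

```python
import re


def with_seed(config_text, seed):
    """Return `config_text` with its top-level `seed:` line replaced."""
    pattern = re.compile(r"^seed:[ \t]*\d+[ \t]*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(f"seed: {seed}", config_text)
    if count != 1:
        raise ValueError(f"expected exactly one top-level seed line, found {count}")
    return new_text


base = "version: 1\npipeline:\n  sync_views: true\nseed: 42\n"
for pass_index in range(3):
    variant = with_seed(base, 42 + pass_index)
    print(variant.splitlines()[-1])
```

Each variant can then be written to its own file and passed to `--config`, so every pass draws different augmentation parameters while remaining reproducible.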
Robotics-Specific Guidelines¶
- Preserve spatial relationships: Keep robot-object relative positions
- Maintain visual landmarks: Don't occlude important reference points
- Consider sensor noise: Add realistic noise, not artificial artifacts
- Test with your model: Validate that augmented data improves performance
Troubleshooting¶
Common Issues¶
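Episode not found:
# List available episodes to confirm the name or index
dataphy augment dataset --dataset-path ./dataset --list-episodes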
Camera not found:
# List available cameras
dataphy augment dataset --dataset-path ./dataset --list-episodes
# Shows cameras for each episode
Config errors:
Out of disk space:
# Check backup sizes
du -sh dataset/backups/
# Remove old backups if needed (the episode can then no longer be restored)
rm -rf dataset/backups/episode_000000
Next Steps¶
- API Reference: Explore all augmentation functions
- Examples: See complete augmentation workflows
- Experiment: Try different parameter combinations for your specific use case