Quick Start Guide
This guide covers the OWA workflow: Record → Process → Train.
Training Pipeline Coming Soon
We developed a complete training pipeline during our D2E research. We're currently preparing it for open-source release—stay tuned!
# 1. Record desktop interaction
$ ocap my-session.mcap
# 2. Process to training format
$ python scripts/01_raw_to_event.py --train-dir ./
# 3. Train your model (coming soon)
$ python train.py --dataset ./event-dataset
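The two commands that are available today can also be driven from a small Python script. The sketch below is only an illustration: the `WORKFLOW` list and the `run_workflow` helper are hypothetical names, not part of OWA, and the command lines are copied verbatim from the overview above. It defaults to a dry run that only prints the commands, and it assumes that `ocap` returns once you stop the recording.

```python
import shlex
import subprocess

# Commands copied from the quick-start overview. Step 3 (training) is left
# out because the training pipeline has not been released yet.
WORKFLOW = [
    ("record", ["ocap", "my-session.mcap"]),
    ("process", ["python", "scripts/01_raw_to_event.py", "--train-dir", "./"]),
]


def run_workflow(steps=WORKFLOW, dry_run=True):
    """Print each command and, unless dry_run is True, execute it in order."""
    rendered = []
    for name, argv in steps:
        line = shlex.join(argv)
        rendered.append(line)
        print(f"[{name}] {line}")
        if not dry_run:
            # check=True aborts the sequence at the first failing step.
            subprocess.run(argv, check=True)
    return rendered


if __name__ == "__main__":
    run_workflow(dry_run=True)
```

Call `run_workflow(dry_run=False)` to actually execute the steps. Passing each command as an argument list rather than a single shell string avoids quoting problems if the session file name ever contains spaces.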
Prerequisites
Before starting, install OWA. See the Installation Guide for details.
Step 1: Record Desktop Interaction
ocap records your desktop in one command:
$ ocap my-session.mcap
This captures screen video (H.265), keyboard and mouse events, window context, and audio, all synchronized with nanosecond precision. See the ocap documentation for options.
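To make the idea of synchronized streams concrete, here is a minimal sketch of how events from several sources can be placed on one timeline when every event carries an integer nanosecond timestamp. It is not ocap's implementation: the `Event` type, the `merge_streams` helper, the topic names, and the sample values are all hypothetical.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp_ns: int  # integer nanoseconds on one shared clock
    topic: str         # hypothetical stream name, e.g. "keyboard"
    payload: str


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda e: e.timestamp_ns)


# Made-up sample data: three streams recorded against the same clock.
keyboard = [
    Event(1_000_000_500, "keyboard", "press A"),
    Event(1_000_900_000, "keyboard", "release A"),
]
mouse = [Event(1_000_000_100, "mouse", "move (10, 20)")]
screen = [
    Event(1_000_000_000, "screen", "frame 0"),
    Event(1_016_666_667, "screen", "frame 1"),
]

timeline = list(merge_streams(keyboard, mouse, screen))
for event in timeline:
    print(event.timestamp_ns, event.topic, event.payload)
```

Timestamps are kept as integers on purpose. A 64-bit float holding Unix-epoch seconds resolves only to a few hundred nanoseconds, so events that are very close together could collapse onto the same value or change order.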
Step 2: Process to Training Format
Transform recorded data into training-ready datasets:
$ python scripts/01_raw_to_event.py --train-dir ./
See owa-data for the full pipeline documentation.
Step 3: Train Your Model
Training Pipeline Coming Soon
We developed a complete training pipeline during our D2E research. We're currently preparing it for open-source release—stay tuned!
Environment Framework
For live agent interactions (not just recording), use OWA's environment framework.
See the Environment Guide for the full API.
Next Steps
| Goal | Resource |
|---|---|
| Browse community data | 🤗 HuggingFace Datasets |
| Visualize recordings | Dataset Visualizer |
| Build agents | Agent Examples |
| Extend OWA | Custom Plugins |
| Get help | FAQ · Contributing |