ocap

High-performance desktop recorder for Windows. Captures screen, audio, keyboard, mouse, and window events.

What is ocap?

ocap (Omnimodal CAPture) captures all essential desktop signals in a synchronized format: screen video, audio, keyboard/mouse input, and window events. It was built for the open-world-agents project but works for any desktop recording need.

TL;DR: Complete, high-performance desktop recording tool for Windows. Captures everything in one command.

Key Features

  • Complete desktop recording: Video, audio, keyboard/mouse events, window events
  • High performance: Hardware-accelerated with Windows APIs and GStreamer
  • Efficient encoding: H265/HEVC for high quality and small file size
  • Simple operation: ocap FILE_LOCATION (stop with Ctrl+C)
  • Clean architecture: Core logic in a single 320-line Python file
  • Modern formats: MKV with embedded timestamps, MCAP format for events

System Requirements

Based on OBS Studio's recommended specs plus NVIDIA GPU requirements:

| Component | Specification |
| --- | --- |
| OS | Windows 11 (64-bit) |
| Processor | Intel i7 8700K / AMD Ryzen 1600X |
| Memory | 8 GB RAM |
| Graphics | NVIDIA GeForce 10 Series or newer ⚠️ |
| DirectX | Version 11 |
| Storage | 600 MB + ~100 MB per minute of recording |
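The storage figure above is simple arithmetic, and a minimal sketch can make it concrete. The function name `estimate_storage_mb` and its parameters are hypothetical and not part of ocap; the defaults come from the table (600 MB base, roughly 100 MB per minute), and real bitrates will vary with resolution and content.

```python
def estimate_storage_mb(
    minutes: float,
    base_mb: float = 600.0,          # fixed footprint from the requirements table
    rate_mb_per_min: float = 100.0,  # approximate recording rate from the table
) -> float:
    """Return a rough disk-space estimate in MB for a recording of `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return base_mb + rate_mb_per_min * minutes


# A 30-minute session: 600 + 100 * 30
print(estimate_storage_mb(30))  # 3600.0
```

Treat the result as a planning number only; the ~100 MB per minute rate is an approximation.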

โš ๏ธ NVIDIA GPU Required: Currently only supports NVIDIA GPUs for hardware acceleration. AMD/Intel GPU support possible through GStreamer framework - contributions welcome!

๐Ÿ–ฅ๏ธ OS Support: Currently only supports Windows. However, support for other operating systems (Linux, macOS) can be relatively easily extended due to the presence of GStreamer. Simply using different GStreamer pipelines can enable capture on other platforms - contributions welcome!

Installation & Usage

Option 1: Download Release

  1. Download ocap.zip from releases
  2. Unzip and run:
    • Double-click run.bat (opens terminal with virtual environment)
    • Or in CLI: run.bat --help

Option 2: Package Install

All OWA packages are available on PyPI:

# Install GStreamer dependencies first (for video recording)
$ conda install open-world-agents::gstreamer-bundle

# Install ocap
$ pip install ocap

Basic Usage

# Start recording (stop with Ctrl+C)
$ ocap my-recording

# Show all options
$ ocap --help

# Advanced options
$ ocap FILENAME --window-name "App"   # Record specific window
$ ocap FILENAME --monitor-idx 1       # Record specific monitor
$ ocap FILENAME --fps 60              # Set framerate
$ ocap FILENAME --no-record-audio     # Disable audio
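When ocap is launched from a script rather than typed by hand, it can help to assemble the argument list in one place. The sketch below is a hypothetical helper (`build_ocap_command` is not part of ocap) that uses only the flags shown above; it builds the command and does not run it.

```python
from typing import List, Optional


def build_ocap_command(
    filename: str,
    window_name: Optional[str] = None,
    monitor_idx: Optional[int] = None,
    fps: Optional[int] = None,
    record_audio: bool = True,
) -> List[str]:
    """Build an argument list for the ocap CLI from the options listed above."""
    cmd = ["ocap", filename]
    if window_name is not None:
        cmd += ["--window-name", window_name]    # record a specific window
    if monitor_idx is not None:
        cmd += ["--monitor-idx", str(monitor_idx)]  # record a specific monitor
    if fps is not None:
        cmd += ["--fps", str(fps)]               # set the framerate
    if not record_audio:
        cmd.append("--no-record-audio")          # disable audio
    return cmd


print(build_ocap_command("my-recording", window_name="App", fps=60, record_audio=False))
# ['ocap', 'my-recording', '--window-name', 'App', '--fps', '60', '--no-record-audio']
```

On a machine with ocap installed, the resulting list could be passed to `subprocess.run`; passing a list instead of a single string avoids quoting problems with window names that contain spaces.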

Output Files

  • .mcap: Event log (keyboard, mouse, windows)
  • .mkv: Video/audio with embedded timestamps

Your recording files will be ready immediately!
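After a session it can be useful to confirm that both files exist and are non-empty. The sketch below assumes that the two outputs share the name passed to ocap, with `.mcap` and `.mkv` appended; that naming is an assumption drawn from the examples above, and both helper functions are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def expected_outputs(recording_name: str) -> Dict[str, Path]:
    """Map each output kind to its assumed path (<name>.mcap and <name>.mkv)."""
    base = Path(recording_name)
    # Append the suffix to the full name so dots inside the name are preserved.
    return {
        "events": base.with_name(base.name + ".mcap"),
        "video": base.with_name(base.name + ".mkv"),
    }


def missing_outputs(recording_name: str) -> List[Path]:
    """Return the expected output files that are absent or empty."""
    return [
        path
        for path in expected_outputs(recording_name).values()
        if not path.is_file() or path.stat().st_size == 0
    ]


# An empty list means both files were found and contain data.
print(missing_outputs("my-recording"))
```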

Feature Comparison

| Feature | ocap | OBS | wcap | pillow/mss |
| --- | --- | --- | --- | --- |
| Advanced data formats (MCAP/MKV) | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Timestamp-aligned logging | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Customizable event definitions & listeners | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Single Python file | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Audio + Window + Keyboard + Mouse | ✅ Yes | ⚠️ Partial | ❌ No | ❌ No |
| Hardware-accelerated encoder | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Supports latest Windows APIs | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No (legacy APIs only) |
| Optional mouse cursor capture | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |

Technical Architecture

Built on GStreamer with clean, maintainable design:

flowchart TD
    %% Input Sources
    A[owa.env.desktop] --> B[Keyboard Events]
    A --> C[Mouse Events] 
    A --> D[Window Events]
    E[owa.env.gst] --> F[Screen Capture]
    E --> G[Audio Capture]

    %% Core Processing
    B --> H[Event Queue]
    C --> H
    D --> H
    F --> H
    F --> I[Video/Audio Pipeline]
    G --> I

    %% Outputs
    H --> J[MCAP Writer]
    I --> K[MKV Pipeline]

    %% Files
    J --> L[📄 events.mcap]
    K --> M[🎥 video.mkv]

    style A fill:#e1f5fe
    style E fill:#e1f5fe
    style H fill:#fff3e0
    style L fill:#e8f5e8
    style M fill:#e8f5e8
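The event half of the diagram (several sources feeding one queue that a single writer drains) is a standard producer/consumer pattern. The sketch below illustrates that pattern with the Python standard library only; it is not ocap's actual implementation, the topic names and payloads are made up, and an in-memory list stands in for the MCAP writer.

```python
import queue
import threading
import time
from typing import Any, Dict, List

_STOP = object()  # sentinel telling the writer to finish


def producer(topic: str, payloads: List[Any], events: "queue.Queue") -> None:
    """Push timestamped events for one source onto the shared queue."""
    for payload in payloads:
        events.put({"topic": topic, "timestamp_ns": time.time_ns(), "payload": payload})


def writer(events: "queue.Queue", sink: List[Dict[str, Any]]) -> None:
    """Drain the queue in arrival order until the stop sentinel is seen."""
    while True:
        item = events.get()
        if item is _STOP:
            break
        sink.append(item)  # a real recorder would write to a file here


def run_demo() -> List[Dict[str, Any]]:
    events: "queue.Queue" = queue.Queue()
    sink: List[Dict[str, Any]] = []
    sources = {  # hypothetical stand-ins for the event sources in the diagram
        "keyboard": ["a", "b"],
        "mouse": [(10, 20), (11, 21)],
        "window": ["App"],
    }
    writer_thread = threading.Thread(target=writer, args=(events, sink))
    writer_thread.start()
    producers = [
        threading.Thread(target=producer, args=(topic, payloads, events))
        for topic, payloads in sources.items()
    ]
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()
    events.put(_STOP)  # all producers are done, so nothing follows the sentinel
    writer_thread.join()
    return sink


recorded = run_demo()
print(len(recorded), sorted({event["topic"] for event in recorded}))
```

Funnelling every source through one queue means a single writer owns the output file, so no locking is needed around writes and events are stored in the order they were enqueued.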

Troubleshooting

  • Recording terminates right after start? Re-run the same command a few times. This is caused by an intermittent GStreamer crash whose root cause is unknown.
  • GStreamer error message box appears on first run? This is a known issue: GStreamer may show error dialogs the first time you run ocap. These messages do not affect recording. Close the dialogs and continue; ocap will function normally.
  • Audio not recording? By default, only audio from the target process is recorded. To change this, manually edit the GStreamer pipeline.
  • Large file sizes? Reduce file size by adjusting the gop-size parameter in the nvd3d11h265enc element. See pipeline.py.
  • Performance tips: Close unnecessary applications before recording, use SSD storage for better write performance, and record to a different drive than your OS drive.

FAQ

  • How much disk space do recordings use? ~100MB per minute for 1080p H265 recording.
  • Can I customize recorded events? Yes. Audio, keyboard, mouse, and window events can be enabled or disabled individually. Because recorder.py is a single 320-line Python script, it is easy to customize.
  • Will ocap slow down my computer? Minimal impact with hardware acceleration. Designed for low overhead.
  • What formats are supported? Video is stored as MKV with H265/HEVC encoding, and events are stored in MCAP for efficient storage and querying. Both can be customized; for example, saving JSONL instead of an MCAP file takes minimal effort by editing recorder.py.

When to Use ocap

  • Agent training: Capture all inputs and outputs for AI training
  • Workflow documentation: Record exact steps with precise timing
  • Performance testing: Low-overhead recording during intensive tasks
  • Complete screen recording: When you need more than just video