
Desktop Environment

Mouse, keyboard, window control, and screen capture for desktop automation.

Installation

pip install owa-env-desktop

Components

| Category | Component | Type | Description |
|----------|-----------|------|-------------|
| Mouse | desktop/mouse.click | Callable | Simulate mouse clicks |
| | desktop/mouse.move | Callable | Move cursor to coordinates |
| | desktop/mouse.position | Callable | Get current mouse position |
| | desktop/mouse.press | Callable | Press mouse button |
| | desktop/mouse.release | Callable | Release mouse button |
| | desktop/mouse.scroll | Callable | Simulate mouse wheel scrolling |
| | desktop/mouse.get_state | Callable | Get current mouse position and buttons |
| | desktop/mouse.get_pointer_ballistics_config | Callable | Get Windows pointer ballistics settings |
| | desktop/mouse | Listener | Monitor mouse events |
| | desktop/mouse_state | Listener | Monitor mouse state changes |
| | desktop/raw_mouse | Listener | Raw mouse input (bypasses acceleration) |
| Keyboard | desktop/keyboard.press | Callable | Press/release keys |
| | desktop/keyboard.type | Callable | Type text strings |
| | desktop/keyboard.press_repeat | Callable | Simulate key auto-repeat |
| | desktop/keyboard.get_keyboard_repeat_timing | Callable | Get Windows keyboard repeat timing |
| | desktop/keyboard | Listener | Monitor keyboard events |
| | desktop/keyboard_state | Listener | Monitor keyboard state changes |
| Screen | desktop/screen.capture | Callable | Capture screen (basic) |
| Window | desktop/window.get_active_window | Callable | Get active window info |
| | desktop/window.get_window_by_title | Callable | Find window by title |
| | desktop/window.get_pid_by_title | Callable | Get process ID by window title |
| | desktop/window.when_active | Callable | Wait until window becomes active |
| | desktop/window.is_active | Callable | Check if window is active |
| | desktop/window.make_active | Callable | Activate/focus window |
| | desktop/window | Listener | Monitor window events |

Performance Note

For high-performance screen capture, use GStreamer Environment instead (6x faster).

Usage Examples

from owa.core import CALLABLES

# Click and move
CALLABLES["desktop/mouse.click"]("left", 2)  # Double-click
CALLABLES["desktop/mouse.move"](100, 200)

# Get position
x, y = CALLABLES["desktop/mouse.position"]()
print(f"Mouse at: {x}, {y}")

from owa.core import CALLABLES

# Type text
CALLABLES["desktop/keyboard.type"]("Hello World!")

# Press keys
CALLABLES["desktop/keyboard.press"]("ctrl+c")

# Auto-repeat (hold key)
CALLABLES["desktop/keyboard.press_repeat"]("space", press_time=2.0)

from owa.core import LISTENERS
from owa.msgs.desktop.keyboard import KeyboardEvent

def on_key(event: KeyboardEvent):
    print(f"Key {event.event_type}: {event.vk}")

def on_mouse(event):
    print(f"Mouse: {event.event_type} at {event.x}, {event.y}")

# Monitor events
with LISTENERS["desktop/keyboard"]().configure(callback=on_key).session:
    with LISTENERS["desktop/mouse"]().configure(callback=on_mouse).session:
        input("Press Enter to stop monitoring...")

from owa.core import CALLABLES

# Get window information
active = CALLABLES["desktop/window.get_active_window"]()
print(f"Active window: {active}")

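# Find specific window (get_window_by_title raises ValueError when nothing matches)
try:
    window = CALLABLES["desktop/window.get_window_by_title"]("Notepad")
    print(f"Found Notepad: {window}")
except ValueError:
    print("No Notepad window found")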

Technical Details

Library Selection Rationale

This module uses pynput for input simulation. It was chosen after evaluating several alternatives:

  • Why not PyAutoGUI? Though widely used, PyAutoGUI uses deprecated Windows APIs (keybd_event/mouse_event) rather than the modern SendInput method. These older APIs fail in DirectX applications and games. Additionally, PyAutoGUI has seen limited maintenance (last significant update was over 2 years ago).

  • Alternative Solutions: Libraries like pydirectinput and pydirectinput_rgx address the Windows API issue by using SendInput, but they lack input capturing capabilities, which are essential for our use case.

  • Other Options: We also evaluated keyboard and mouse libraries but found them inadequately maintained with several unresolved bugs that could impact reliability.

Raw Mouse Input

Raw mouse input capture separates physical mouse movement from a game's cursor center-locking and from user interactions. It provides unfiltered movement data directly from the hardware, bypassing Windows pointer acceleration and game cursor manipulation.

Key Auto-Repeat Functionality

Key auto-repeat is a Windows feature where holding down a key generates multiple key events after an initial delay. When a user presses and holds a key, Windows first waits for the repeat delay period, then generates repeated WM_KEYDOWN messages at intervals determined by the repeat rate.

How Windows Auto-Repeat Works

  1. Initial Key Press: First WM_KEYDOWN message is sent immediately with repeat count = 1
  2. Repeat Delay: System waits for the configured delay (typically 250-1000ms)
  3. Repeated Events: Additional WM_KEYDOWN messages are sent at the repeat rate interval (typically about 33ms)
  4. Repeat Count: Each repeated message includes an incremented repeat count in the message parameters
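The steps above can be sketched as a small timeline generator. This is an illustrative model only, not OWA or Windows code: the function name `auto_repeat_timeline` is hypothetical, and the 500 ms delay and 33 ms interval are assumed values chosen to match the typical settings described here.

```python
def auto_repeat_timeline(hold_ms: int, delay_ms: int = 500, interval_ms: int = 33) -> list[int]:
    """Return the times (ms since key down) at which key-down events would fire.

    Hypothetical helper that models the steps above with integer milliseconds.
    """
    events = [0]  # step 1: the first key-down fires immediately
    t = delay_ms  # step 2: nothing happens until the repeat delay has elapsed
    while t < hold_ms:
        events.append(t)  # step 3: one repeated key-down per interval
        t += interval_ms
    return events


# Holding a key for 600 ms yields the initial event plus four repeats.
print(auto_repeat_timeline(600))
# A key released before the delay elapses produces only the initial event.
print(auto_repeat_timeline(400))
```

Integer milliseconds are used here so that the model is free of floating-point rounding at interval boundaries.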

System Configuration: Windows allows users to configure auto-repeat behavior through:

  • Repeat Delay: Time before auto-repeat begins (0-3 scale, maps to 250ms-1000ms, default: 500ms)
  • Repeat Rate: Frequency of repeated characters (0-31 scale, maps to roughly 2.5-30 repetitions per second, i.e. intervals from about 400ms down to about 33ms; default: about 33ms)

These settings can be accessed programmatically via SystemParametersInfo with SPI_GETKEYBOARDDELAY and SPI_GETKEYBOARDSPEED parameters.

References:

  • Keyboard Repeat Delay and Repeat Rate - Microsoft documentation on keyboard repeat behavior
  • SystemParametersInfo Function - Windows API for keyboard repeat parameters
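As a rough illustration of how the raw 0-3 and 0-31 values translate into timings, the sketch below applies the same linear interpolation used in the get_keyboard_repeat_timing source listed in the API Reference on this page. The function name `repeat_settings_to_seconds` and the range checks are assumptions added for the example; the interpolation itself is an approximation, not an exact Windows specification.

```python
def repeat_settings_to_seconds(raw_delay: int, raw_speed: int) -> tuple[float, float]:
    """Convert raw keyboard repeat settings to (delay_seconds, interval_seconds).

    Hypothetical helper; the formulas follow the source listing on this page.
    """
    if not 0 <= raw_delay <= 3:
        raise ValueError("raw_delay must be in 0..3")
    if not 0 <= raw_speed <= 31:
        raise ValueError("raw_speed must be in 0..31")
    # Delay: 0 -> 250 ms, 1 -> 500 ms, 2 -> 750 ms, 3 -> 1000 ms (approximately)
    delay_seconds = 0.25 + raw_delay * 0.25
    # Speed: 0 -> ~2.5 repetitions/s, 31 -> ~30 repetitions/s, interpolated linearly
    repetitions_per_sec = 2.5 + raw_speed * 27.5 / 31
    return delay_seconds, 1.0 / repetitions_per_sec


delay, interval = repeat_settings_to_seconds(raw_delay=1, raw_speed=31)
print(f"delay={delay:.3f}s interval={interval:.4f}s")
```

On Windows, the raw inputs would come from SystemParametersInfo; this sketch takes them as plain integers so it can run anywhere.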

Using OWA's press_repeat Function

For simulating key auto-repeat behavior, use the dedicated function:

CALLABLES["desktop/keyboard.press_repeat"](key, press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033)

Parameters:

  • key: The key to press and repeat
  • press_time: Total duration to hold the key (seconds)
  • initial_delay: Time before repeating starts (default: 0.5s, matches Windows default)
  • repeat_delay: Interval between repeated keypresses (default: 0.033s = 33ms, close to the Windows default of about 30 repetitions per second)
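To get a feel for how these parameters interact, the following sketch reproduces the repeat-count arithmetic from the press_repeat_key source listed in the API Reference on this page, without sending any input. The helper names `repeat_press_count` and `nominal_hold_seconds` are hypothetical, and the duration ignores scheduling overhead.

```python
def repeat_press_count(press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033) -> int:
    """Number of repeated presses sent after the initial press (hypothetical helper)."""
    # Same expression as in the source listing: floor division, minus one, clamped at zero.
    return int(max(0, (press_time - initial_delay) // repeat_delay - 1))


def nominal_hold_seconds(press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033) -> float:
    """Total time spent sleeping before the key is released, ignoring overhead."""
    return initial_delay + repeat_press_count(press_time, initial_delay, repeat_delay) * repeat_delay


print(repeat_press_count(2.0))    # repeats for a 2-second hold with default timing
print(nominal_hold_seconds(2.0))  # slightly under the requested press_time
print(repeat_press_count(0.2))    # shorter than initial_delay: no repeats
```

Note that when press_time is shorter than initial_delay, the listed implementation still sleeps for the full initial_delay before releasing the key.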

Differences from True Windows Auto-Repeat

The press_repeat function approximates Windows auto-repeat behavior but isn't identical:

OS Auto-Repeat vs OWA Implementation:

  • OS Auto-Repeat: WM_KEYDOWN messages carry the previous key state flag (bit 30, set on repeated messages) and a repeat count
  • OWA Implementation: Multiple WM_KEYDOWN messages without repeat flags (each appears as an individual key press)

The difference is small and commonly ignored by applications, making this approach effective for most automation scenarios.

Why the difference exists: Windows provides repeat detection through WM_KEYDOWN message parameters, but pynput does not expose these Windows-specific details. Since the primary use case is triggering repeat behavior rather than detecting it, this limitation doesn't affect the functionality.

Reference: WM_KEYDOWN Message - Official Windows documentation for key press events and message parameters

Technical Details: Windows Repeat Count Behavior

The WM_KEYDOWN repeat count (bits 0-15) behaves differently than many developers expect:

  • Not cumulative: Each message contains the repeat count since the last processed WM_KEYDOWN, not a running total
  • Usually 1: In typical applications with fast message processing, the repeat count is almost always 1
  • Higher values possible: Only occurs when message processing is slow enough for multiple repeats to queue up

Example: If you hold a key and your message loop processes messages quickly, you'll receive multiple WM_KEYDOWN messages each with repeat count = 1. Only when processing is delayed (e.g., by adding Sleep(1000) in the handler) will you see higher repeat counts like 20-30.

This design allows responsive applications to process key events immediately rather than waiting for the key release.

Reference: WM_KEYDOWN repeat count behavior explained - Stack Overflow discussion with practical examples
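The bit fields discussed in this section can be decoded with plain integer arithmetic. The sketch below assumes the documented WM_KEYDOWN lParam layout (repeat count in bits 0-15, scan code in bits 16-23, extended-key flag in bit 24, previous key state in bit 30); the function name `decode_keydown_lparam` and the sample values are illustrative only.

```python
def decode_keydown_lparam(lparam: int) -> dict:
    """Split a WM_KEYDOWN lParam into the fields relevant to auto-repeat (hypothetical helper)."""
    return {
        "repeat_count": lparam & 0xFFFF,          # bits 0-15
        "scan_code": (lparam >> 16) & 0xFF,       # bits 16-23
        "is_extended": bool((lparam >> 24) & 1),  # bit 24
        "was_down": bool((lparam >> 30) & 1),     # bit 30: previous key state
    }


first_press = decode_keydown_lparam(0x001E0001)  # sample value: initial key down
auto_repeat = decode_keydown_lparam(0x401E0001)  # sample value: same key, bit 30 set
print(first_press)
print(auto_repeat)
```

An application that wants to distinguish OS auto-repeat from a fresh press would typically check `was_down` rather than `repeat_count`, since the count is usually 1 in both cases.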

Implementation

See owa-env-desktop source for detailed implementation.

API Reference

desktop plugin 0.5.7

Desktop environment plugin with mouse, keyboard, and window control

Author: OWA Development Team

Callables

Usage: To use callable components, import CALLABLES from owa.core and access them by their component name:

from owa.core import CALLABLES

# Access a callable component (replace 'component_name' with actual name)
callable_func = CALLABLES["desktop/component_name"]
result = callable_func(your_arguments)

screen.capture

capture_screen() -> ndarray

Capture the current screen as a numpy array.

Returns:

  • ndarray: Screen capture as a BGR image array with shape (height, width, 3).

Examples:

>>> screen = capture_screen()
>>> print(f"Screen dimensions: {screen.shape}")  # e.g., (1080, 1920, 3)
>>> # Save to file: cv2.imwrite('screenshot.png', screen)
Source code in projects/owa-env-desktop/owa/env/desktop/screen/callables.py
def capture_screen() -> np.ndarray:
    """
    Capture the current screen as a numpy array.

    Returns:
        numpy.ndarray: Screen capture as BGR image array with shape (height, width, 3).

    Examples:
        >>> screen = capture_screen()
        >>> print(f"Screen dimensions: {screen.shape}")  # e.g., (1080, 1920, 3)
        >>> # Save to file: cv2.imwrite('screenshot.png', screen)
    """
    import bettercam

    camera = bettercam.create()
    return camera.grab()

mouse.click

click(button: str | Button, count: int) -> None

Simulate a mouse click.

Parameters:

  • button (str | Button, required): Mouse button to click. Can be "left", "middle", "right" or a Button enum.
  • count (int, required): Number of clicks to perform.

Examples:

>>> click("left", 1)  # Single left click
>>> click("right", 2)  # Double right click
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def click(button: str | Button, count: int) -> None:
    """
    Simulate a mouse click.

    Args:
        button: Mouse button to click. Can be "left", "middle", "right" or a Button enum.
        count: Number of clicks to perform.

    Examples:
        >>> click("left", 1)  # Single left click
        >>> click("right", 2)  # Double right click
    """
    if button in ("left", "middle", "right"):
        button = getattr(Button, button)
    return mouse_controller.click(button, count)

mouse.move

mouse_move(x: int, y: int) -> None

Move the mouse cursor to specified coordinates.

Parameters:

  • x (int, required): X coordinate to move to.
  • y (int, required): Y coordinate to move to.

Examples:

>>> mouse_move(100, 200)  # Move mouse to position (100, 200)
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def mouse_move(x: int, y: int) -> None:
    """
    Move the mouse cursor to specified coordinates.

    Args:
        x: X coordinate to move to.
        y: Y coordinate to move to.

    Examples:
        >>> mouse_move(100, 200)  # Move mouse to position (100, 200)
    """
    return mouse_controller.move(x, y)

mouse.position

mouse_position() -> tuple[int, int]

Get the current mouse cursor position.

Returns:

  • tuple[int, int]: Tuple of (x, y) coordinates of the mouse cursor.

Examples:

>>> x, y = mouse_position()
>>> print(f"Mouse is at ({x}, {y})")
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def mouse_position() -> tuple[int, int]:
    """
    Get the current mouse cursor position.

    Returns:
        Tuple of (x, y) coordinates of the mouse cursor.

    Examples:
        >>> x, y = mouse_position()
        >>> print(f"Mouse is at ({x}, {y})")
    """
    return mouse_controller.position

mouse.press

mouse_press(button: str | Button) -> None

Press and hold a mouse button.

Parameters:

  • button (str | Button, required): Mouse button to press. Can be "left", "middle", "right" or a Button enum.

Examples:

>>> mouse_press("left")  # Press and hold left mouse button
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def mouse_press(button: str | Button) -> None:
    """
    Press and hold a mouse button.

    Args:
        button: Mouse button to press. Can be "left", "middle", "right" or a Button enum.

    Examples:
        >>> mouse_press("left")  # Press and hold left mouse button
    """
    return mouse_controller.press(button)

mouse.release

mouse_release(button: str | Button) -> None

Release a previously pressed mouse button.

Parameters:

  • button (str | Button, required): Mouse button to release. Can be "left", "middle", "right" or a Button enum.

Examples:

>>> mouse_release("left")  # Release left mouse button
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def mouse_release(button: str | Button) -> None:
    """
    Release a previously pressed mouse button.

    Args:
        button: Mouse button to release. Can be "left", "middle", "right" or a Button enum.

    Examples:
        >>> mouse_release("left")  # Release left mouse button
    """
    return mouse_controller.release(button)

mouse.scroll

mouse_scroll(x: int, y: int, dx: int, dy: int) -> None

Simulate mouse wheel scrolling.

Parameters:

  • x (int, required): X coordinate where scrolling occurs.
  • y (int, required): Y coordinate where scrolling occurs.
  • dx (int, required): Horizontal scroll amount.
  • dy (int, required): Vertical scroll amount.

Examples:

>>> mouse_scroll(100, 100, 0, 3)  # Scroll up 3 units at position (100, 100)
>>> mouse_scroll(100, 100, 0, -3)  # Scroll down 3 units
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def mouse_scroll(x: int, y: int, dx: int, dy: int) -> None:
    """
    Simulate mouse wheel scrolling.

    Args:
        x: X coordinate where scrolling occurs.
        y: Y coordinate where scrolling occurs.
        dx: Horizontal scroll amount.
        dy: Vertical scroll amount.

    Examples:
        >>> mouse_scroll(100, 100, 0, 3)  # Scroll up 3 units at position (100, 100)
        >>> mouse_scroll(100, 100, 0, -3)  # Scroll down 3 units
    """
    return mouse_controller.scroll(x, y, dx, dy)

mouse.get_state

get_mouse_state() -> MouseState

Get the current mouse state including position and pressed buttons.

Returns:

  • MouseState: MouseState object containing current mouse position and pressed buttons.

Examples:

>>> state = get_mouse_state()
>>> print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def get_mouse_state() -> MouseState:
    """
    Get the current mouse state including position and pressed buttons.

    Returns:
        MouseState object containing current mouse position and pressed buttons.

    Examples:
        >>> state = get_mouse_state()
        >>> print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")
    """
    position = mouse_controller.position
    if position is None:
        position = (-1, -1)  # Fallback if position cannot be retrieved
    mouse_buttons = set()
    buttons = get_vk_state()
    for button, vk in {"left": 1, "right": 2, "middle": 4}.items():
        if vk in buttons:
            mouse_buttons.add(button)
    return MouseState(x=position[0], y=position[1], buttons=mouse_buttons)
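
The button lookup at the end of this listing can be tried in isolation. The sketch below is a stand-in that takes a set of pressed virtual-key codes as input instead of calling get_vk_state; the names `MOUSE_BUTTON_VKS` and `pressed_mouse_buttons` are hypothetical, while the codes 1, 2, and 4 are the ones used in the listing.

```python
# Virtual-key codes for the mouse buttons, as used in the listing above.
MOUSE_BUTTON_VKS = {"left": 0x01, "right": 0x02, "middle": 0x04}


def pressed_mouse_buttons(pressed_vks: set[int]) -> set[str]:
    """Return the names of mouse buttons whose virtual-key code is currently pressed."""
    return {name for name, vk in MOUSE_BUTTON_VKS.items() if vk in pressed_vks}


# 0x41 is a keyboard key, so only the left button is reported here.
print(pressed_mouse_buttons({0x01, 0x41}))
```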

mouse.get_pointer_ballistics_config

get_pointer_ballistics_config() -> PointerBallisticsConfig

Get Windows pointer ballistics configuration for WM_MOUSEMOVE reconstruction.

Examples:

# Check whether Enhance pointer precision is enabled
>>> is_mouse_acceleration_enabled = get_pointer_ballistics_config().mouse_speed
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def get_pointer_ballistics_config() -> PointerBallisticsConfig:
    """Get Windows pointer ballistics configuration for WM_MOUSEMOVE reconstruction.

    Examples:
        # Check whether Enhance pointer precision is enabled
        >>> is_mouse_acceleration_enabled = get_pointer_ballistics_config().mouse_speed
    """
    if sys.platform != "win32":
        return PointerBallisticsConfig()  # Return default values

    try:
        return PointerBallisticsConfig(**_get_mouse_registry_values())
    except Exception:
        return PointerBallisticsConfig()  # Return default values

keyboard.press

press(key: str | int) -> None

Press and hold a keyboard key.

Parameters:

  • key (str | int, required): Key to press. Can be a string (e.g., 'a', 'enter') or virtual key code.

Examples:

>>> press('a')  # Press and hold the 'a' key
>>> press(65)  # Press and hold the 'a' key using virtual key code
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def press(key: str | int) -> None:
    """
    Press and hold a keyboard key.

    Args:
        key: Key to press. Can be a string (e.g., 'a', 'enter') or virtual key code.

    Examples:
        >>> press('a')  # Press and hold the 'a' key
        >>> press(65)  # Press and hold the 'a' key using virtual key code
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    return keyboard_controller.press(key)
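
The example above passes 65 for the 'a' key because Windows virtual-key codes for letters and digits coincide with the ASCII codes of the uppercase character. A minimal sketch of that relationship follows; `vk_for_char` is a hypothetical helper, not part of OWA, and it deliberately covers only ASCII letters and digits.

```python
def vk_for_char(ch: str) -> int:
    """Return the Windows virtual-key code for an ASCII letter or digit (hypothetical helper)."""
    if len(ch) != 1 or not (ch.isascii() and ch.isalnum()):
        raise ValueError("only single ASCII letters and digits are supported")
    # VK codes 0x30-0x39 are '0'-'9' and 0x41-0x5A are 'A'-'Z'.
    return ord(ch.upper())


print(vk_for_char("a"), vk_for_char("Z"), vk_for_char("0"))
```

Other keys (function keys, modifiers, punctuation) have virtual-key codes that do not follow this pattern and need a lookup table.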

keyboard.release

release(key: str | int) -> None

Release a previously pressed keyboard key.

Parameters:

  • key (str | int, required): Key to release. Can be a string (e.g., 'a', 'enter') or virtual key code.

Examples:

>>> release('a')  # Release the 'a' key
>>> release(65)  # Release the 'a' key using virtual key code
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def release(key: str | int) -> None:
    """
    Release a previously pressed keyboard key.

    Args:
        key: Key to release. Can be a string (e.g., 'a', 'enter') or virtual key code.

    Examples:
        >>> release('a')  # Release the 'a' key
        >>> release(65)  # Release the 'a' key using virtual key code
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    return keyboard_controller.release(key)

keyboard.type

keyboard_type(text: str) -> None

Type a string of characters.

Parameters:

  • text (str, required): Text string to type.

Examples:

>>> keyboard_type("Hello, World!")  # Types the text
>>> keyboard_type("user@example.com")  # Types an email address
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def keyboard_type(text: str) -> None:
    """
    Type a string of characters.

    Args:
        text: Text string to type.

    Examples:
        >>> keyboard_type("Hello, World!")  # Types the text
        >>> keyboard_type("user@example.com")  # Types an email address
    """
    return keyboard_controller.type(text)

keyboard.get_state

get_keyboard_state() -> KeyboardState

Get the current keyboard state including pressed keys.

Returns:

  • KeyboardState: KeyboardState object containing currently pressed keys.

Examples:

>>> state = get_keyboard_state()
>>> print(f"Pressed keys: {state.buttons}")
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def get_keyboard_state() -> KeyboardState:
    """
    Get the current keyboard state including pressed keys.

    Returns:
        KeyboardState object containing currently pressed keys.

    Examples:
        >>> state = get_keyboard_state()
        >>> print(f"Pressed keys: {state.buttons}")
    """
    return KeyboardState(buttons=get_vk_state())

keyboard.press_repeat

press_repeat_key(
    key: str | int,
    press_time: float,
    initial_delay: float = 0.5,
    repeat_delay: float = 0.033,
) -> None

Simulate the behavior of holding a key down with auto-repeat.

Parameters:

  • key (str | int, required): Key to press repeatedly. Can be a string or virtual key code.
  • press_time (float, required): Total time to hold the key down in seconds.
  • initial_delay (float, default 0.5): Initial delay before auto-repeat starts, in seconds.
  • repeat_delay (float, default 0.033): Delay between repeated key presses, in seconds.

Examples:

>>> press_repeat_key('a', 2.0)  # Hold 'a' key for 2 seconds with auto-repeat
>>> press_repeat_key('space', 1.5, 0.3, 0.05)  # Custom timing for space key
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def press_repeat_key(
    key: str | int, press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033
) -> None:
    """
    Simulate the behavior of holding a key down with auto-repeat.

    Args:
        key: Key to press repeatedly. Can be a string or virtual key code.
        press_time: Total time to hold the key down in seconds.
        initial_delay: Initial delay before auto-repeat starts (default: 0.5s).
        repeat_delay: Delay between repeated key presses (default: 0.033s).

    Examples:
        >>> press_repeat_key('a', 2.0)  # Hold 'a' key for 2 seconds with auto-repeat
        >>> press_repeat_key('space', 1.5, 0.3, 0.05)  # Custom timing for space key
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    repeat_time = max(0, (press_time - initial_delay) // repeat_delay - 1)

    keyboard_controller.press(key)
    time.sleep(initial_delay)
    for _ in range(int(repeat_time)):
        keyboard_controller.press(key)
        time.sleep(repeat_delay)
    keyboard_controller.release(key)

keyboard.release_all_keys

release_all_keys() -> None

Release all currently pressed keys on the keyboard.

Examples:

>>> release_all_keys()  # Release all pressed keys
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def release_all_keys() -> None:
    """
    Release all currently pressed keys on the keyboard.

    Examples:
        >>> release_all_keys()  # Release all pressed keys
    """
    keyboard_state: KeyboardState = get_keyboard_state()
    for key in keyboard_state.buttons:
        release(key)

keyboard.get_keyboard_repeat_timing

get_keyboard_repeat_timing(
    *, return_seconds: Literal[True] = True
) -> Dict[str, float]
get_keyboard_repeat_timing(
    *, return_seconds: Literal[False]
) -> Dict[str, int]
get_keyboard_repeat_timing(
    *, return_seconds: bool = True
) -> Dict[str, float] | Dict[str, int]

Get Windows keyboard repeat delay and repeat rate settings.

Parameters:

  • return_seconds (bool, default True): If True, return timing values in seconds. If False, return raw Windows API values.

Returns:

  • Dict[str, float] (when return_seconds=True): Dictionary with timing in seconds. Keys: keyboard_delay_seconds (initial delay before auto-repeat starts) and keyboard_rate_seconds (interval between repeated keystrokes).
  • Dict[str, int] (when return_seconds=False): Dictionary with raw Windows API values. Keys: keyboard_delay (raw delay value, 0-3 scale) and keyboard_speed (raw speed value, 0-31 scale).

Raises:

  • OSError: If not running on Windows platform
  • RuntimeError: If Windows API call fails

Examples:

>>> # Get timing in seconds (default)
>>> timing = get_keyboard_repeat_timing()
>>> print(f"Delay: {timing['keyboard_delay_seconds']:.3f}s, Rate: {timing['keyboard_rate_seconds']:.3f}s")
>>> # Get raw Windows API values
>>> raw_timing = get_keyboard_repeat_timing(return_seconds=False)
>>> print(f"Raw delay: {raw_timing['keyboard_delay']}, Raw speed: {raw_timing['keyboard_speed']}")
Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py
def get_keyboard_repeat_timing(*, return_seconds: bool = True) -> Dict[str, float] | Dict[str, int]:
    """
    Get Windows keyboard repeat delay and repeat rate settings.

    Args:
        return_seconds: If True (default), return timing values in seconds.
                       If False, return raw Windows API values.

    Returns:
        When return_seconds=True:
            Dict[str, float]: Dictionary with timing in seconds
                - keyboard_delay_seconds: Initial delay before auto-repeat starts
                - keyboard_rate_seconds: Interval between repeated keystrokes

        When return_seconds=False:
            Dict[str, int]: Dictionary with raw Windows API values
                - keyboard_delay: Raw delay value (0-3 scale)
                - keyboard_speed: Raw speed value (0-31 scale)

    Raises:
        OSError: If not running on Windows platform
        RuntimeError: If Windows API call fails

    Examples:
        >>> # Get timing in seconds (default)
        >>> timing = get_keyboard_repeat_timing()
        >>> print(f"Delay: {timing['keyboard_delay_seconds']:.3f}s, Rate: {timing['keyboard_rate_seconds']:.3f}s")

        >>> # Get raw Windows API values
        >>> raw_timing = get_keyboard_repeat_timing(return_seconds=False)
        >>> print(f"Raw delay: {raw_timing['keyboard_delay']}, Raw speed: {raw_timing['keyboard_speed']}")
    """
    if sys.platform != "win32":
        raise OSError("Keyboard repeat settings are only available on Windows")

    # Windows constants
    SPI_GETKEYBOARDDELAY = 0x0016
    SPI_GETKEYBOARDSPEED = 0x000A

    # Get keyboard delay (0-3 scale)
    keyboard_delay = wintypes.UINT(0)
    if not ctypes.windll.user32.SystemParametersInfoW(SPI_GETKEYBOARDDELAY, 0, ctypes.byref(keyboard_delay), 0):
        raise RuntimeError("Failed to get keyboard delay setting from Windows API")

    # Get keyboard speed (0-31 scale)
    keyboard_speed = wintypes.UINT(0)
    if not ctypes.windll.user32.SystemParametersInfoW(SPI_GETKEYBOARDSPEED, 0, ctypes.byref(keyboard_speed), 0):
        raise RuntimeError("Failed to get keyboard speed setting from Windows API")

    # Convert to actual time values based on Microsoft documentation
    # References:
    # - KeyboardDelay: https://learn.microsoft.com/en-us/dotnet/api/system.windows.forms.systeminformation.keyboarddelay
    # - KeyboardSpeed: https://learn.microsoft.com/en-us/dotnet/api/system.windows.forms.systeminformation.keyboardspeed

    # Delay: 0=250ms, 1=500ms, 2=750ms, 3=1000ms (approximately)
    keyboard_delay_seconds = 0.25 + (keyboard_delay.value * 0.25)

    # Speed: 0=~2.5 repetitions/sec, 31=~30 repetitions/sec (from Microsoft docs)
    # Linear interpolation formula (derived): repetitions_per_sec = 2.5 + (speed_value * 27.5 / 31)
    # Where 27.5 = (30 - 2.5) is the range between max and min repetitions per second
    repetitions_per_sec = 2.5 + (keyboard_speed.value * 27.5 / 31)
    keyboard_rate_seconds = 1.0 / repetitions_per_sec

    if return_seconds:
        return {"keyboard_delay_seconds": keyboard_delay_seconds, "keyboard_rate_seconds": keyboard_rate_seconds}
    else:
        return {"keyboard_delay": keyboard_delay.value, "keyboard_speed": keyboard_speed.value}

window.get_active_window

get_active_window() -> WindowInfo | None

Get information about the currently active window.

Returns:

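  • WindowInfo | None: WindowInfo object containing title, position, and handle of the active window, or None if no active window is found. Note that in the source listed below, the Windows branch returns a placeholder WindowInfo (empty title, zero rect, hWnd=-1) rather than None when no window is active.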

Examples:

>>> window = get_active_window()
>>> if window:
...     print(f"Active window: {window.title}")
...     print(f"Position: {window.rect}")
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def get_active_window() -> WindowInfo | None:
    """
    Get information about the currently active window.

    Returns:
        WindowInfo object containing title, position, and handle of the active window,
        or None if no active window is found.

    Examples:
        >>> window = get_active_window()
        >>> if window:
        ...     print(f"Active window: {window.title}")
        ...     print(f"Position: {window.rect}")
    """
    if _IS_DARWIN:
        from Quartz import (
            CGWindowListCopyWindowInfo,
            kCGNullWindowID,
            kCGWindowListOptionOnScreenOnly,
        )

        windows = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID)
        for window in windows:
            if window.get("kCGWindowLayer", 0) == 0:  # Frontmost window
                bounds = window.get("kCGWindowBounds")
                title = window.get("kCGWindowName", "")
                rect = (
                    int(bounds["X"]),
                    int(bounds["Y"]),
                    int(bounds["X"] + bounds["Width"]),
                    int(bounds["Y"] + bounds["Height"]),
                )
                hWnd = window.get("kCGWindowNumber", 0)
                return WindowInfo(title=title, rect=rect, hWnd=hWnd)
        return None

    elif _IS_WINDOWS:
        import pygetwindow as gw

        active_window = gw.getActiveWindow()
        if active_window is not None:
            rect = active_window._getWindowRect()
            title = active_window.title
            rect_coords = (rect.left, rect.top, rect.right, rect.bottom)
            hWnd = active_window._hWnd
            return WindowInfo(title=title, rect=rect_coords, hWnd=hWnd)
        return WindowInfo(title="", rect=[0, 0, 0, 0], hWnd=-1)
    else:
        raise NotImplementedError(f"Platform {_PLATFORM} is not supported yet")
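
The macOS branch above converts a bounds mapping (origin plus size) into the (left, top, right, bottom) rectangle that WindowInfo stores. That arithmetic can be sketched on its own; `bounds_to_rect` is a hypothetical helper, and the dictionary keys are the ones read in the listing.

```python
def bounds_to_rect(bounds: dict) -> tuple[int, int, int, int]:
    """Convert an origin-plus-size bounds mapping to (left, top, right, bottom)."""
    left = int(bounds["X"])
    top = int(bounds["Y"])
    right = int(bounds["X"] + bounds["Width"])
    bottom = int(bounds["Y"] + bounds["Height"])
    return (left, top, right, bottom)


rect = bounds_to_rect({"X": 100.0, "Y": 50.0, "Width": 800.0, "Height": 600.0})
print(rect)
# Width and height can be recovered from the rectangle when needed.
print(rect[2] - rect[0], rect[3] - rect[1])
```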

window.get_window_by_title

get_window_by_title(
    window_title_substring: str,
) -> WindowInfo

Find a window by searching for a substring in its title.

Parameters:

  • window_title_substring (str, required): Substring to search for in window titles.

Returns:

  • WindowInfo: WindowInfo object for the first matching window.

Raises:

  • ValueError: If no window with matching title is found.

Examples:

>>> window = get_window_by_title("notepad")
>>> print(f"Found window: {window.title}")
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def get_window_by_title(window_title_substring: str) -> WindowInfo:
    """
    Find a window by searching for a substring in its title.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        WindowInfo object for the first matching window.

    Raises:
        ValueError: If no window with matching title is found.

    Examples:
        >>> window = get_window_by_title("notepad")
        >>> print(f"Found window: {window.title}")
    """
    if _IS_WINDOWS:
        import pygetwindow as gw

        windows = gw.getWindowsWithTitle(window_title_substring)
        if not windows:
            raise ValueError(f"No window with title containing '{window_title_substring}' found.")

        # Temporal workaround to deal with `cmd`'s behavior: it setup own title as the command it running.
        # e.g. `owl window find abcd` will always find `cmd` window itself running command.
        if "Conda" in windows[0].title:
            windows.pop(0)

        window = windows[0]  # NOTE: only return the first window matching the title
        rect = window._getWindowRect()
        return WindowInfo(
            title=window.title,
            rect=(rect.left, rect.top, rect.right, rect.bottom),
            hWnd=window._hWnd,
        )

    elif _IS_DARWIN:
        from Quartz import CGWindowListCopyWindowInfo, kCGNullWindowID, kCGWindowLayer, kCGWindowListOptionOnScreenOnly

        windows = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID)
        for window in windows:
            # Skip windows that are not on normal level (like menu bars, etc)
            if window.get(kCGWindowLayer, 0) != 0:
                continue

            # Get window name from either kCGWindowName or kCGWindowOwnerName
            title = window.get("kCGWindowName", "")
            if not title:
                title = window.get("kCGWindowOwnerName", "")

            if title and window_title_substring.lower() in title.lower():
                bounds = window.get("kCGWindowBounds")
                if bounds:
                    return WindowInfo(
                        title=title,
                        rect=(
                            int(bounds["X"]),
                            int(bounds["Y"]),
                            int(bounds["X"] + bounds["Width"]),
                            int(bounds["Y"] + bounds["Height"]),
                        ),
                        hWnd=window.get("kCGWindowNumber", 0),
                    )

        raise ValueError(f"No window with title containing '{window_title_substring}' found.")
    else:
        # Linux or other OS (not implemented yet)
        raise NotImplementedError("Not implemented for Linux or other OS.")
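
The macOS branch above reduces to three steps: skip windows on a non-zero layer, fall back from the window name to the owner name, and convert `X`/`Y`/`Width`/`Height` bounds into a `(left, top, right, bottom)` rectangle. A minimal sketch of that logic on plain dictionaries, with no Quartz dependency. The helper name `first_match` and the sample data are illustrative only and not part of the package, and the sketch returns `None` where the library raises `ValueError`:

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[int, int, int, int]


def first_match(windows: List[Dict], substring: str) -> Optional[Tuple[str, Rect]]:
    """Return (title, rect) for the first normal-layer window whose title contains substring."""
    needle = substring.lower()
    for window in windows:
        # Layer 0 is the normal window level; menu bars and overlays use other layers.
        if window.get("kCGWindowLayer", 0) != 0:
            continue
        # Prefer the window name, then fall back to the owning application's name.
        title = window.get("kCGWindowName") or window.get("kCGWindowOwnerName", "")
        bounds = window.get("kCGWindowBounds")
        if title and needle in title.lower() and bounds:
            rect = (
                int(bounds["X"]),
                int(bounds["Y"]),
                int(bounds["X"] + bounds["Width"]),
                int(bounds["Y"] + bounds["Height"]),
            )
            return title, rect
    return None


# Made-up entries shaped like the dictionaries the macOS branch iterates over.
sample = [
    {"kCGWindowLayer": 25, "kCGWindowOwnerName": "Notepad menu",
     "kCGWindowBounds": {"X": 0, "Y": 0, "Width": 10, "Height": 10}},
    {"kCGWindowLayer": 0, "kCGWindowName": "", "kCGWindowOwnerName": "Notepad",
     "kCGWindowBounds": {"X": 10.0, "Y": 20.0, "Width": 300.0, "Height": 200.0}},
]
result = first_match(sample, "notepad")
```

The first entry is skipped because of its layer, so the match comes from the second entry's owner name.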

window.get_pid_by_title

get_pid_by_title(window_title_substring: str) -> int

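Get the process ID (PID) of a window by its title. Only Windows is supported: on other platforms the call raises NotImplementedError. A ValueError is raised if no window title matches.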

Parameters:

Name Type Description Default
window_title_substring str Substring to search for in window titles. required

Returns:

Type Description
int Process ID of the window.

Examples:

>>> pid = get_pid_by_title("notepad")
>>> print(f"Notepad PID: {pid}")
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def get_pid_by_title(window_title_substring: str) -> int:
    """
    Get the process ID (PID) of a window by its title.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        Process ID of the window.

    Examples:
        >>> pid = get_pid_by_title("notepad")
        >>> print(f"Notepad PID: {pid}")
    """
    window = get_window_by_title(window_title_substring)
    if _IS_WINDOWS:
        import win32process

        # win32process.GetWindowThreadProcessId returns (tid, pid)
        _, pid = win32process.GetWindowThreadProcessId(window.hWnd)
        return pid
    else:
        # Implement if needed for other OS
        raise NotImplementedError(f"Getting PID by title not implemented for {_PLATFORM}")
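
Because the lookup can fail with `ValueError` (no matching window) or `NotImplementedError` (unsupported platform), cross-platform callers may want to treat both as "no PID available". A minimal sketch: `pid_or_none` is a hypothetical helper, and the lookup function is passed in as an argument so the example runs without the package installed.

```python
from typing import Callable, Optional


def pid_or_none(lookup: Callable[[str], int], title: str) -> Optional[int]:
    """Call a get_pid_by_title-style function, mapping its failures to None."""
    try:
        return lookup(title)
    except (ValueError, NotImplementedError):
        # ValueError: no matching window; NotImplementedError: unsupported platform.
        return None


# Stand-ins for the real callable, used only to exercise the helper.
def fake_lookup(title: str) -> int:
    if title != "notepad":
        raise ValueError(f"No window with title containing '{title}' found.")
    return 4242


def unsupported_lookup(title: str) -> int:
    raise NotImplementedError("Getting PID by title not implemented for this platform")
```

With the package installed, `CALLABLES["desktop/window.get_pid_by_title"]` could be passed as `lookup`.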

window.when_active

when_active(window_title_substring: str) -> Callable

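Decorator to run a function only when a specific window is active. If the window is not active when the decorated function is called, the call is skipped and returns None.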

Parameters:

Name Type Description Default
window_title_substring str Substring to search for in window titles. required

Returns:

Type Description
Callable Decorator function that conditionally executes the wrapped function.

Examples:

>>> @when_active("notepad")
... def do_something():
...     print("Notepad is active!")
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def when_active(window_title_substring: str) -> Callable:
    """
    Decorator to run a function only when a specific window is active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        Decorator function that conditionally executes the wrapped function.

    Examples:
        >>> @when_active("notepad")
        ... def do_something():
        ...     print("Notepad is active!")
    """

    def decorator(func):
        def wrapper(*args, **kwargs):
            if is_active(window_title_substring):
                return func(*args, **kwargs)

        return wrapper

    return decorator
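
The source above does not apply `functools.wraps`, so the decorated function's `__name__` and docstring become those of `wrapper`. A sketch of the same conditional-execution pattern with metadata preserved. The names `when` and `predicate` are hypothetical, introduced so the example runs without a desktop session:

```python
import functools
from typing import Any, Callable


def when(predicate: Callable[[], bool]) -> Callable:
    """Decorator factory: run the function only while predicate() is true."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)  # keep func.__name__ and __doc__ on the wrapper
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if predicate():
                return func(*args, **kwargs)
            return None  # skipped: the condition did not hold

        return wrapper

    return decorator


state = {"active": False}  # stands in for is_active("notepad")


@when(lambda: state["active"])
def greet(name: str) -> str:
    return f"hello {name}"


skipped = greet("a")  # predicate is false, so the call is skipped
state["active"] = True
ran = greet("b")      # predicate is true, so the function runs
```

Callers that need to tell "skipped" apart from a function that legitimately returns `None` would need a different signal, such as a sentinel value.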

window.is_active

is_active(window_title_substring: str) -> bool

Check if a window with the specified title substring is currently active.

Parameters:

Name Type Description Default
window_title_substring str Substring to search for in window titles. required

Returns:

Type Description
bool True if the window is active, False otherwise.

Examples:

>>> if is_active("notepad"):
...     print("Notepad is the active window")
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def is_active(window_title_substring: str) -> bool:
    """
    Check if a window with the specified title substring is currently active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        True if the window is active, False otherwise.

    Examples:
        >>> if is_active("notepad"):
        ...     print("Notepad is the active window")
    """
    try:
        window = get_window_by_title(window_title_substring)
    except ValueError:
        return False
    active = get_active_window()
    return active is not None and active.hWnd == window.hWnd
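
`is_active` performs a single check. To block until a window gains focus, one option is to poll it with a timeout. A minimal sketch: `wait_until` is a hypothetical helper, not part of the package, and it accepts any zero-argument predicate so it can be tried without a real window.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in predicate that becomes true on the third call.
calls = {"n": 0}


def becomes_active() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ok = wait_until(becomes_active, timeout=2.0, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

With the package installed, the predicate could be `lambda: is_active("notepad")`.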

window.make_active

make_active(window_title_substring: str) -> None

Bring a window to the foreground and make it active.

Parameters:

Name Type Description Default
window_title_substring str Substring to search for in window titles. required

Raises:

Type Description
ValueError If no window with matching title is found.
NotImplementedError If the operation is not supported on the current OS.

Examples:

>>> make_active("notepad")  # Brings notepad window to front
Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py
def make_active(window_title_substring: str) -> None:
    """
    Bring a window to the foreground and make it active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Raises:
        ValueError: If no window with matching title is found.
        NotImplementedError: If the operation is not supported on the current OS.

    Examples:
        >>> make_active("notepad")  # Brings notepad window to front
    """

    os_name = platform.system()
    if os_name == "Windows":
        import pygetwindow as gw

        windows = gw.getWindowsWithTitle(window_title_substring)
        if not windows:
            raise ValueError(f"No window with title containing '{window_title_substring}' found.")

        # Temporary workaround for `cmd`'s behavior: it sets its own title to the command it is running,
        # e.g. `owl window find abcd` will always find the `cmd` window that is running the command.
        if "Conda" in windows[0].title:
            windows.pop(0)

        window = windows[0]  # NOTE: only return the first window matching the title
        window.activate()
    else:
        raise NotImplementedError(f"Activation not implemented for this OS: {os_name}")
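
The "Conda" workaround in the source above drops only the first match. If that was the only match, the list is empty afterwards and `windows[0]` raises `IndexError` rather than the documented `ValueError`. A sketch of one possible alternative selection step, operating on plain title strings. `pick_title` and its `excluded` parameter are hypothetical, and unlike the source this version filters every candidate, not just the first:

```python
from typing import List


def pick_title(titles: List[str], substring: str, excluded: str = "Conda") -> str:
    """Return the first title containing substring, ignoring titles that contain excluded."""
    candidates = [t for t in titles if substring in t and excluded not in t]
    if not candidates:
        raise ValueError(f"No window with title containing '{substring}' found.")
    return candidates[0]


# Made-up titles: a shell window whose title echoes the search command, and the real target.
titles = ["Anaconda Prompt (Conda) - owl window find Notepad", "Untitled - Notepad"]
chosen = pick_title(titles, "Notepad")
```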

Listeners

Usage: To use listener components, import LISTENERS from owa.core and call the configure() method with a callback function:

from owa.core import LISTENERS

# Configure a listener component (replace 'component_name' with actual name)
listener = LISTENERS["desktop/component_name"]
listener.configure(callback=my_callback)  # plus any component-specific keyword arguments

# Use the listener in a context manager
with listener.session as active_listener:
    # The listener is now running and will call my_callback when events occur
    pass  # Your main code here

Note: The callback argument is required. The on_configure() method shown in the documentation is an internal method called by configure().
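
The relationship between `configure()`, `on_configure()`, and the callback can be pictured with a toy class. This is a simplified stand-in written for illustration, not OWA's `Listener` implementation. The class `ToyListener`, its `emit` method, and the `interval` option are invented here:

```python
from typing import Any, Callable, Dict, List, Optional


class ToyListener:
    """Illustrative only: configure() stores the callback, then calls on_configure()."""

    def __init__(self) -> None:
        self.callback: Optional[Callable[[Any], None]] = None
        self.options: Dict[str, Any] = {}
        self.configured = False

    def configure(self, *, callback: Callable[[Any], None], **kwargs: Any) -> "ToyListener":
        self.callback = callback     # required keyword argument
        self.on_configure(**kwargs)  # internal hook, not called by users directly
        return self                  # allows ToyListener().configure(...) chaining

    def on_configure(self, **kwargs: Any) -> None:
        self.options = kwargs
        self.configured = True

    def emit(self, event: Any) -> None:
        # A real listener would call the callback from its event loop.
        if self.callback is not None:
            self.callback(event)


received: List[Any] = []
listener = ToyListener().configure(callback=received.append, interval=0.5)
listener.emit("key_press")
```

Omitting `callback` in this toy raises `TypeError`, which mirrors the note that the argument is required.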

keyboard

Bases: Listener

Keyboard event listener that captures key press and release events.

This listener wraps pynput's KeyboardListener to provide keyboard event monitoring with OWA's listener interface.

Examples:

>>> def on_key_event(event):
...     print(f"Key {event.vk} was {event.event_type}")
>>> listener = KeyboardListenerWrapper().configure(callback=on_key_event)
>>> listener.start()

mouse

Bases: Listener

Mouse event listener that captures mouse movement, clicks, and scroll events.

This listener wraps pynput's MouseListener to provide mouse event monitoring with OWA's listener interface.

Examples:

>>> def on_mouse_event(event):
...     print(f"Mouse {event.event_type} at ({event.x}, {event.y})")
>>> listener = MouseListenerWrapper().configure(callback=on_mouse_event)
>>> listener.start()

raw_mouse

Bases: Listener

Raw mouse input listener using Windows WM_INPUT messages.

This listener captures high-definition mouse movement data directly from the HID stack, bypassing Windows pointer acceleration and screen resolution limits. Provides sub-pixel precision and unfiltered input data essential for gaming and precision applications.

Examples:

>>> def on_raw_mouse_event(event):
...     print(f"Raw mouse: dx={event.dx}, dy={event.dy}, flags={event.button_flags}")
>>> listener = RawMouseListener().configure(callback=on_raw_mouse_event)
>>> listener.start()
on_configure
on_configure()

Initialize the raw input capture system.

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/listeners.py
def on_configure(self):
    """Initialize the raw input capture system."""
    self.raw_input_capture = RawInputCapture()
    self.raw_input_capture.register_callback(self._on_raw_mouse_event)
loop
loop(stop_event, callback)

Start the raw input capture loop.

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/listeners.py
def loop(self, stop_event, callback):
    """Start the raw input capture loop."""
    # Store the callback for use in the raw input callback
    self._current_callback = callback

    if not self.raw_input_capture.start():
        raise RuntimeError("Failed to start raw input capture")

    # Keep the loop running while the capture is active
    try:
        # The Windows message loop in raw_input_capture handles events efficiently
        # We just need to wait for the stop event without artificial delays
        stop_event.wait()
    finally:
        self.raw_input_capture.stop()
        self._current_callback = None
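
The `loop` method above follows a common pattern: start a background capture, block on `stop_event.wait()` without polling, and release resources in `finally`. A self-contained sketch of that pattern using only the standard library. `FakeCapture` is a placeholder for `RawInputCapture`, whose real interface is only partly visible here:

```python
import threading


class FakeCapture:
    """Placeholder capture object with the start()/stop() shape used above."""

    def __init__(self) -> None:
        self.running = False
        self.stopped = False

    def start(self) -> bool:
        self.running = True
        return True

    def stop(self) -> None:
        self.running = False
        self.stopped = True


def run_loop(capture: FakeCapture, stop_event: threading.Event) -> None:
    if not capture.start():
        raise RuntimeError("Failed to start capture")
    try:
        stop_event.wait()  # sleeps until another thread calls stop_event.set()
    finally:
        capture.stop()     # always runs, even if the wait is interrupted


capture = FakeCapture()
stop_event = threading.Event()
worker = threading.Thread(target=run_loop, args=(capture, stop_event))
worker.start()
stop_event.set()  # request shutdown
worker.join(timeout=2)
```

Waiting on the event without a timeout keeps the thread idle until shutdown is requested, so event delivery stays with the capture object and is not driven by a sleep loop.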

keyboard_state

Bases: Listener

Periodically reports the current keyboard state.

This listener calls the callback function every second with the current keyboard state, including which keys are currently pressed.

Examples:

>>> def on_keyboard_state(state):
...     if state.buttons:
...         print(f"Keys pressed: {state.buttons}")
>>> listener = KeyboardStateListener().configure(callback=on_keyboard_state)
>>> listener.start()
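
A periodic state listener of this kind can be pictured as a loop that waits on a stop event with a timeout, which provides both the reporting interval and prompt shutdown. A sketch under that assumption: the package's actual loop is not shown here, and `report_periodically` and `get_state` are hypothetical names.

```python
import threading
from typing import Any, Callable, List


def report_periodically(
    get_state: Callable[[], Any],
    callback: Callable[[Any], None],
    stop_event: threading.Event,
    interval: float = 1.0,
) -> None:
    """Call callback(get_state()) every interval seconds until stop_event is set."""
    # Event.wait returns False on timeout and True once the event is set.
    while not stop_event.wait(interval):
        callback(get_state())


states: List[Any] = []
stop_event = threading.Event()


def on_state(state: dict) -> None:
    states.append(state)
    if len(states) == 3:
        stop_event.set()  # stop after three reports


# A short interval keeps the demonstration fast; the listener described above uses one second.
report_periodically(lambda: {"buttons": set()}, on_state, stop_event, interval=0.01)
```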

mouse_state

Bases: Listener

Periodically reports the current mouse state.

This listener calls the callback function every second with the current mouse state, including position and pressed buttons.

Examples:

>>> def on_mouse_state(state):
...     print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")
>>> listener = MouseStateListener().configure(callback=on_mouse_state)
>>> listener.start()

window

Bases: Listener

Periodically monitors and reports the currently active window.

This listener calls the callback function every second with information about the currently active window, including title, position, and handle.

Examples:

Monitor active window changes:

>>> def on_window_change(window):
...     if window:
...         print(f"Active window: {window.title}")
>>>
>>> listener = WindowListener().configure(callback=on_window_change)
>>> listener.start()
>>> # ... listener runs in background ...
>>> listener.stop()
>>> listener.join()

Track window focus for automation:

>>> def track_focus(window):
...     if window and "notepad" in window.title.lower():
...         print("Notepad is now active!")
>>>
>>> listener = WindowListener().configure(callback=track_focus)
>>> listener.start()
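
Since the callback is invoked on every poll and not only when focus changes, `track_focus` above prints repeatedly while Notepad stays in front. One way to react only to changes is to remember the last title seen. A sketch: `on_change_only` is a hypothetical helper, and `SimpleNamespace` objects stand in for the window info passed by the listener.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Optional

_UNSET = object()  # sentinel that never equals a real title


def on_change_only(handler: Callable[[Optional[str]], None]) -> Callable[[Any], None]:
    """Wrap handler so it is called only when the active window title changes."""
    last_title: Any = _UNSET

    def callback(window: Any) -> None:
        nonlocal last_title
        title = window.title if window else None
        if title != last_title:
            last_title = title
            handler(title)

    return callback


seen: List[Optional[str]] = []
callback = on_change_only(seen.append)
for title in ["Notepad", "Notepad", "Browser", "Browser", "Notepad"]:
    callback(SimpleNamespace(title=title))
callback(None)  # no active window
```

The wrapped callback could then be passed as `callback=` when configuring the listener.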