Desktop Environment¶

The Desktop Environment module (owa.env.desktop) extends Open World Agents by providing functionalities that interact with the operating system's desktop. It focuses on user interface interactions and input simulation.

Features¶

Screen Capture: Capture the current screen using CALLABLES["desktop/screen.capture"].
Window Management: Retrieve information about the active window and search for windows by title using functions like CALLABLES["desktop/window.get_active_window"] and CALLABLES["desktop/window.get_window_by_title"].
Input Simulation: Simulate mouse actions (e.g., CALLABLES["desktop/mouse.click"]) and set up keyboard listeners to handle input events.

Usage¶

The Desktop Environment module becomes available as soon as owa-env-desktop is installed; no manual activation is needed.

# Components automatically available after installation
from owa.core.registry import CALLABLES, LISTENERS

You can access desktop functionalities via the global registries using the unified namespace/name pattern:

print(CALLABLES["desktop/screen.capture"]().shape)  # Capture the screen and print the frame's shape
print(CALLABLES["desktop/window.get_active_window"]())  # Retrieve the active window

This module is essential for applications that require integration with desktop UI elements and user input simulation.
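The namespace/name pattern used in the registry keys above can be illustrated with a small, self-contained sketch. The helpers `split_component_name` and `group_by_namespace` are hypothetical and are not part of owa; they only show how a key such as "desktop/screen.capture" breaks down into a namespace and a component name.

```python
from typing import NamedTuple


class ComponentName(NamedTuple):
    namespace: str  # e.g. "desktop"
    name: str       # e.g. "screen.capture"


def split_component_name(full_name: str) -> ComponentName:
    """Split a "namespace/name" key into its two parts (hypothetical helper)."""
    namespace, sep, name = full_name.partition("/")
    if not sep or not namespace or not name:
        raise ValueError(f"expected 'namespace/name', got {full_name!r}")
    return ComponentName(namespace, name)


def group_by_namespace(full_names):
    """Group component keys by namespace, preserving the input order."""
    groups = {}
    for full_name in full_names:
        parsed = split_component_name(full_name)
        groups.setdefault(parsed.namespace, []).append(parsed.name)
    return groups
```

Only the first "/" is treated as the separator, so dotted names such as "mouse.click" stay intact on the right-hand side.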

Implementation Details¶

For implementation details, browse the owa-env-desktop source. API documentation is still being developed; the auto-generated reference later on this page covers the current callables and listeners.

Available Functions¶

Mouse Functions¶

desktop/mouse.click - Simulate a mouse click
desktop/mouse.move - Move the mouse cursor to specified coordinates
desktop/mouse.position - Get the current mouse position
desktop/mouse.press - Simulate pressing a mouse button
desktop/mouse.release - Simulate releasing a mouse button
desktop/mouse.scroll - Simulate mouse wheel scrolling

Keyboard Functions¶

desktop/keyboard.press - Simulate pressing a keyboard key
desktop/keyboard.release - Simulate releasing a keyboard key
desktop/keyboard.type - Type a string of characters
desktop/keyboard.press_repeat - Simulate the auto-repeat that occurs while a key is held down

Screen Functions¶

desktop/screen.capture - Capture the current screen (Note: This module utilizes bettercam. For better performance and extensibility, use owa-env-gst's functions instead)

Window Functions¶

desktop/window.get_active_window - Get the currently active window
desktop/window.get_window_by_title - Find a window by its title
desktop/window.when_active - Decorator that runs a function only when a specific window is active

Available Listeners¶

desktop/keyboard - Listen for keyboard events
desktop/mouse - Listen for mouse events

Misc¶

Library Selection Rationale¶

This module utilizes pynput for input simulation after evaluating several alternatives:

Why not PyAutoGUI? Though widely used, PyAutoGUI uses deprecated Windows APIs (keybd_event/mouse_event) rather than the modern SendInput method. These older APIs fail in DirectX applications and games. Additionally, PyAutoGUI has seen limited maintenance (last significant update was over 2 years ago).
Alternative Solutions: Libraries like pydirectinput and pydirectinput_rgx address the Windows API issue by using SendInput, but they lack input capturing capabilities which are essential for our use case.
Other Options: We also evaluated keyboard and mouse libraries but found them inadequately maintained with several unresolved bugs that could impact reliability.

Input Auto-Repeat Functionality¶

For simulating key auto-repeat behavior, use the dedicated function:

CALLABLES["desktop/keyboard.press_repeat"](key, press_time, initial_delay=0.5, repeat_delay=0.033)

This function handles the complexity of simulating hardware auto-repeat, with configurable initial delay before repeating starts and the interval between repeated keypresses.
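The timing behind this can be sketched without touching the keyboard. The two functions below are hypothetical helpers that mirror the repeat-count formula shown in the press_repeat_key source later on this page; they ignore the time spent inside each key call, so the offsets are approximate.

```python
def repeat_count(press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033) -> int:
    """Number of repeated press events sent after the initial press."""
    return int(max(0, (press_time - initial_delay) // repeat_delay - 1))


def press_schedule(press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033):
    """Approximate offsets (seconds after the first press) of every press event."""
    offsets = [0.0]  # the initial press
    for i in range(repeat_count(press_time, initial_delay, repeat_delay)):
        # Repeats start once initial_delay has elapsed, then follow every repeat_delay.
        offsets.append(initial_delay + i * repeat_delay)
    return offsets
```

For example, `press_schedule(2.0, 0.5, 0.25)` gives one initial press plus five repeats, the last at 1.5 s; in the source the release then follows one repeat_delay later, at roughly 1.75 s. A press_time shorter than initial_delay produces no repeats at all.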

Auto-generated documentation¶

desktop plugin 0.3.9.post1 ¶

Desktop environment plugin with mouse, keyboard, and window control

Author: OWA Development Team

Callables ¶

Usage: To use callable components, import CALLABLES from owa.core and access them by their component name:

from owa.core import CALLABLES

# Access a callable component (replace 'component_name' with actual name)
callable_func = CALLABLES["desktop/component_name"]
result = callable_func(your_arguments)

screen.capture ¶

capture_screen() -> ndarray

Capture the current screen as a numpy array.

Returns:

Type	Description
`ndarray`	numpy.ndarray: Screen capture as BGR image array with shape (height, width, 3).

Examples:

>>> screen = capture_screen()
>>> print(f"Screen dimensions: {screen.shape}")  # e.g., (1080, 1920, 3)
>>> # Save to file: cv2.imwrite('screenshot.png', screen)

Source code in projects/owa-env-desktop/owa/env/desktop/screen/callables.py

def capture_screen() -> np.ndarray:
    """
    Capture the current screen as a numpy array.

    Returns:
        numpy.ndarray: Screen capture as BGR image array with shape (height, width, 3).

    Examples:
        >>> screen = capture_screen()
        >>> print(f"Screen dimensions: {screen.shape}")  # e.g., (1080, 1920, 3)
        >>> # Save to file: cv2.imwrite('screenshot.png', screen)
    """
    import bettercam

    camera = bettercam.create()
    return camera.grab()

mouse.click ¶

click(button: str | Button, count: int) -> None

Simulate a mouse click.

Parameters:

Name	Type	Description	Default
`button`	`str | Button`	Mouse button to click. Can be "left", "middle", "right" or a Button enum.	required
`count`	`int`	Number of clicks to perform.	required

Examples:

>>> click("left", 1)  # Single left click
>>> click("right", 2)  # Double right click

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def click(button: str | Button, count: int) -> None:
    """
    Simulate a mouse click.

    Args:
        button: Mouse button to click. Can be "left", "middle", "right" or a Button enum.
        count: Number of clicks to perform.

    Examples:
        >>> click("left", 1)  # Single left click
        >>> click("right", 2)  # Double right click
    """
    if button in ("left", "middle", "right"):
        button = getattr(Button, button)
    return mouse_controller.click(button, count)
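The string-to-enum conversion in click can be looked at in isolation. The sketch below uses a stand-in `Button` enum (an assumption that replaces pynput's own Button type so the example runs without pynput) and a hypothetical `normalize_button` helper that additionally rejects unknown names.

```python
from enum import Enum


class Button(Enum):
    """Stand-in for pynput's mouse Button enum (assumption, for illustration only)."""
    left = 1
    middle = 2
    right = 3


def normalize_button(button):
    """Return a Button member for either a button name or an existing member."""
    if isinstance(button, str):
        try:
            return Button[button]  # look the member up by name
        except KeyError:
            raise ValueError(f"unknown mouse button: {button!r}") from None
    return button
```

The mouse.press and mouse.release sources shown on this page pass `button` straight to the controller, so a helper of this kind could be applied before calling them with a string; whether that is needed depends on the controller and is not confirmed here.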

mouse.move ¶

mouse_move(x: int, y: int) -> None

Move the mouse cursor to specified coordinates.

Parameters:

Name	Type	Description	Default
`x`	`int`	X coordinate to move to.	required
`y`	`int`	Y coordinate to move to.	required

Examples:

>>> mouse_move(100, 200)  # Move mouse to position (100, 200)

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def mouse_move(x: int, y: int) -> None:
    """
    Move the mouse cursor to specified coordinates.

    Args:
        x: X coordinate to move to.
        y: Y coordinate to move to.

    Examples:
        >>> mouse_move(100, 200)  # Move mouse to position (100, 200)
    """
    return mouse_controller.move(x, y)

mouse.position ¶

mouse_position() -> tuple[int, int]

Get the current mouse cursor position.

Returns:

Type	Description
`tuple[int, int]`	Tuple of (x, y) coordinates of the mouse cursor.

Examples:

>>> x, y = mouse_position()
>>> print(f"Mouse is at ({x}, {y})")

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def mouse_position() -> tuple[int, int]:
    """
    Get the current mouse cursor position.

    Returns:
        Tuple of (x, y) coordinates of the mouse cursor.

    Examples:
        >>> x, y = mouse_position()
        >>> print(f"Mouse is at ({x}, {y})")
    """
    return mouse_controller.position

mouse.press ¶

mouse_press(button: str | Button) -> None

Press and hold a mouse button.

Parameters:

Name	Type	Description	Default
`button`	`str | Button`	Mouse button to press. Can be "left", "middle", "right" or a Button enum.	required

Examples:

>>> mouse_press("left")  # Press and hold left mouse button

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def mouse_press(button: str | Button) -> None:
    """
    Press and hold a mouse button.

    Args:
        button: Mouse button to press. Can be "left", "middle", "right" or a Button enum.

    Examples:
        >>> mouse_press("left")  # Press and hold left mouse button
    """
    return mouse_controller.press(button)

mouse.release ¶

mouse_release(button: str | Button) -> None

Release a previously pressed mouse button.

Parameters:

Name	Type	Description	Default
`button`	`str | Button`	Mouse button to release. Can be "left", "middle", "right" or a Button enum.	required

Examples:

>>> mouse_release("left")  # Release left mouse button

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def mouse_release(button: str | Button) -> None:
    """
    Release a previously pressed mouse button.

    Args:
        button: Mouse button to release. Can be "left", "middle", "right" or a Button enum.

    Examples:
        >>> mouse_release("left")  # Release left mouse button
    """
    return mouse_controller.release(button)

mouse.scroll ¶

mouse_scroll(x: int, y: int, dx: int, dy: int) -> None

Simulate mouse wheel scrolling.

Parameters:

Name	Type	Description	Default
`x`	`int`	X coordinate where scrolling occurs.	required
`y`	`int`	Y coordinate where scrolling occurs.	required
`dx`	`int`	Horizontal scroll amount.	required
`dy`	`int`	Vertical scroll amount.	required

Examples:

>>> mouse_scroll(100, 100, 0, 3)  # Scroll up 3 units at position (100, 100)
>>> mouse_scroll(100, 100, 0, -3)  # Scroll down 3 units

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def mouse_scroll(x: int, y: int, dx: int, dy: int) -> None:
    """
    Simulate mouse wheel scrolling.

    Args:
        x: X coordinate where scrolling occurs.
        y: Y coordinate where scrolling occurs.
        dx: Horizontal scroll amount.
        dy: Vertical scroll amount.

    Examples:
        >>> mouse_scroll(100, 100, 0, 3)  # Scroll up 3 units at position (100, 100)
        >>> mouse_scroll(100, 100, 0, -3)  # Scroll down 3 units
    """
    return mouse_controller.scroll(x, y, dx, dy)

mouse.get_state ¶

get_mouse_state() -> MouseState

Get the current mouse state including position and pressed buttons.

Returns:

Type	Description
`MouseState`	MouseState object containing current mouse position and pressed buttons.

Examples:

>>> state = get_mouse_state()
>>> print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def get_mouse_state() -> MouseState:
    """
    Get the current mouse state including position and pressed buttons.

    Returns:
        MouseState object containing current mouse position and pressed buttons.

    Examples:
        >>> state = get_mouse_state()
        >>> print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")
    """
    position = mouse_controller.position
    if position is None:
        position = (-1, -1)  # Fallback if position cannot be retrieved
    mouse_buttons = set()
    buttons = get_vk_state()
    for button, vk in {"left": 1, "right": 2, "middle": 4}.items():
        if vk in buttons:
            mouse_buttons.add(button)
    return MouseState(x=position[0], y=position[1], buttons=mouse_buttons)
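The button lookup inside get_mouse_state can be read on its own. This sketch assumes, as the `vk in buttons` test in the source suggests, that get_vk_state returns a collection of currently pressed virtual-key codes; the codes 1, 2 and 4 are the Windows values VK_LBUTTON, VK_RBUTTON and VK_MBUTTON. The function name `pressed_mouse_buttons` is hypothetical.

```python
# Virtual-key codes taken from the get_mouse_state source above.
MOUSE_BUTTON_VK = {"left": 1, "right": 2, "middle": 4}


def pressed_mouse_buttons(pressed_vks):
    """Return the names of mouse buttons whose virtual-key code is pressed."""
    return {name for name, vk in MOUSE_BUTTON_VK.items() if vk in pressed_vks}
```

Codes that do not belong to a mouse button, such as 65 for the 'A' key, are simply ignored.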

keyboard.press ¶

press(key: str | int) -> None

Press and hold a keyboard key.

Parameters:

Name	Type	Description	Default
`key`	`str | int`	Key to press. Can be a string (e.g., 'a', 'enter') or virtual key code.	required

Examples:

>>> press('a')  # Press and hold the 'a' key
>>> press(65)  # Press and hold the 'a' key using virtual key code

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def press(key: str | int) -> None:
    """
    Press and hold a keyboard key.

    Args:
        key: Key to press. Can be a string (e.g., 'a', 'enter') or virtual key code.

    Examples:
        >>> press('a')  # Press and hold the 'a' key
        >>> press(65)  # Press and hold the 'a' key using virtual key code
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    return keyboard_controller.press(key)

keyboard.release ¶

release(key: str | int) -> None

Release a previously pressed keyboard key.

Parameters:

Name	Type	Description	Default
`key`	`str | int`	Key to release. Can be a string (e.g., 'a', 'enter') or virtual key code.	required

Examples:

>>> release('a')  # Release the 'a' key
>>> release(65)  # Release the 'a' key using virtual key code

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def release(key: str | int) -> None:
    """
    Release a previously pressed keyboard key.

    Args:
        key: Key to release. Can be a string (e.g., 'a', 'enter') or virtual key code.

    Examples:
        >>> release('a')  # Release the 'a' key
        >>> release(65)  # Release the 'a' key using virtual key code
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    return keyboard_controller.release(key)
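Because press holds a key until release is called, a forgotten release leaves the key stuck. One way to pair the two calls is a small context manager; `held` is a hypothetical helper, not part of owa, and it takes the press and release functions as parameters so the sketch runs without any input library.

```python
from contextlib import contextmanager


@contextmanager
def held(key, press, release):
    """Hold `key` for the duration of the with-block, releasing it even on error."""
    press(key)
    try:
        yield key
    finally:
        release(key)  # always runs, including when the block raises
```

Under the assumption that the registry entries are plain callables, usage might look like `with held("shift", CALLABLES["desktop/keyboard.press"], CALLABLES["desktop/keyboard.release"]): ...`.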

keyboard.type ¶

keyboard_type(text: str) -> None

Type a string of characters.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text string to type.	required

Examples:

>>> keyboard_type("Hello, World!")  # Types the text
>>> keyboard_type("user@example.com")  # Types an email address

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def keyboard_type(text: str) -> None:
    """
    Type a string of characters.

    Args:
        text: Text string to type.

    Examples:
        >>> keyboard_type("Hello, World!")  # Types the text
        >>> keyboard_type("user@example.com")  # Types an email address
    """
    return keyboard_controller.type(text)

keyboard.get_state ¶

get_keyboard_state() -> KeyboardState

Get the current keyboard state including pressed keys.

Returns:

Type	Description
`KeyboardState`	KeyboardState object containing currently pressed keys.

Examples:

>>> state = get_keyboard_state()
>>> print(f"Pressed keys: {state.buttons}")

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def get_keyboard_state() -> KeyboardState:
    """
    Get the current keyboard state including pressed keys.

    Returns:
        KeyboardState object containing currently pressed keys.

    Examples:
        >>> state = get_keyboard_state()
        >>> print(f"Pressed keys: {state.buttons}")
    """
    return KeyboardState(buttons=get_vk_state())

keyboard.press_repeat ¶

press_repeat_key(
    key: str | int,
    press_time: float,
    initial_delay: float = 0.5,
    repeat_delay: float = 0.033,
) -> None

Simulate the behavior of holding a key down with auto-repeat.

Parameters:

Name	Type	Description	Default
`key`	`str | int`	Key to press repeatedly. Can be a string or virtual key code.	required
`press_time`	`float`	Total time to hold the key down in seconds.	required
`initial_delay`	`float`	Initial delay before auto-repeat starts (default: 0.5s).	`0.5`
`repeat_delay`	`float`	Delay between repeated key presses (default: 0.033s).	`0.033`

Examples:

>>> press_repeat_key('a', 2.0)  # Hold 'a' key for 2 seconds with auto-repeat
>>> press_repeat_key('space', 1.5, 0.3, 0.05)  # Custom timing for space key

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def press_repeat_key(
    key: str | int, press_time: float, initial_delay: float = 0.5, repeat_delay: float = 0.033
) -> None:
    """
    Simulate the behavior of holding a key down with auto-repeat.

    Args:
        key: Key to press repeatedly. Can be a string or virtual key code.
        press_time: Total time to hold the key down in seconds.
        initial_delay: Initial delay before auto-repeat starts (default: 0.5s).
        repeat_delay: Delay between repeated key presses (default: 0.033s).

    Examples:
        >>> press_repeat_key('a', 2.0)  # Hold 'a' key for 2 seconds with auto-repeat
        >>> press_repeat_key('space', 1.5, 0.3, 0.05)  # Custom timing for space key
    """
    key = vk_to_keycode(key) if isinstance(key, int) else key
    repeat_time = max(0, (press_time - initial_delay) // repeat_delay - 1)

    keyboard_controller.press(key)
    time.sleep(initial_delay)
    for _ in range(int(repeat_time)):
        keyboard_controller.press(key)
        time.sleep(repeat_delay)
    keyboard_controller.release(key)

keyboard.release_all_keys ¶

release_all_keys() -> None

Release all currently pressed keys on the keyboard.

Examples:

>>> release_all_keys()  # Release all pressed keys

Source code in projects/owa-env-desktop/owa/env/desktop/keyboard_mouse/callables.py

def release_all_keys() -> None:
    """
    Release all currently pressed keys on the keyboard.

    Examples:
        >>> release_all_keys()  # Release all pressed keys
    """
    keyboard_state: KeyboardState = get_keyboard_state()
    for key in keyboard_state.buttons:
        release(key)

window.get_active_window ¶

get_active_window() -> WindowInfo | None

Get information about the currently active window.

Returns:

Type	Description
`WindowInfo | None`	WindowInfo object containing title, position, and handle of the active window, or None if no active window is found.

Examples:

>>> window = get_active_window()
>>> if window:
...     print(f"Active window: {window.title}")
...     print(f"Position: {window.rect}")

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def get_active_window() -> WindowInfo | None:
    """
    Get information about the currently active window.

    Returns:
        WindowInfo object containing title, position, and handle of the active window,
        or None if no active window is found.

    Examples:
        >>> window = get_active_window()
        >>> if window:
        ...     print(f"Active window: {window.title}")
        ...     print(f"Position: {window.rect}")
    """
    if _IS_DARWIN:
        from Quartz import (
            CGWindowListCopyWindowInfo,
            kCGNullWindowID,
            kCGWindowListOptionOnScreenOnly,
        )

        windows = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID)
        for window in windows:
            if window.get("kCGWindowLayer", 0) == 0:  # Frontmost window
                bounds = window.get("kCGWindowBounds")
                title = window.get("kCGWindowName", "")
                rect = (
                    int(bounds["X"]),
                    int(bounds["Y"]),
                    int(bounds["X"] + bounds["Width"]),
                    int(bounds["Y"] + bounds["Height"]),
                )
                hWnd = window.get("kCGWindowNumber", 0)
                return WindowInfo(title=title, rect=rect, hWnd=hWnd)
        return None

    elif _IS_WINDOWS:
        import pygetwindow as gw

        active_window = gw.getActiveWindow()
        if active_window is not None:
            rect = active_window._getWindowRect()
            title = active_window.title
            rect_coords = (rect.left, rect.top, rect.right, rect.bottom)
            hWnd = active_window._hWnd
            return WindowInfo(title=title, rect=rect_coords, hWnd=hWnd)
        return WindowInfo(title="", rect=[0, 0, 0, 0], hWnd=-1)
    else:
        raise NotImplementedError(f"Platform {_PLATFORM} is not supported yet")
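Both branches above end up with a rect of the form (left, top, right, bottom). The following sketch isolates the conversion the macOS branch performs on a bounds dictionary with "X", "Y", "Width" and "Height" keys, and adds a hypothetical `rect_size` helper for recovering width and height from such a rect.

```python
def bounds_to_rect(bounds):
    """Convert {"X", "Y", "Width", "Height"} bounds to (left, top, right, bottom)."""
    return (
        int(bounds["X"]),
        int(bounds["Y"]),
        int(bounds["X"] + bounds["Width"]),
        int(bounds["Y"] + bounds["Height"]),
    )


def rect_size(rect):
    """Return (width, height) of a (left, top, right, bottom) rect."""
    left, top, right, bottom = rect
    return (right - left, bottom - top)
```

Note that right and bottom are coordinates, not sizes, so `window.rect[2]` is not the window width.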

window.get_window_by_title ¶

get_window_by_title(
    window_title_substring: str,
) -> WindowInfo

Find a window by searching for a substring in its title.

Parameters:

Name	Type	Description	Default
`window_title_substring`	`str`	Substring to search for in window titles.	required

Returns:

Type	Description
`WindowInfo`	WindowInfo object for the first matching window.

Raises:

Type	Description
`ValueError`	If no window with matching title is found.

Examples:

>>> window = get_window_by_title("notepad")
>>> print(f"Found window: {window.title}")

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def get_window_by_title(window_title_substring: str) -> WindowInfo:
    """
    Find a window by searching for a substring in its title.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        WindowInfo object for the first matching window.

    Raises:
        ValueError: If no window with matching title is found.

    Examples:
        >>> window = get_window_by_title("notepad")
        >>> print(f"Found window: {window.title}")
    """
    if _IS_WINDOWS:
        import pygetwindow as gw

        windows = gw.getWindowsWithTitle(window_title_substring)
        if not windows:
            raise ValueError(f"No window with title containing '{window_title_substring}' found.")

        # Temporary workaround for `cmd`'s behavior: it sets its own title to the command it is running,
        # e.g. `owl window find abcd` would always match the `cmd` window running that command.
        if "Conda" in windows[0].title:
            windows.pop(0)

        window = windows[0]  # NOTE: only return the first window matching the title
        rect = window._getWindowRect()
        return WindowInfo(
            title=window.title,
            rect=(rect.left, rect.top, rect.right, rect.bottom),
            hWnd=window._hWnd,
        )

    elif _IS_DARWIN:
        from Quartz import CGWindowListCopyWindowInfo, kCGNullWindowID, kCGWindowLayer, kCGWindowListOptionOnScreenOnly

        windows = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID)
        for window in windows:
            # Skip windows that are not on normal level (like menu bars, etc)
            if window.get(kCGWindowLayer, 0) != 0:
                continue

            # Get window name from either kCGWindowName or kCGWindowOwnerName
            title = window.get("kCGWindowName", "")
            if not title:
                title = window.get("kCGWindowOwnerName", "")

            if title and window_title_substring.lower() in title.lower():
                bounds = window.get("kCGWindowBounds")
                if bounds:
                    return WindowInfo(
                        title=title,
                        rect=(
                            int(bounds["X"]),
                            int(bounds["Y"]),
                            int(bounds["X"] + bounds["Width"]),
                            int(bounds["Y"] + bounds["Height"]),
                        ),
                        hWnd=window.get("kCGWindowNumber", 0),
                    )

        raise ValueError(f"No window with title containing '{window_title_substring}' found.")
    else:
        # Linux or other OS (not implemented yet)
        raise NotImplementedError("Not implemented for Linux or other OS.")
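The matching rule in the macOS branch (case-insensitive substring, first match wins, ValueError when nothing matches) can be tried on a plain list of titles. `find_first_title` is a hypothetical helper written for this illustration; how the Windows branch treats letter case depends on pygetwindow and is not covered by this sketch.

```python
def find_first_title(titles, substring):
    """Return the first non-empty title containing `substring`, ignoring case."""
    needle = substring.lower()
    for title in titles:
        if title and needle in title.lower():
            return title
    raise ValueError(f"No window with title containing '{substring}' found.")
```

Because only the first match is returned, a short substring such as "note" may select a different window than intended when several titles contain it.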

window.get_pid_by_title ¶

get_pid_by_title(window_title_substring: str) -> int

Get the process ID (PID) of a window by its title.

Parameters:

Name	Type	Description	Default
`window_title_substring`	`str`	Substring to search for in window titles.	required

Returns:

Type	Description
`int`	Process ID of the window.

Examples:

>>> pid = get_pid_by_title("notepad")
>>> print(f"Notepad PID: {pid}")

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def get_pid_by_title(window_title_substring: str) -> int:
    """
    Get the process ID (PID) of a window by its title.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        Process ID of the window.

    Examples:
        >>> pid = get_pid_by_title("notepad")
        >>> print(f"Notepad PID: {pid}")
    """
    window = get_window_by_title(window_title_substring)
    if _IS_WINDOWS:
        import win32process

        # win32process.GetWindowThreadProcessId returns (tid, pid)
        _, pid = win32process.GetWindowThreadProcessId(window.hWnd)
        return pid
    else:
        # Implement if needed for other OS
        raise NotImplementedError(f"Getting PID by title not implemented for {_PLATFORM}")

window.when_active ¶

when_active(window_title_substring: str) -> Callable

Decorator to run a function only when a specific window is active.

Parameters:

Name	Type	Description	Default
`window_title_substring`	`str`	Substring to search for in window titles.	required

Returns:

Type	Description
`Callable`	Decorator function that conditionally executes the wrapped function.

Examples:

>>> @when_active("notepad")
... def do_something():
...     print("Notepad is active!")

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def when_active(window_title_substring: str) -> Callable:
    """
    Decorator to run a function only when a specific window is active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        Decorator function that conditionally executes the wrapped function.

    Examples:
        >>> @when_active("notepad")
        ... def do_something():
        ...     print("Notepad is active!")
    """

    def decorator(func):
        def wrapper(*args, **kwargs):
            if is_active(window_title_substring):
                return func(*args, **kwargs)

        return wrapper

    return decorator
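A consequence of this wrapper is that the decorated function returns None whenever the window is not active. The sketch below reproduces the same pattern with an injectable predicate so it can be run anywhere; the name `when` is hypothetical, and `functools.wraps` is an addition that the source above does not use.

```python
import functools


def when(predicate):
    """Decorator factory: call the wrapped function only if predicate() is true."""

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            if predicate():
                return func(*args, **kwargs)
            return None  # skipped calls return None, as in when_active

        return wrapper

    return decorator
```

Callers that need to distinguish "skipped" from "ran and returned None" would have to check the window state themselves, for example with is_active.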

window.is_active ¶

is_active(window_title_substring: str) -> bool

Check if a window with the specified title substring is currently active.

Parameters:

Name	Type	Description	Default
`window_title_substring`	`str`	Substring to search for in window titles.	required

Returns:

Type	Description
`bool`	True if the window is active, False otherwise.

Examples:

>>> if is_active("notepad"):
...     print("Notepad is the active window")

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def is_active(window_title_substring: str) -> bool:
    """
    Check if a window with the specified title substring is currently active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Returns:
        True if the window is active, False otherwise.

    Examples:
        >>> if is_active("notepad"):
        ...     print("Notepad is the active window")
    """
    try:
        window = get_window_by_title(window_title_substring)
    except ValueError:
        return False
    active = get_active_window()
    return active is not None and active.hWnd == window.hWnd

window.make_active ¶

make_active(window_title_substring: str) -> None

Bring a window to the foreground and make it active.

Parameters:

Name	Type	Description	Default
`window_title_substring`	`str`	Substring to search for in window titles.	required

Raises:

Type	Description
`ValueError`	If no window with matching title is found.
`NotImplementedError`	If the operation is not supported on the current OS.

Examples:

>>> make_active("notepad")  # Brings notepad window to front

Source code in projects/owa-env-desktop/owa/env/desktop/window/callables.py

def make_active(window_title_substring: str) -> None:
    """
    Bring a window to the foreground and make it active.

    Args:
        window_title_substring: Substring to search for in window titles.

    Raises:
        ValueError: If no window with matching title is found.
        NotImplementedError: If the operation is not supported on the current OS.

    Examples:
        >>> make_active("notepad")  # Brings notepad window to front
    """

    os_name = platform.system()
    if os_name == "Windows":
        import pygetwindow as gw

        windows = gw.getWindowsWithTitle(window_title_substring)
        if not windows:
            raise ValueError(f"No window with title containing '{window_title_substring}' found.")

        # Temporary workaround for `cmd`'s behavior: it sets its own title to the command it is running,
        # e.g. `owl window find abcd` would always match the `cmd` window running that command.
        if "Conda" in windows[0].title:
            windows.pop(0)

        window = windows[0]  # NOTE: only return the first window matching the title
        window.activate()
    else:
        raise NotImplementedError(f"Activation not implemented for this OS: {os_name}")

Listeners ¶

Usage: To use listener components, import LISTENERS from owa.core and call the configure() method with a callback function:

from owa.core import LISTENERS

# Configure a listener component (replace 'component_name' with actual name)
listener = LISTENERS["desktop/component_name"]
listener.configure(callback=my_callback)  # add any listener-specific keyword arguments here

# Use the listener in a context manager
with listener.session as active_listener:
    # The listener is now running and will call my_callback when events occur
    pass  # Your main code here

Note: The callback argument is required. The on_configure() method shown in the documentation is an internal method called by configure().
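A callback for configure() often only needs to hand events over to the main program. The sketch below assumes that callbacks may be invoked from a background thread (suggested by the start/stop/join calls in the examples on this page, but not stated outright) and therefore stores events in a thread-safe queue. `make_collector` and `drain` are hypothetical helpers.

```python
import queue


def make_collector(maxsize=0):
    """Return (callback, events): the callback stores each event in a queue."""
    events = queue.Queue(maxsize=maxsize)  # maxsize=0 means unbounded

    def on_event(event):
        try:
            events.put_nowait(event)
        except queue.Full:
            pass  # drop the event rather than block the calling thread

    return on_event, events


def drain(events):
    """Remove and return every event currently waiting in the queue."""
    items = []
    while True:
        try:
            items.append(events.get_nowait())
        except queue.Empty:
            return items
```

The returned `on_event` would be passed as the `callback` argument, and the main loop can call `drain(events)` whenever it is ready to process input.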

keyboard ¶

Bases: Listener

Keyboard event listener that captures key press and release events.

This listener wraps pynput's KeyboardListener to provide keyboard event monitoring with OWA's listener interface.

Examples:

>>> def on_key_event(event):
...     print(f"Key {event.vk} was {event.event_type}")
>>> listener = KeyboardListenerWrapper().configure(callback=on_key_event)
>>> listener.start()

mouse ¶

Bases: Listener

Mouse event listener that captures mouse movement, clicks, and scroll events.

This listener wraps pynput's MouseListener to provide mouse event monitoring with OWA's listener interface.

Examples:

>>> def on_mouse_event(event):
...     print(f"Mouse {event.event_type} at ({event.x}, {event.y})")
>>> listener = MouseListenerWrapper().configure(callback=on_mouse_event)
>>> listener.start()

keyboard_state ¶

Bases: Listener

Periodically reports the current keyboard state.

This listener calls the callback function every second with the current keyboard state, including which keys are currently pressed.

Examples:

>>> def on_keyboard_state(state):
...     if state.buttons:
...         print(f"Keys pressed: {state.buttons}")
>>> listener = KeyboardStateListener().configure(callback=on_keyboard_state)
>>> listener.start()

mouse_state ¶

Bases: Listener

Periodically reports the current mouse state.

This listener calls the callback function every second with the current mouse state, including position and pressed buttons.

Examples:

>>> def on_mouse_state(state):
...     print(f"Mouse at ({state.x}, {state.y}), buttons: {state.buttons}")
>>> listener = MouseStateListener().configure(callback=on_mouse_state)
>>> listener.start()

window ¶

Bases: Listener

Periodically monitors and reports the currently active window.

This listener calls the callback function every second with information about the currently active window, including title, position, and handle.

Examples:

Monitor active window changes:

>>> def on_window_change(window):
...     if window:
...         print(f"Active window: {window.title}")
>>>
>>> listener = WindowListener().configure(callback=on_window_change)
>>> listener.start()
>>> # ... listener runs in background ...
>>> listener.stop()
>>> listener.join()

Track window focus for automation:

>>> def track_focus(window):
...     if window and "notepad" in window.title.lower():
...         print("Notepad is now active!")
>>>
>>> listener = WindowListener().configure(callback=track_focus)
>>> listener.start()
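Since this listener reports the active window periodically, a callback sees the same window many times in a row. A small wrapper can forward only the reports where something changed. `only_on_change` is a hypothetical helper; by default it compares the `title` attribute and treats a missing window (None) as its own state.

```python
def only_on_change(callback, key=lambda window: getattr(window, "title", None)):
    """Wrap `callback` so it fires only when key(window) differs from the last report."""
    last = object()  # sentinel that never equals a real key value

    def wrapper(window):
        nonlocal last
        current = key(window)
        if current != last:
            last = current
            callback(window)

    return wrapper
```

A wrapped function such as `only_on_change(on_window_change)` could then be passed as the callback; to react to handle changes instead of title changes, a different `key` function can be supplied.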