MCP Directory

Cua Driver

Background computer-use MCP server that drives native macOS, Windows, and Linux desktop apps without stealing focus.

Unverified
stdio (local)
No auth
Rust

Add to your client

Copy the config for your MCP client and paste it into its config file.

Install / run
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "cua-driver": {
      "command": "cua-driver",
      "args": [
        "mcp"
      ]
    }
  }
}
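
If the config file already lists other servers, pasting the block above over it would drop them. The sketch below merges the same `cua-driver` entry into an existing config instead. The helper names `merge_cua_driver` and `update_config_file` are hypothetical and not part of Cua Driver; only the entry itself is taken from the config above.

```python
import json
from pathlib import Path


def merge_cua_driver(config: dict) -> dict:
    """Return a copy of an MCP client config with the cua-driver entry added.

    Other keys, including other entries under "mcpServers", are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Entry copied from the config shown above.
    servers["cua-driver"] = {"command": "cua-driver", "args": ["mcp"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> dict:
    """Hypothetical helper: read, merge, and rewrite a client config file."""
    config = {}
    if path.exists() and path.read_text(encoding="utf-8").strip():
        config = json.loads(path.read_text(encoding="utf-8"))
    merged = merge_cua_driver(config)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

For Claude Desktop on macOS, `update_config_file` would be called with `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"`. Back up the file first, since the sketch rewrites it in place.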

Step-by-step guides: Add to Claude Desktop · Add to Cursor · Add to Windsurf

Before you start

  • macOS 14 (Sonoma) or newer for the macOS backend; Windows and pre-release Linux backends are also supported
  • cua-driver CLI + CuaDriver.app installed via the install.sh / install.ps1 one-liner
  • macOS: Accessibility and Screen Recording permissions granted to CuaDriver.app (verify with `cua-driver check_permissions`)
  • An MCP client such as Claude Code, Cursor, Codex, or a custom client

About Cua Driver

Cua Driver exposes native-desktop control as MCP tools so coding agents can operate real GUI apps in the background. Register it with Claude Code via `claude mcp add --transport stdio cua-driver -- cua-driver mcp`. The core workflow has four steps: launch an app or list its windows, call get_window_state to snapshot the accessibility tree (and, optionally, a screenshot), act on a target by element_index or pixel coordinates (click, type, scroll, hotkey), then re-snapshot to verify the result. The backgrounded-first design means synthetic input lands without raising the window, moving the user's cursor, or switching Spaces.
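
As a rough illustration of the snapshot, act, re-snapshot loop, the sketch below builds the three MCP `tools/call` requests a client would send for a single click. The JSON-RPC envelope follows the MCP specification. The argument names `pid` and `window_id` are assumptions borrowed from the screenshot tool's description and may not match every tool's actual schema, and the helpers `tool_call`, `click_and_verify`, and `to_wire` are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name: str, **arguments) -> dict:
    """Build one MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def click_and_verify(pid: int, window_id: int, element_index: int) -> list:
    """Snapshot the window, click one indexed element, then snapshot again."""
    target = {"pid": pid, "window_id": window_id}  # assumed argument names
    return [
        tool_call("get_window_state", **target),
        tool_call("click", element_index=element_index, **target),
        tool_call("get_window_state", **target),
    ]


def to_wire(message: dict) -> str:
    """Serialize for the stdio transport: one JSON object per line."""
    return json.dumps(message) + "\n"
```

A real client would complete the MCP initialize handshake before sending these, and would read element_index from the first snapshot's response rather than hard-coding it.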

Tools & capabilities (33)

get_window_state

Snapshot a window's accessibility tree (Set-of-Mark markdown with each actionable element tagged [N]) plus an optional screenshot; required before acting by element_index.

get_desktop_state

Capture the overall desktop / multi-window state.

list_apps

App-level discovery: list installed / running applications.

list_windows

List the windows for a given pid.

launch_app

Launch (or focus) an app by bundle id; idempotent and returns the pid plus a windows array.

kill_app

Terminate a running application by pid.

bring_to_front

Bring an app/window to the foreground when foreground interaction is needed.

click

Click a target by element_index (AX/UIA) or by pixel x/y within a window, without raising it.

double_click

Double-click a target by element_index or pixel coordinates.

right_click

Right-click (context-menu) a target by element_index or pixel coordinates.

drag

Perform a drag gesture between coordinates / elements.

type_text

Type a string into the focused element / window.

press_key

Press a single key (e.g. Enter or Go to commit a field).

hotkey

Send a keyboard shortcut / modifier chord.

set_value

Set an element's value directly (workaround for apps where keyboard commit is a no-op, e.g. minimized Chrome).

scroll

Scroll within a window or element.

get_accessibility_tree

Return the raw accessibility tree for a window.

get_screen_size

Return the screen dimensions.

get_cursor_position

Return the current cursor position.

move_cursor

Move the cursor to a position.

zoom

Zoom into a region of a window capture for finer inspection.

page

Cross-platform paging helper for navigating large captures/content.

screenshot

Capture a screenshot; in Claude Code compat mode it requires pid and window_id and captures only that window.

check_permissions

Report whether Accessibility and Screen Recording TCC grants are enabled for CuaDriver.

get_config

Read driver configuration (e.g. capture_mode som/vision).

set_config

Set driver configuration such as capture_mode or capture_scope.

set_agent_cursor_enabled

Toggle the on-screen agent cursor overlay.

set_agent_cursor_motion

Configure agent cursor motion/animation.

set_agent_cursor_style

Configure the agent cursor style.

get_agent_cursor_state

Read the current agent cursor state.

health_report

Return a driver health/diagnostics report.

check_for_update

Read-only probe for whether a newer Cua Driver release is available.

replay_trajectory

Replay a previously recorded interaction trajectory for demos and regression tests (part of the recording/replay tools).
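
The click, double_click, and right_click entries above each accept a target given either as an element_index or as pixel x/y coordinates. A minimal sketch of how a client might validate that choice before calling the tool is shown below. The function `click_arguments` is hypothetical, the `pid` and `window_id` argument names are assumptions, and treating the two addressing modes as mutually exclusive is also an assumption rather than documented driver behavior.

```python
from typing import Optional


def click_arguments(
    pid: int,
    window_id: int,
    element_index: Optional[int] = None,
    x: Optional[int] = None,
    y: Optional[int] = None,
) -> dict:
    """Build arguments for a click-style tool from exactly one target mode."""
    by_index = element_index is not None
    by_pixel = x is not None or y is not None
    if by_index == by_pixel:
        # Either both modes were given or neither was.
        raise ValueError("pass either element_index or x and y, not both or neither")
    if by_pixel and (x is None or y is None):
        raise ValueError("pixel targets need both x and y")

    arguments = {"pid": pid, "window_id": window_id}  # assumed argument names
    if by_index:
        arguments["element_index"] = element_index
    else:
        arguments.update({"x": x, "y": y})
    return arguments
```

Rejecting ambiguous targets on the client side keeps a stale element_index from silently overriding coordinates, or the reverse.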

When to use it

  • Give a coding agent (Claude Code, Cursor, Codex) the ability to operate native desktop apps while you keep using your computer.
  • Automate GUI tasks in apps without an API — Finder, Numbers, browsers, Electron/Tauri apps — by clicking and typing on the accessibility tree.
  • Run background desktop automation that doesn't steal focus or move the cursor across Spaces.
  • Record and replay interaction trajectories for demos or regression testing.

Security notes

Cua Driver runs locally and drives the user's real desktop. On macOS it ships as an .app bundle (com.trycua.driver) and requires Accessibility and Screen Recording TCC grants in System Settings -> Privacy & Security; verify with `cua-driver check_permissions`. The macOS backend requires macOS 14 (Sonoma) or newer. Once those permissions are granted, the driver can control applications on the desktop, so only run it on machines where you trust the connected agent. No API key or remote auth is used.

Cua Driver FAQ

How do I add it to Claude Code?

Install the driver with the install.sh one-liner, then run `claude mcp add --transport stdio cua-driver -- cua-driver mcp`. For Claude Code's vision/computer-use grounding, register the compat server instead: `claude mcp add --transport stdio cua-computer-use -- cua-driver mcp --claude-code-computer-use-compat`.

Which platforms are supported?

macOS and Windows are supported with the same CLI/MCP server; Linux is available as a pre-release backend while platform testing continues. The macOS backend requires macOS 14 (Sonoma) or newer.

What permissions does it need?

On macOS, CuaDriver.app needs Accessibility and Screen Recording grants in System Settings -> Privacy & Security. Verify both are true with `cua-driver check_permissions`. No API key is required.

Does it steal my mouse or focus?

No. Cua Driver is backgrounded-first: synthetic clicks and keystrokes land on the target window without raising it, warping the cursor, or pulling you across Spaces.

Alternatives to Cua Driver


AI-powered task-management system for AI-driven development that drops into Cursor, Windsurf, Claude Code, and more.

Unverified
stdio (local)
API key
JavaScript
15 tools
Updated 2 months ago

Self-hosted MCP server for Jira and Confluence Cloud and Server/Data Center.

Verified
stdio (local)
API key
Python
11 tools
Updated 2 months ago

Create, read, and modify Excel workbooks with your AI agent — no Microsoft Excel required.

Unverified
stdio (local)
No auth
Python
25 tools
Updated 2 months ago