A Chrome extension and CLI that let your agents control your actual browser, with logins, extensions, and cookies already there. No headless instance, no bot detection, no extra memory.

Other browser MCPs spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors, double the memory. RunBrowser connects to your running browser instead. One Chrome extension, full CDP access, everything you're already logged into.

Getting started

Three steps and your agent is browsing.

Install the CLI:

npm i -g @jiweiyuan/runbrowser

Load the extension in Chrome: open chrome://extensions/, enable Developer mode, click Load unpacked, select the packages/extension/dist folder
Click the extension icon on a tab — it turns green. Start using it:

runbrowser navigate https://example.com
runbrowser snapshot
runbrowser click @e5

No session management needed — sessions are auto-created on first command. The extension connects your browser to a local WebSocket relay on localhost:19988. The CLI sends CDP commands through the relay. No remote servers, no accounts, nothing leaves your machine.

Then install the skill — it teaches your agent how to use RunBrowser:

npx -y skills add runbrowser/runbrowser

Extension icon green = connected. Gray = not attached to this tab.

How it works

Click the extension icon on a tab: the extension attaches via chrome.debugger and opens a WebSocket to a local relay. Your agent (CLI or MCP) connects to the same relay. CDP commands flow through the relay and the extension straight to Chrome, with no Playwright layer in between.

┌─────────────────────┐     ┌──────────────────────┐     ┌─────────────────┐
│   BROWSER           │     │   LOCALHOST          │     │   CLIENT        │
│                     │     │                      │     │                 │
│  ┌───────────────┐  │     │ WebSocket Server     │     │  ┌───────────┐  │
│  │   Extension   │<───────┬───>  :19988          │     │  │ CLI / MCP │  │
│  └───────┬───────┘  │ WS  │                      │     │  └───────────┘  │
│          │          │     │  /extension          │     │        │        │
│    chrome.debugger  │     │       │              │     │        v        │
│          v          │     │       v              │     │  ┌────────────┐ │
│  ┌───────────────┐  │     │  CDPExecutor         │     │  │  HTTP API  │ │
│  │ Tab 1 (green) │  │     │  (direct CDP)        │     │  │  /api/*    │ │
│  │ Tab 2 (green) │  │     └──────────────────────┘     │  └────────────┘ │
│  │ Tab 3 (gray)  │  │                                  │                 │
└─────────────────────┘     Tab 3 not controlled         └─────────────────┘

The relay multiplexes sessions, so multiple agents or CLI instances can work with the same browser at the same time. All browser commands are executed directly via CDP — no Playwright dependency.
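One way a relay can let several clients share a single debugger connection is to rewrite CDP message ids. Every CDP command carries a numeric id that the response echoes back, so a relay can assign its own ids upstream and translate them back per session. The sketch below illustrates only that idea; SessionMux, its method names, and the session ids are hypothetical and not taken from RunBrowser's source.

```typescript
// Hypothetical sketch of id remapping for multiplexed CDP sessions.
// CDP commands are JSON objects of the form { id, method, params };
// responses echo the same id. Everything else here is illustrative.

interface CdpCommand {
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface CdpResponse {
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

interface Pending {
  sessionId: string;
  clientId: number;
}

class SessionMux {
  private nextUpstreamId = 1;
  private pending = new Map<number, Pending>();

  // Rewrite a client command so its id is unique across all sessions.
  toUpstream(sessionId: string, cmd: CdpCommand): CdpCommand {
    const upstreamId = this.nextUpstreamId++;
    this.pending.set(upstreamId, { sessionId, clientId: cmd.id });
    return { ...cmd, id: upstreamId };
  }

  // Map a browser response back to the session and id that issued it.
  // Returns undefined for ids the mux never handed out.
  toClient(res: CdpResponse): { sessionId: string; response: CdpResponse } | undefined {
    const entry = this.pending.get(res.id);
    if (!entry) return undefined;
    this.pending.delete(res.id);
    return { sessionId: entry.sessionId, response: { ...res, id: entry.clientId } };
  }
}

// Two sessions both use id 1 locally without colliding upstream.
const mux = new SessionMux();
const a = mux.toUpstream("1", { id: 1, method: "Page.reload" });
const b = mux.toUpstream("2", { id: 1, method: "Page.captureScreenshot" });
const routed = mux.toClient({ id: b.id, result: {} });
console.log(a.id, b.id, routed?.sessionId, routed?.response.id); // 1 2 2 1
```

Keeping the mapping in one place means clients never need to coordinate id ranges with each other.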

Collaboration

Because the agent works in your browser, you can collaborate. You see everything it does in real time — every click gets a green highlight on the target element, every command flashes a subtle border glow. When it hits a captcha, you solve it. When a consent wall appears, you click through it. When the agent gets stuck, you disable the extension on that tab, fix things manually, re-enable it, and the agent picks up where it left off.

You're not watching a remote screen or reading logs after the fact. You're sharing a browser — the agent does the repetitive work, you step in when it needs a human.

Accessibility snapshots

Your agent needs to see the page before it can act. Accessibility snapshots return the page's accessibility tree as text, with an @ref label on each element the agent can target. A snapshot is typically 5–20KB, versus 100KB+ for a screenshot, so it is cheaper and faster, and the agent can parse it without vision.

runbrowser snapshot

# Output:
# - banner:
#     - link "Home" @e1
#     - navigation:
#         - link "Docs" @e2
#         - link "Blog" @e3
# - main:
#     - heading "Welcome" @e4
#     - button "Get started" @e5

Each element line ends with a @ref you can pass directly to commands (container lines such as banner: carry none). Use snapshots as the primary way to read pages. Only reach for screenshots when spatial layout matters.

# Interact using refs from snapshot
runbrowser click @e5
runbrowser fill @e3 "search term"
runbrowser get text @e4

# Filter to interactive elements only
runbrowser snapshot -i

# Scope to a CSS selector
runbrowser snapshot -S "main"
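
If you post-process snapshots in your own tooling, the refs are easy to extract. The following is a minimal sketch that assumes the line shape seen in the sample output earlier in this section (dash, role, quoted name, @ref, four-space indentation); that shape is inferred from the sample rather than from a published format, and parseSnapshot is a hypothetical helper.

```typescript
// Sketch: pull @refs out of snapshot text shaped like the sample output.
// The line format (- role "name" @ref) is inferred from that sample, not from a spec.

interface SnapshotNode {
  role: string;
  name: string;
  ref: string;
  depth: number; // nesting level, derived from indentation
}

function parseSnapshot(text: string, indentWidth = 4): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  const pattern = /^(\s*)- (\w+) "((?:[^"\\]|\\.)*)" (@e\d+)\s*$/;
  for (const line of text.split("\n")) {
    const match = pattern.exec(line);
    if (!match) continue; // containers such as "- banner:" carry no ref
    const [, indent, role, name, ref] = match;
    nodes.push({ role, name, ref, depth: Math.floor(indent.length / indentWidth) });
  }
  return nodes;
}

const sample = [
  "- banner:",
  '    - link "Home" @e1',
  "    - navigation:",
  '        - link "Docs" @e2',
  '        - link "Blog" @e3',
  "- main:",
  '    - heading "Welcome" @e4',
  '    - button "Get started" @e5',
].join("\n");

const nodes = parseSnapshot(sample);
const button = nodes.find((n) => n.role === "button" && n.name === "Get started");
console.log(button?.ref); // @e5
```

Looking elements up by role and name on each fresh snapshot is one way to avoid depending on a ref number from an earlier snapshot.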

CLI commands

RunBrowser provides 50+ commands organized by category. Every command supports --json output and auto-session creation.

# Navigation
runbrowser navigate https://example.com
runbrowser back
runbrowser forward
runbrowser reload
runbrowser close                           # close session

# Observation
runbrowser snapshot                        # accessibility tree with @refs
runbrowser snapshot -i                     # interactive elements only
runbrowser screenshot shot.png             # save screenshot
runbrowser get url                         # current URL
runbrowser get title                       # page title
runbrowser get text @e5                    # element text content
runbrowser get html @e5                    # element HTML
runbrowser get attr @e5 --attr-name href   # attribute value
runbrowser get count @e5                   # count matching elements
runbrowser is visible @e5                  # check element state
runbrowser is checked @e5
runbrowser is enabled @e5

# Interaction
runbrowser click @e5                       # click element
runbrowser dblclick @e5                    # double-click
runbrowser fill @e3 "hello world"          # clear + fill input
runbrowser type "search query"             # type at current focus
runbrowser press Enter                     # press key
runbrowser select @e5 "option-value"       # select dropdown option
runbrowser check @e5                       # check checkbox
runbrowser uncheck @e5                     # uncheck checkbox
runbrowser scroll down                     # scroll direction
runbrowser scroll down 500                 # scroll by pixels
runbrowser hover @e5                       # hover element
runbrowser focus @e5                       # focus element
runbrowser upload @e5 ./file.png           # upload files
runbrowser drag @e1 @e2                    # drag source to target
runbrowser viewport 1280 720               # set viewport size

# Wait conditions
runbrowser wait @e5                        # wait for element visible
runbrowser wait 2000                       # wait milliseconds
runbrowser wait --text "Welcome"           # wait for text
runbrowser wait --url "**/dashboard"       # wait for URL pattern
runbrowser wait --load networkidle         # wait for load state

# Semantic locators
runbrowser find role button click --name "Submit"
runbrowser find text "Sign in" click
runbrowser find label "Email" fill "user@example.com"

# Tab & frame management
runbrowser tab list                        # list all tabs
runbrowser tab new https://example.com     # open new tab
runbrowser tab 2                           # switch to tab index
runbrowser frame "iframe#embed"            # switch to iframe
runbrowser frame main                      # return to main frame

# Execution
runbrowser eval 'document.title'           # run JS in browser
runbrowser cdp Page.captureScreenshot '{}' # raw CDP command

Flat commands for hot path. Subgroups for management. eval for anything else.
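
When driving the CLI from a script rather than a shell, passing arguments as an array sidesteps quoting problems. The helper below is a small hypothetical sketch; the only flags it uses are --json and -s, both of which appear in this document.

```typescript
// Sketch: build an argument vector for the runbrowser CLI.
// Only --json and -s appear in this document; the helper itself is hypothetical.

interface RunOptions {
  json?: boolean;
  session?: number;
}

function buildArgs(command: string[], options: RunOptions = {}): string[] {
  const args = [...command];
  if (options.session !== undefined) args.push("-s", String(options.session));
  if (options.json) args.push("--json");
  return args;
}

const argv = buildArgs(["get", "text", "@e5"], { json: true, session: 1 });
console.log(argv.join(" ")); // get text @e5 -s 1 --json
```

The resulting array can be handed to Node's child_process.execFile with "runbrowser" as the executable, so values containing spaces need no extra escaping.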

Site commands

Turn any website into a CLI command. Site commands are TypeScript plugins that encapsulate navigation, scraping, and data extraction into reusable commands. Instead of telling your agent to navigate, snapshot, and parse — run one command and get structured JSON back.

# Get GitHub trending repos as structured data
runbrowser github trending --limit 5

# RANK  NAME                 STARS   LANGUAGE
# ---   ----                 -----   --------
# 1     denoland/deno        5.2k    Rust
# 2     tauri-apps/tauri     3.8k    Rust
# 3     nickel-org/nickel    2.1k    Rust

# JSON output for agents
runbrowser github trending --limit 3 --json

Write your own by dropping a .ts file into ~/.runbrowser/commands/. Full TypeScript, full IDE support, no YAML wrappers.

// ~/.runbrowser/commands/github/trending.ts

export const description = 'GitHub trending repositories'
export const columns = ['rank', 'name', 'stars', 'language']

export const args = {
  limit: { type: 'number', default: 20, description: 'Number of items' },
  language: { type: 'string', description: 'Filter by language' },
}

export async function run(ctx, args) {
  await ctx.navigate('https://github.com/trending')
  const data = await ctx.evaluate(`
    [...document.querySelectorAll('article.Box-row')].map(el => ({
      name: el.querySelector('h2 a')?.textContent?.trim(),
      stars: el.querySelector('.octicon-star')?.parentElement?.textContent?.trim(),
      language: el.querySelector('[itemprop="programmingLanguage"]')?.textContent?.trim(),
    }))
  `)
  return data.slice(0, args.limit).map((item, i) => ({ rank: i + 1, ...item }))
}

Commands are loaded by the relay server via jiti — the CLI is just an HTTP client. Output supports --json, --csv, table, markdown, and YAML formats.

TypeScript plugins. Structured data. One command instead of navigate → snapshot → parse.
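
Because a site command is an exported run(ctx, args) function, its slicing and shaping logic can be exercised without a browser by passing a stub context. The sketch below assumes a CommandContext shape inferred from the ctx.navigate and ctx.evaluate calls in the trending example; the real interface may expose more or differ, and the repository names are made-up test data.

```typescript
// Sketch: exercising a site command's run() against a stub context.
// CommandContext is an assumed shape, inferred from the ctx.navigate and
// ctx.evaluate calls in the trending example; the real interface may differ.

interface CommandContext {
  navigate(url: string): Promise<void>;
  evaluate(script: string): Promise<unknown>;
}

interface Repo {
  name: string;
  stars: string;
  language: string;
}

async function run(ctx: CommandContext, args: { limit: number }) {
  await ctx.navigate("https://github.com/trending");
  const data = (await ctx.evaluate("/* page script elided */")) as Repo[];
  return data.slice(0, args.limit).map((item, i) => ({ rank: i + 1, ...item }));
}

// Stub context: records navigation and returns canned rows instead of touching a browser.
function makeStubContext(rows: Repo[]) {
  const visited: string[] = [];
  const ctx: CommandContext = {
    navigate: async (url) => {
      visited.push(url);
    },
    evaluate: async () => rows,
  };
  return { ctx, visited };
}

const { ctx, visited } = makeStubContext([
  { name: "example/one", stars: "10", language: "Rust" },
  { name: "example/two", stars: "8", language: "Go" },
  { name: "example/three", stars: "5", language: "C" },
]);

const resultPromise = run(ctx, { limit: 2 });
resultPromise.then((rows) => console.log(rows.map((r) => `${r.rank} ${r.name}`).join(", ")));
```

A stub like this only checks the post-processing; the selectors inside the evaluated script still need a real page to validate.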

Command extensions

Install community-maintained commands from the runbrowser/commands repo. No cloning, no build steps — just install and use.

# List available command extensions
runbrowser commands list

# Available command extensions:
#
#   reddit
#   youtube
#   x ✓ installed
#   hackernews
#   producthunt

# Install an extension
runbrowser commands install reddit
# ✓ Installed reddit/
#   → ~/.runbrowser/commands/reddit/hot.ts
#   → ~/.runbrowser/commands/reddit/search.ts

# Use it immediately
runbrowser reddit hot --limit 5
runbrowser reddit search "browser automation"

# Uninstall
runbrowser commands uninstall reddit
# ✓ Uninstalled reddit/

Community commands are downloaded as TypeScript files into ~/.runbrowser/commands/<site>/. They follow the same format as custom site commands — you can read, modify, or fork them.

How it works: The CLI fetches .ts files from the runbrowser/commands GitHub repo and saves them locally. The relay server loads them via jiti at runtime — no compilation needed. Each extension is a directory with one or more command files.
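
As a rough illustration of the save step, the sketch below maps an extension's file names to the local paths shown in the install output. installTargets and its validation rules are hypothetical; the document does not describe how the CLI actually validates names.

```typescript
// Sketch: where an installed extension's files land on disk.
// The layout mirrors the install output shown earlier; the function is illustrative.

function installTargets(home: string, site: string, files: string[]): string[] {
  if (!/^[a-z0-9_-]+$/i.test(site)) throw new Error(`Unexpected extension name: ${site}`);
  return files
    // Keep plain .ts file names only, so a listing cannot write outside the site directory.
    .filter((file) => file.endsWith(".ts") && !file.includes("/") && !file.includes(".."))
    .map((file) => `${home}/.runbrowser/commands/${site}/${file}`);
}

const targets = installTargets("/home/me", "reddit", ["hot.ts", "search.ts", "../evil.ts"]);
console.log(targets.join("\n"));
```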

Contribute your own: Create a directory in the runbrowser/commands repo with your .ts files and submit a PR. Your commands become available to everyone via runbrowser commands install.

Consistent subgroup pattern. Community-maintained. Same TypeScript format as custom commands.

Sessions

Sessions are auto-created — just run a command and it works. For advanced use, manage sessions explicitly.

runbrowser session new              # create session, outputs id
runbrowser session list             # show sessions + state keys
runbrowser session delete 1         # delete a session

Run multiple agents at once without them stepping on each other. Each session is an isolated sandbox with its own state. Browser tabs are shared, but session state is not.

# Explicit session targeting
runbrowser navigate https://a.com -s 1
runbrowser navigate https://b.com -s 2

# Or set a default session
export RUNBROWSER_SESSION=1
runbrowser snapshot                 # uses session 1
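
The three ways of picking a session (the -s flag, RUNBROWSER_SESSION, or auto-creation) can be pictured as a simple fallback chain. The sketch below assumes the flag takes precedence over the environment variable, which is the common CLI convention but is not confirmed by this document; resolveSession is a hypothetical name.

```typescript
// Sketch: choosing a session id for a command.
// Assumed precedence (not confirmed by the docs): -s flag, then
// RUNBROWSER_SESSION, then "auto", meaning the session is created on first use.

type SessionChoice = { kind: "explicit"; id: string } | { kind: "auto" };

function resolveSession(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): SessionChoice {
  const candidate = flag ?? env["RUNBROWSER_SESSION"];
  if (candidate === undefined || candidate.trim() === "") return { kind: "auto" };
  return { kind: "explicit", id: candidate.trim() };
}

const fromFlag = resolveSession("2", { RUNBROWSER_SESSION: "1" }); // flag wins under this assumption
const fromEnv = resolveSession(undefined, { RUNBROWSER_SESSION: "1" });
const fallback = resolveSession(undefined, {});
console.log(fromFlag, fromEnv, fallback);
```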

Screen recording

Have the agent record what it's doing as MP4 video. The recording uses chrome.tabCapture and runs in the extension context, so it survives page navigation.

# Start recording
runbrowser record start -o recording.mp4

# Navigate, interact — recording continues across pages
runbrowser navigate https://example.com
runbrowser click @e5

# Stop and save
runbrowser record stop
# Recording saved: recording.mp4
#   duration: 12.3s, size: 2.45 MB

For automated recording without clicking the extension icon, restart Chrome with the --auto-accept-this-tab-capture flag (macOS shown):

runbrowser config set profile "Profile 11"
open -a "Google Chrome" --args --auto-accept-this-tab-capture --profile-directory="Profile 11"

Native tab capture. Survives navigation. Auto-transcodes to H.264 MP4.

MCP integration

RunBrowser uses a two-tool MCP model — skill and run. No tool sprawl, no schema bloat.

{
  "mcpServers": {
    "runbrowser": {
      "command": "npx",
      "args": ["-y", "@jiweiyuan/runbrowser-mcp@latest"]
    }
  }
}

The agent calls skill to learn available commands, then run to execute them:

skill()  → returns full CLI documentation + site commands
run({ command: "navigate https://example.com" })
run({ command: "snapshot" })
run({ command: "click @e1" })
run({ command: "eval document.title" })
run({ command: "github trending --limit 5" })

The run tool follows the same syntax as the CLI. The agent doesn't need to learn a separate API.

┌──────────────────────────────────────────────────────────────────────┐
│                         MCP Client                                  │
│                  (Claude, Cursor, Windsurf...)                      │
└──────────────┬──────────────────────────────┬────────────────────────┘
             │                              │
        skill()                        run(command)
             │                              │
             v                              v
┌──────────────────────────────────────────────────────────────────────┐
│                       RunBrowser MCP Server                         │
│                                                                     │
│   ┌─────────────────┐              ┌──────────────────────────┐     │
│   │   skill tool     │              │      run tool            │     │
│   │                 │              │                          │     │
│   │  Returns CLI    │              │  Parses command string   │     │
│   │  docs + site    │              │  Dispatches to relay     │     │
│   │  command list   │              │  Returns result          │     │
│   └─────────────────┘              └────────────┬─────────────┘     │
│                                                 │                   │
└─────────────────────────────────────────────────┼───────────────────┘
                                                │
                                          HTTP / WS
                                                │
                                                v
                                  ┌─────────────────────────┐
                                  │   Relay :19988          │
                                  │   CDPExecutor           │
                                  └────────────┬────────────┘
                                               │
                                          chrome.debugger
                                               │
                                               v
                                  ┌─────────────────────────┐
                                  │   Your Chrome Browser   │
                                  └─────────────────────────┘
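
The "parses command string" step in the diagram can be pictured as a shell-like tokenizer that keeps quoted arguments together. The sketch below uses simplified quoting rules (single or double quotes, no escapes) chosen for illustration; it is not RunBrowser's actual parser, and tokenize is a hypothetical name.

```typescript
// Sketch: split a run() command string into argv-style tokens, honoring quotes.
// Quoting rules here are a simplified assumption, not RunBrowser's actual parser.

function tokenize(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let quote: string | null = null;
  let hasToken = false; // distinguishes an empty quoted string from no token

  for (const ch of command) {
    if (quote !== null) {
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      hasToken = true;
    } else if (/\s/.test(ch)) {
      if (hasToken) tokens.push(current);
      current = "";
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (quote !== null) throw new Error(`Unterminated ${quote} in command`);
  if (hasToken) tokens.push(current);
  return tokens;
}

console.log(tokenize('find text "Sign in" click').join(" | ")); // find | text | Sign in | click
console.log(tokenize("eval document.title").join(" | "));       // eval | document.title
```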

Comparison

Why use this over the alternatives.

vs Playwright MCP

	Playwright MCP	RunBrowser
Browser	Spawns new Chrome	Uses your Chrome
Extensions	None	Your existing ones
Login state	Fresh	Already logged in
Bot detection	Always detected	Can bypass
Collaboration	Separate window	Same browser as user

vs BrowserUse

	BrowserUse	RunBrowser
Language	Python	TypeScript / Node.js
Browser	Spawns new Chrome (Playwright)	Uses your Chrome
Approach	AI agent framework (LLM decides)	CLI + MCP tools (agent sends commands)
Login state	Fresh	Already logged in
CLI	No	Yes — 50+ commands

vs Agent Browser

	Agent Browser	RunBrowser
Browser	Spawns headless Chromium	Your running Chrome
Commands	~90 (bloated)	~50 (focused)
Command extensions	No	Yes — install community commands
CLI binary	Rust CLI → Node daemon	Node CLI → relay (simpler)
Bot detection	Always detected	Can bypass
Real tabs	No	Yes

Remote access

Control Chrome on a remote machine — a headless Mac mini, a cloud VM, a devcontainer. Start the relay server with a public bind and a token.

# On the host machine: start relay server
runbrowser serve --host 0.0.0.0 --token <secret>

# From anywhere — set env vars and use normally
export RUNBROWSER_HOST=192.168.1.10
export RUNBROWSER_TOKEN=<secret>
runbrowser navigate https://example.com

Also works for devcontainers and Docker — use RUNBROWSER_HOST=host.docker.internal. Works for MCP too — set RUNBROWSER_HOST and RUNBROWSER_TOKEN in your MCP client env config.
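
For the MCP case, the two variables usually go in the per-server env block that MCP clients support. The sketch below builds such a config as a plain object and prints it as JSON. The env key is the common convention, but check your client's documentation; the host and token values are placeholders, and remoteConfig is a hypothetical helper.

```typescript
// Sketch: an MCP client config for a remote relay, built as a plain object.
// "env" is the usual per-server key for environment variables in MCP clients;
// confirm against your client's docs. Host and token values are placeholders.

interface McpServer {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function remoteConfig(host: string, token: string): { mcpServers: Record<string, McpServer> } {
  return {
    mcpServers: {
      runbrowser: {
        command: "npx",
        args: ["-y", "@jiweiyuan/runbrowser-mcp@latest"],
        env: { RUNBROWSER_HOST: host, RUNBROWSER_TOKEN: token },
      },
    },
  };
}

const config = remoteConfig("192.168.1.10", "<secret>");
console.log(JSON.stringify(config, null, 2));
```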

Security

By default, everything runs on your machine. The relay binds to localhost:19988, and its extension endpoint only accepts connections from the RunBrowser extension. No remote server, no account, no telemetry.

Local only: the WebSocket server binds to localhost by default. Nothing leaves your machine unless you opt into remote access with runbrowser serve --host.
Origin validation: only the RunBrowser extension origin is accepted. Page scripts cannot set or override the Origin header the browser attaches, so malicious websites cannot connect.
Explicit consent: only tabs where you clicked the extension icon are controlled. No background access.
Visible automation: Chrome shows an automation banner on controlled tabs. Every agent action gets visual feedback (green highlight).
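
Origin validation of this kind usually comes down to an exact-match allowlist on the WebSocket handshake's Origin header. The sketch below shows the idea under stated assumptions: Chrome extensions send an origin of the form chrome-extension://<extension-id>, and EXTENSION_ID here is a made-up placeholder, not RunBrowser's real id.

```typescript
// Sketch: an allowlist check on the WebSocket handshake's Origin header.
// Chrome extensions send Origin: chrome-extension://<extension-id>.
// EXTENSION_ID is a made-up placeholder, not RunBrowser's real id.

const EXTENSION_ID = "abcdefghijklmnopabcdefghijklmnop";
const ALLOWED_ORIGINS = new Set([`chrome-extension://${EXTENSION_ID}`]);

function isAllowedOrigin(origin: string | undefined): boolean {
  // Exact match only: prefix or substring checks are easy to get wrong
  // and can admit look-alike origins.
  return origin !== undefined && ALLOWED_ORIGINS.has(origin);
}

console.log(isAllowedOrigin(`chrome-extension://${EXTENSION_ID}`)); // true
console.log(isAllowedOrigin("https://evil.example"));               // false
console.log(isAllowedOrigin(undefined));                            // false
```

This check defends against web pages, whose Origin is set by the browser. A non-browser program on the same machine can send any header it likes, which is why the localhost bind and, for remote use, the token still matter.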