A Chrome extension and CLI that let your agents control your actual browser, with logins, extensions, and cookies already there. No headless instance, no bot detection, no extra memory.

Other browser MCPs spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors, double the memory. RunBrowser connects to your running browser instead. One Chrome extension, full CDP access, everything you're already logged into.

Getting started

Three steps and your agent is browsing.

  1. Install the CLI:
npm i -g @jiweiyuan/runbrowser

  2. Load the extension in Chrome: open chrome://extensions/, enable Developer mode, click Load unpacked, and select the packages/extension/dist folder.

  3. Click the extension icon on a tab. It turns green. Start using it:

runbrowser navigate https://example.com
runbrowser snapshot
runbrowser click @e5

No session management needed — sessions are auto-created on first command. The extension connects your browser to a local WebSocket relay on localhost:19988. The CLI sends CDP commands through the relay. No remote servers, no accounts, nothing leaves your machine.

Then install the skill — it teaches your agent how to use RunBrowser:

npx -y skills add runbrowser/runbrowser

Extension icon green = connected. Gray = not attached to this tab.

How it works

Click the extension icon on a tab — it attaches via chrome.debugger and opens a WebSocket to a local relay. Your agent (CLI or MCP) connects to the same relay. CDP commands flow directly through the Extension to Chrome — no Playwright, no middleman.

┌─────────────────────┐      ┌──────────────────────┐      ┌─────────────────┐
│       BROWSER       │      │      LOCALHOST       │      │     CLIENT      │
│                     │      │                      │      │                 │
│  ┌───────────────┐  │      │   WebSocket Server   │      │  ┌───────────┐  │
│  │   Extension   │<───────┬───>   :19988          │      │  │ CLI / MCP │  │
│  └───────┬───────┘  │  WS  │                      │      │  └───────────┘  │
│          │          │      │     /extension       │      │        │        │
│   chrome.debugger   │      │          │           │      │        v        │
│          v          │      │          v           │      │  ┌────────────┐ │
│  ┌───────────────┐  │      │     CDPExecutor      │      │  │  HTTP API  │ │
│  │ Tab 1 (green) │  │      │     (direct CDP)     │      │  │  /api/*    │ │
│  │ Tab 2 (green) │  │      └──────────────────────┘      │  └────────────┘ │
│  │ Tab 3 (gray)  │  │                                    │                 │
└─────────────────────┘        Tab 3 not controlled        └─────────────────┘

The relay multiplexes sessions, so multiple agents or CLI instances can work with the same browser at the same time. All browser commands are executed directly via CDP — no Playwright dependency.
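CDP itself is a JSON message protocol: each command carries an id, a method, and params, and the response echoes the same id. The sketch below shows one way a relay-like component could match replies to in-flight commands. The CdpCorrelator class and the frameId value are hypothetical and only illustrate the protocol shape; this is not RunBrowser's actual implementation.

```typescript
// CDP wire format: a command has an id, a method and params;
// the matching response carries the same id.
type CdpCommand = { id: number; method: string; params: Record<string, unknown> };
type CdpResponse = { id: number; result?: unknown; error?: { message: string } };

// Hypothetical helper (not part of RunBrowser): tracks in-flight commands
// so that replies can be routed back to their sender by id.
class CdpCorrelator {
  private nextId = 1;
  private pending = new Map<number, (res: CdpResponse) => void>();

  // Build a command envelope and register a callback for its response.
  send(
    method: string,
    params: Record<string, unknown>,
    onReply: (res: CdpResponse) => void,
  ): CdpCommand {
    const id = this.nextId++;
    this.pending.set(id, onReply);
    return { id, method, params };
  }

  // Deliver a response to the sender of the command with the same id.
  receive(res: CdpResponse): boolean {
    const callback = this.pending.get(res.id);
    if (!callback) return false; // unknown or already-answered id
    this.pending.delete(res.id);
    callback(res);
    return true;
  }
}

const correlator = new CdpCorrelator();
const replies: CdpResponse[] = [];
const cmd = correlator.send(
  "Page.navigate",
  { url: "https://example.com" },
  (res) => { replies.push(res); },
);
console.log(JSON.stringify(cmd));

// Simulate the browser answering; the frameId value is made up.
correlator.receive({ id: cmd.id, result: { frameId: "hypothetical-frame" } });
```

Matching by id is what lets several clients share one connection without their replies getting mixed up.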

Collaboration

Because the agent works in your browser, you can collaborate. You see everything it does in real time — every click gets a green highlight on the target element, every command flashes a subtle border glow. When it hits a captcha, you solve it. When a consent wall appears, you click through it. When the agent gets stuck, you disable the extension on that tab, fix things manually, re-enable it, and the agent picks up where it left off.

You're not watching a remote screen or reading logs after the fact. You're sharing a browser — the agent does the repetitive work, you step in when it needs a human.

Accessibility snapshots

Your agent needs to see the page before it can act. Accessibility snapshots return every interactive element as text, with @ref labels attached. 5–20KB instead of 100KB+ for a screenshot — cheaper, faster, and the agent can parse them without vision.

runbrowser snapshot
# Output:
# - banner:
#   - link "Home" @e1
# - navigation:
#   - link "Docs" @e2
#   - link "Blog" @e3
# - main:
#   - heading "Welcome" @e4
#   - button "Get started" @e5

Each line ends with a @ref you can pass directly to commands. Use snapshots as the primary way to read pages. Only reach for screenshots when spatial layout matters.

# Interact using refs from snapshot
runbrowser click @e5
runbrowser fill @e3 "search term"
runbrowser get text @e4

# Filter to interactive elements only
runbrowser snapshot -i

# Scope to a CSS selector
runbrowser snapshot -S "main"
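Because a snapshot is plain text, an agent-side script can pull the @ref labels out with a small parser. The sketch below assumes lines shaped like the sample output above (`- role "name" @eN`, with container lines such as `- banner:` carrying no ref). The exact indentation and line grammar are assumptions, and parseSnapshot is a hypothetical helper, not part of the CLI.

```typescript
// One parsed snapshot entry: role, accessible name and the @ref label.
interface SnapshotEntry {
  role: string;
  name: string;
  ref: string;
}

// Parse lines shaped like `- link "Docs" @e2`. Container lines such as
// `- banner:` have no @ref and are skipped. The line grammar is an assumption.
function parseSnapshot(text: string): SnapshotEntry[] {
  const pattern = /^\s*-\s+(\w+)\s+"([^"]*)"\s+(@e\d+)\s*$/;
  const entries: SnapshotEntry[] = [];
  for (const line of text.split("\n")) {
    const match = pattern.exec(line);
    if (match) {
      entries.push({ role: match[1], name: match[2], ref: match[3] });
    }
  }
  return entries;
}

// Sample text mirroring the snapshot output shown earlier.
const sample = [
  "- banner:",
  '  - link "Home" @e1',
  "- navigation:",
  '  - link "Docs" @e2',
  '  - link "Blog" @e3',
  "- main:",
  '  - heading "Welcome" @e4',
  '  - button "Get started" @e5',
].join("\n");

const entries = parseSnapshot(sample);
const buttons = entries.filter((entry) => entry.role === "button");
console.log(buttons.map((entry) => entry.ref));
```

A script could then pass the selected ref straight to a command such as runbrowser click.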

CLI commands

RunBrowser provides 50+ commands organized by category. Every command supports --json output and auto-session creation.

# Navigation
runbrowser navigate https://example.com
runbrowser back
runbrowser forward
runbrowser reload
runbrowser close                            # close session

# Observation
runbrowser snapshot                         # accessibility tree with @refs
runbrowser snapshot -i                      # interactive elements only
runbrowser screenshot shot.png              # save screenshot
runbrowser get url                          # current URL
runbrowser get title                        # page title
runbrowser get text @e5                     # element text content
runbrowser get html @e5                     # element HTML
runbrowser get attr @e5 --attr-name href    # attribute value
runbrowser get count @e5                    # count matching elements
runbrowser is visible @e5                   # check element state
runbrowser is checked @e5
runbrowser is enabled @e5

# Interaction
runbrowser click @e5                        # click element
runbrowser dblclick @e5                     # double-click
runbrowser fill @e3 "hello world"           # clear + fill input
runbrowser type "search query"              # type at current focus
runbrowser press Enter                      # press key
runbrowser select @e5 "option-value"        # select dropdown option
runbrowser check @e5                        # check checkbox
runbrowser uncheck @e5                      # uncheck checkbox
runbrowser scroll down                      # scroll direction
runbrowser scroll down 500                  # scroll by pixels
runbrowser hover @e5                        # hover element
runbrowser focus @e5                        # focus element
runbrowser upload @e5 ./file.png            # upload files
runbrowser drag @e1 @e2                     # drag source to target
runbrowser viewport 1280 720                # set viewport size

# Wait conditions
runbrowser wait @e5                         # wait for element visible
runbrowser wait 2000                        # wait milliseconds
runbrowser wait --text "Welcome"            # wait for text
runbrowser wait --url "**/dashboard"        # wait for URL pattern
runbrowser wait --load networkidle          # wait for load state

# Semantic locators
runbrowser find role button click --name "Submit"
runbrowser find text "Sign in" click
runbrowser find label "Email" fill "user@example.com"

# Tab & frame management
runbrowser tab list                         # list all tabs
runbrowser tab new https://example.com      # open new tab
runbrowser tab 2                            # switch to tab index
runbrowser frame "iframe#embed"             # switch to iframe
runbrowser frame main                       # return to main frame

# Execution
runbrowser eval 'document.title'            # run JS in browser
runbrowser cdp Page.captureScreenshot '{}'  # raw CDP command

Flat commands for hot path. Subgroups for management. eval for anything else.

Site commands

Turn any website into a CLI command. Site commands are TypeScript plugins that encapsulate navigation, scraping, and data extraction into reusable commands. Instead of telling your agent to navigate, snapshot, and parse — run one command and get structured JSON back.

# Get GitHub trending repos as structured data
runbrowser github trending --limit 5
# RANK  NAME               STARS  LANGUAGE
# ---   ----               -----  --------
# 1     denoland/deno      5.2k   Rust
# 2     tauri-apps/tauri   3.8k   Rust
# 3     nickel-org/nickel  2.1k   Rust

# JSON output for agents
runbrowser github trending --limit 3 --json

Write your own by dropping a .ts file into ~/.runbrowser/commands/. Full TypeScript, full IDE support, no YAML wrappers.

// ~/.runbrowser/commands/github/trending.ts
export const description = 'GitHub trending repositories'
export const columns = ['rank', 'name', 'stars', 'language']
export const args = {
  limit: { type: 'number', default: 20, description: 'Number of items' },
  language: { type: 'string', description: 'Filter by language' },
}

export async function run(ctx, args) {
  await ctx.navigate('https://github.com/trending')
  const data = await ctx.evaluate(`
    [...document.querySelectorAll('article.Box-row')].map(el => ({
      name: el.querySelector('h2 a')?.textContent?.trim(),
      stars: el.querySelector('.octicon-star')?.parentElement?.textContent?.trim(),
      language: el.querySelector('[itemprop="programmingLanguage"]')?.textContent?.trim(),
    }))
  `)
  return data.slice(0, args.limit).map((item, i) => ({ rank: i + 1, ...item }))
}

Commands are loaded by the relay server via jiti — the CLI is just an HTTP client. Output supports --json, --csv, table, markdown, and YAML formats.
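Since a site command is just an exported run(ctx, args) function, its data-shaping logic can be exercised without a browser by handing it a stub context. The sketch below is a rough illustration under assumptions: the CommandContext interface is inferred from the trending example (only navigate and evaluate are assumed to exist), and the repository rows are made-up test data.

```typescript
// Shape of the context object, inferred from the trending example.
// Assumption: the real ctx may expose more than these two methods.
interface CommandContext {
  navigate(url: string): Promise<void>;
  evaluate(script: string): Promise<unknown>;
}

interface Repo {
  name: string;
  stars: string;
  language: string;
}

// Same flow as the trending command: navigate, scrape, trim, add a rank.
async function run(ctx: CommandContext, args: { limit: number }) {
  await ctx.navigate("https://github.com/trending");
  const data = (await ctx.evaluate("/* page-side scraping script */")) as Repo[];
  return data.slice(0, args.limit).map((item, i) => ({ rank: i + 1, ...item }));
}

// Stub context: records navigation and returns canned rows (made-up data)
// instead of touching a real browser.
const visited: string[] = [];
const stubCtx: CommandContext = {
  navigate: async (url) => {
    visited.push(url);
  },
  evaluate: async () => [
    { name: "example/alpha", stars: "120", language: "Rust" },
    { name: "example/beta", stars: "95", language: "Go" },
    { name: "example/gamma", stars: "80", language: "Rust" },
  ],
};

const rowsPromise = run(stubCtx, { limit: 2 });
rowsPromise.then((rows) => console.log(rows));
```

Keeping the page-side script as a string passed to evaluate means only that string depends on the site's markup; the slicing and ranking can be checked offline.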

TypeScript plugins. Structured data. One command instead of navigate → snapshot → parse.

Command extensions

Install community-maintained commands from the runbrowser/commands repo. No cloning, no build steps — just install and use.

# List available command extensions
runbrowser commands list
# Available command extensions:
#
#   reddit
#   youtube
#   x            ✓ installed
#   hackernews
#   producthunt

# Install an extension
runbrowser commands install reddit
# ✓ Installed reddit/
#   → ~/.runbrowser/commands/reddit/hot.ts
#   → ~/.runbrowser/commands/reddit/search.ts

# Use it immediately
runbrowser reddit hot --limit 5
runbrowser reddit search "browser automation"

# Uninstall
runbrowser commands uninstall reddit
# ✓ Uninstalled reddit/

Community commands are downloaded as TypeScript files into ~/.runbrowser/commands/<site>/. They follow the same format as custom site commands — you can read, modify, or fork them.

How it works: The CLI fetches .ts files from the runbrowser/commands GitHub repo and saves them locally. The relay server loads them via jiti at runtime — no compilation needed. Each extension is a directory with one or more command files.
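The examples in this document suggest a simple naming convention: a file at <site>/<command>.ts under ~/.runbrowser/commands/ is invoked as runbrowser <site> <command>. The sketch below encodes that convention as an assumption; commandForFile is a hypothetical helper, and the real loader may handle more cases (nested directories, other file names).

```typescript
// Map a path relative to ~/.runbrowser/commands/ to the CLI words that
// would invoke it. Assumption: <site>/<command>.ts maps to
// `runbrowser <site> <command>`, as github/trending.ts and reddit/hot.ts suggest.
function commandForFile(relativePath: string): string | null {
  const match = /^([\w-]+)\/([\w-]+)\.ts$/.exec(relativePath);
  return match ? `runbrowser ${match[1]} ${match[2]}` : null;
}

console.log(commandForFile("reddit/hot.ts")); // runbrowser reddit hot
console.log(commandForFile("github/trending.ts")); // runbrowser github trending
console.log(commandForFile("notes.txt")); // null
```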

Contribute your own: Create a directory in the runbrowser/commands repo with your .ts files and submit a PR. Your commands become available to everyone via runbrowser commands install.

Consistent subgroup pattern. Community-maintained. Same TypeScript format as custom commands.

Sessions

Sessions are auto-created — just run a command and it works. For advanced use, manage sessions explicitly.

runbrowser session new        # create session, outputs id
runbrowser session list       # show sessions + state keys
runbrowser session delete 1   # delete a session

Run multiple agents at once without them stepping on each other. Each session is an isolated sandbox with its own state. Browser tabs are shared, but session state is not.

# Explicit session targeting
runbrowser navigate https://a.com -s 1
runbrowser navigate https://b.com -s 2

# Or set a default session
export RUNBROWSER_SESSION=1
runbrowser snapshot   # uses session 1
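To make the session model concrete, here is a minimal sketch of how a command's session could be chosen and how per-session state stays isolated. The precedence shown (the -s flag first, then RUNBROWSER_SESSION, then auto-creation) is an assumption for illustration, and both resolveSession and SessionStore are hypothetical names rather than RunBrowser internals.

```typescript
// Pick the session for a command. Assumed precedence: explicit flag,
// then the RUNBROWSER_SESSION variable, then an auto-created session.
function resolveSession(
  flag: string | undefined,
  env: Record<string, string | undefined>,
  create: () => string,
): string {
  if (flag) return flag;
  const fromEnv = env["RUNBROWSER_SESSION"];
  if (fromEnv) return fromEnv;
  return create();
}

// Hypothetical store: each session id owns a separate key/value map,
// so state written in one session is invisible to the others.
class SessionStore {
  private states = new Map<string, Map<string, unknown>>();
  private counter = 0;

  create(): string {
    const id = String(++this.counter);
    this.states.set(id, new Map<string, unknown>());
    return id;
  }

  state(id: string): Map<string, unknown> {
    const state = this.states.get(id);
    if (!state) throw new Error(`unknown session ${id}`);
    return state;
  }

  list(): string[] {
    return Array.from(this.states.keys());
  }
}

const store = new SessionStore();
const first = store.create();
const second = store.create();
store.state(first).set("cart", 3);
console.log(store.state(second).has("cart")); // false: state is not shared

const viaFlag = resolveSession("2", { RUNBROWSER_SESSION: "1" }, () => store.create());
const viaEnv = resolveSession(undefined, { RUNBROWSER_SESSION: "1" }, () => store.create());
const autoCreated = resolveSession(undefined, {}, () => store.create());
console.log(viaFlag, viaEnv, autoCreated, store.list());
```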

Screen recording

Have the agent record what it's doing as MP4 video. The recording uses chrome.tabCapture and runs in the extension context, so it survives page navigation.

# Start recording
runbrowser record start -o recording.mp4

# Navigate, interact; recording continues across pages
runbrowser navigate https://example.com
runbrowser click @e5

# Stop and save
runbrowser record stop
# Recording saved: recording.mp4
# duration: 12.3s, size: 2.45 MB

For automated recording without clicking the extension icon, restart Chrome with the following flags (the open command shown is macOS-specific):

runbrowser config set profile "Profile 11"
open -a "Google Chrome" --args --auto-accept-this-tab-capture --profile-directory="Profile 11"

Native tab capture. Survives navigation. Auto-transcodes to H.264 MP4.

MCP integration

RunBrowser uses a two-tool MCP model: skill and run. No tool sprawl, no schema bloat.

{
  "mcpServers": {
    "runbrowser": {
      "command": "npx",
      "args": ["-y", "@jiweiyuan/runbrowser-mcp@latest"]
    }
  }
}

The agent calls skill to learn available commands, then run to execute them:

skill() → returns full CLI documentation + site commands
run({ command: "navigate https://example.com" })
run({ command: "snapshot" })
run({ command: "click @e1" })
run({ command: "eval document.title" })
run({ command: "github trending --limit 5" })

The run tool follows the same syntax as the CLI. The agent doesn't need to learn a separate API.
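Because run takes one command string, something has to split that string into a command name and arguments while keeping quoted values intact. The sketch below is a minimal shell-like tokenizer written for illustration. It is not the MCP server's actual parser, and it ignores backslash escapes.

```typescript
// Split a command string into arguments, honoring single and double quotes.
// Hypothetical helper: illustrates the idea, not RunBrowser's parser.
function tokenize(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let quote: string | null = null; // the quote character we are inside, if any
  let inToken = false;

  for (let i = 0; i < command.length; i++) {
    const ch = command.charAt(i);
    if (quote !== null) {
      if (ch === quote) quote = null; // closing quote
      else current += ch; // spaces inside quotes are kept
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      inToken = true; // an empty "" still counts as an argument
    } else if (ch === " " || ch === "\t") {
      if (inToken) {
        tokens.push(current);
        current = "";
        inToken = false;
      }
    } else {
      current += ch;
      inToken = true;
    }
  }

  if (quote !== null) throw new Error("unterminated quote");
  if (inToken) tokens.push(current);
  return tokens;
}

const [name, ...rest] = tokenize('fill @e3 "hello world"');
console.log(name, rest); // fill [ '@e3', 'hello world' ]
```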

┌──────────────────────────────────────────────────────────────────────┐
│                              MCP Client                              │
│                    (Claude, Cursor, Windsurf...)                     │
└──────────────┬──────────────────────────────┬────────────────────────┘
               │                              │
            skill()                     run(command)
               │                              │
               v                              v
┌──────────────────────────────────────────────────────────────────────┐
│                        RunBrowser MCP Server                         │
│                                                                      │
│     ┌─────────────────┐        ┌──────────────────────────┐          │
│     │   skill tool    │        │         run tool         │          │
│     │                 │        │                          │          │
│     │  Returns CLI    │        │  Parses command string   │          │
│     │  docs + site    │        │  Dispatches to relay     │          │
│     │  command list   │        │  Returns result          │          │
│     └─────────────────┘        └────────────┬─────────────┘          │
│                                             │                        │
└─────────────────────────────────────────────┼────────────────────────┘
                                              │
                                          HTTP / WS
                                              │
                                              v
                                 ┌─────────────────────────┐
                                 │      Relay :19988       │
                                 │       CDPExecutor       │
                                 └────────────┬────────────┘
                                              │
                                       chrome.debugger
                                              │
                                              v
                                 ┌─────────────────────────┐
                                 │   Your Chrome Browser   │
                                 └─────────────────────────┘

Comparison

Why use this over the alternatives.

vs Playwright MCP

               Playwright MCP      RunBrowser
-------------  ------------------  --------------------
Browser        Spawns new Chrome   Uses your Chrome
Extensions     None                Your existing ones
Login state    Fresh               Already logged in
Bot detection  Always detected     Can bypass
Collaboration  Separate window     Same browser as user

vs BrowserUse

             BrowserUse                        RunBrowser
-----------  --------------------------------  --------------------------------------
Language     Python                            TypeScript / Node.js
Browser      Spawns new Chrome (Playwright)    Uses your Chrome
Approach     AI agent framework (LLM decides)  CLI + MCP tools (agent sends commands)
Login state  Fresh                             Already logged in
CLI          No                                Yes, 50+ commands

vs Agent Browser

                    Agent Browser             RunBrowser
------------------  ------------------------  -------------------------------
Browser             Spawns headless Chromium  Your running Chrome
Commands            ~90 (bloated)             ~50 (focused)
Command extensions  No                        Yes, install community commands
CLI binary          Rust CLI → Node daemon    Node CLI → relay (simpler)
Bot detection       Always detected           Can bypass
Real tabs           No                        Yes

Remote access

Control Chrome on a remote machine — a headless Mac mini, a cloud VM, a devcontainer. Start the relay server with a public bind and a token.

# On the host machine: start the relay server
runbrowser serve --host 0.0.0.0 --token <secret>

# From anywhere: set env vars and use normally
export RUNBROWSER_HOST=192.168.1.10
export RUNBROWSER_TOKEN=<secret>
runbrowser navigate https://example.com

Also works for devcontainers and Docker — use RUNBROWSER_HOST=host.docker.internal. Works for MCP too — set RUNBROWSER_HOST and RUNBROWSER_TOKEN in your MCP client env config.
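As a rough sketch of how a client script might turn those two variables into connection settings: the function below reads an env-like object and falls back to localhost. The http scheme, the reuse of port 19988 for remote hosts, and the relayConfig name are all assumptions made for illustration. How the token is actually transmitted is not specified here, so the sketch only carries it along.

```typescript
interface RelayConfig {
  baseUrl: string;
  token?: string;
}

// Resolve relay settings from an env-like object.
// Assumptions: plain http, port 19988, localhost when no host is set.
function relayConfig(env: Record<string, string | undefined>): RelayConfig {
  const host = env["RUNBROWSER_HOST"] || "localhost";
  const baseUrl = `http://${host}:19988`;
  const token = env["RUNBROWSER_TOKEN"];
  return token ? { baseUrl, token } : { baseUrl };
}

console.log(relayConfig({}));
console.log(relayConfig({ RUNBROWSER_HOST: "host.docker.internal", RUNBROWSER_TOKEN: "example-token" }));
```

Passing the environment in as a plain object, rather than reading process.env inside the function, keeps the resolver easy to test.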

Security

Everything runs on your machine. By default the relay binds to localhost:19988, and its extension endpoint only accepts connections from the RunBrowser extension. No remote server, no account, no telemetry.

  • Local only: by default the WebSocket server binds to localhost, so nothing leaves your machine unless you opt into remote access with runbrowser serve --host.
  • Origin validation: only the RunBrowser extension origin is accepted. A web page cannot set or forge the Origin header its browser sends, so malicious websites cannot connect.
  • Explicit consent: only tabs where you clicked the extension icon are controlled. No background access.
  • Visible automation: Chrome shows an automation banner on controlled tabs. Every agent action gets visual feedback (green highlight).
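The origin check described above can be pictured as a strict comparison against the extension's origin. Chrome extension pages use origins of the form chrome-extension://<extension id>. In the sketch below the id is a placeholder rather than RunBrowser's real one, and rejecting a missing header is an assumption about the extension-facing endpoint only.

```typescript
// Placeholder extension id (32 letters); NOT RunBrowser's real id.
const EXTENSION_ID = "abcdefghijklmnopabcdefghijklmnop";
const ALLOWED_ORIGIN = `chrome-extension://${EXTENSION_ID}`;

// Accept a WebSocket upgrade on the extension endpoint only when the Origin
// header exactly matches the extension's origin. Assumption: a missing
// header is rejected on this endpoint.
function isAllowedOrigin(originHeader: string | undefined): boolean {
  return originHeader === ALLOWED_ORIGIN;
}

console.log(isAllowedOrigin(ALLOWED_ORIGIN)); // true
console.log(isAllowedOrigin("https://evil.example")); // false
console.log(isAllowedOrigin(undefined)); // false
```

An exact string match avoids the pitfalls of prefix or substring checks, which an origin such as chrome-extension://abc.evil.example could otherwise slip past.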