A Chrome extension and CLI that let your agents control your actual browser — with logins, extensions, and cookies already there. No headless instance, no bot detection, no extra memory. Star on GitHub.
Other browser MCPs spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors, double the memory. RunBrowser connects to your running browser instead. One Chrome extension, full CDP access, everything you're already logged into.
Getting started
Three steps and your agent is browsing.
- Install the CLI:
npm i -g @jiweiyuan/runbrowser
-
Load the extension in Chrome: open
chrome://extensions/, enable Developer mode, click Load unpacked, select thepackages/extension/distfolder -
Click the extension icon on a tab — it turns green. Start using it:
runbrowser navigate https://example.com runbrowser snapshot runbrowser click @e5
No session management needed — sessions are auto-created on first command. The extension connects your browser to a local WebSocket relay on localhost:19988. The CLI sends CDP commands through the relay. No remote servers, no accounts, nothing leaves your machine.
Then install the skill — it teaches your agent how to use RunBrowser:
npx -y skills add runbrowser/runbrowser
Extension icon green = connected. Gray = not attached to this tab.
How it works
Click the extension icon on a tab — it attaches via chrome.debugger and opens a WebSocket to a local relay. Your agent (CLI or MCP) connects to the same relay. CDP commands flow directly through the Extension to Chrome — no Playwright, no middleman.
┌─────────────────────┐ ┌──────────────────────┐ ┌─────────────────┐ │ BROWSER │ │ LOCALHOST │ │ CLIENT │ │ │ │ │ │ │ │ ┌───────────────┐ │ │ WebSocket Server │ │ ┌───────────┐ │ │ │ Extension │<───────┬───> :19988 │ │ │ CLI / MCP │ │ │ └───────┬───────┘ │ WS │ │ │ └───────────┘ │ │ │ │ │ /extension │ │ │ │ │ chrome.debugger │ │ │ │ │ v │ │ v │ │ v │ │ ┌────────────┐ │ │ ┌───────────────┐ │ │ CDPExecutor │ │ │ HTTP API │ │ │ │ Tab 1 (green) │ │ │ (direct CDP) │ │ │ /api/* │ │ │ │ Tab 2 (green) │ │ └──────────────────────┘ │ └────────────┘ │ │ │ Tab 3 (gray) │ │ │ │ └─────────────────────┘ Tab 3 not controlled └─────────────────┘
The relay multiplexes sessions, so multiple agents or CLI instances can work with the same browser at the same time. All browser commands are executed directly via CDP — no Playwright dependency.
Collaboration
Because the agent works in your browser, you can collaborate. You see everything it does in real time — every click gets a green highlight on the target element, every command flashes a subtle border glow. When it hits a captcha, you solve it. When a consent wall appears, you click through it. When the agent gets stuck, you disable the extension on that tab, fix things manually, re-enable it, and the agent picks up where it left off.
You're not watching a remote screen or reading logs after the fact. You're sharing a browser — the agent does the repetitive work, you step in when it needs a human.
Accessibility snapshots
Your agent needs to see the page before it can act. Accessibility snapshots return every interactive element as text, with @ref labels attached. 5–20KB instead of 100KB+ for a screenshot — cheaper, faster, and the agent can parse them without vision.
runbrowser snapshot # Output: # - banner: # - link "Home" @e1 # - navigation: # - link "Docs" @e2 # - link "Blog" @e3 # - main: # - heading "Welcome" @e4 # - button "Get started" @e5
Each line ends with a @ref you can pass directly to commands. Use snapshots as the primary way to read pages. Only reach for screenshots when spatial layout matters.
# Interact using refs from snapshot runbrowser click @e5 runbrowser fill @e3 "search term" runbrowser get text @e4 # Filter to interactive elements only runbrowser snapshot -i # Scope to a CSS selector runbrowser snapshot -S "main"
CLI commands
RunBrowser provides 50+ commands organized by category. Every command supports --json output and auto-session creation.
# Navigation runbrowser navigate https://example.com runbrowser back runbrowser forward runbrowser reload runbrowser close # close session # Observation runbrowser snapshot # accessibility tree with @refs runbrowser snapshot -i # interactive elements only runbrowser screenshot shot.png # save screenshot runbrowser get url # current URL runbrowser get title # page title runbrowser get text @e5 # element text content runbrowser get html @e5 # element HTML runbrowser get attr @e5 --attr-name href # attribute value runbrowser get count @e5 # count matching elements runbrowser is visible @e5 # check element state runbrowser is checked @e5 runbrowser is enabled @e5 # Interaction runbrowser click @e5 # click element runbrowser dblclick @e5 # double-click runbrowser fill @e3 "hello world" # clear + fill input runbrowser type "search query" # type at current focus runbrowser press Enter # press key runbrowser select @e5 "option-value" # select dropdown option runbrowser check @e5 # check checkbox runbrowser uncheck @e5 # uncheck checkbox runbrowser scroll down # scroll direction runbrowser scroll down 500 # scroll by pixels runbrowser hover @e5 # hover element runbrowser focus @e5 # focus element runbrowser upload @e5 ./file.png # upload files runbrowser drag @e1 @e2 # drag source to target runbrowser viewport 1280 720 # set viewport size # Wait conditions runbrowser wait @e5 # wait for element visible runbrowser wait 2000 # wait milliseconds runbrowser wait --text "Welcome" # wait for text runbrowser wait --url "**/dashboard" # wait for URL pattern runbrowser wait --load networkidle # wait for load state # Semantic locators runbrowser find role button click --name "Submit" runbrowser find text "Sign in" click runbrowser find label "Email" fill "user@example.com" # Tab & frame management runbrowser tab list # list all tabs runbrowser tab new https://example.com # open new tab runbrowser tab 2 # switch to tab index runbrowser frame "iframe#embed" # switch to iframe runbrowser frame main # return to main frame # Execution runbrowser eval 'document.title' # run JS in browser runbrowser cdp Page.captureScreenshot '{}' # raw CDP command
Flat commands for hot path. Subgroups for management. eval for anything else.
Site commands
Turn any website into a CLI command. Site commands are TypeScript plugins that encapsulate navigation, scraping, and data extraction into reusable commands. Instead of telling your agent to navigate, snapshot, and parse — run one command and get structured JSON back.
# Get GitHub trending repos as structured data runbrowser github trending --limit 5 # RANK NAME STARS LANGUAGE # --- ---- ----- -------- # 1 denoland/deno 5.2k Rust # 2 tauri-apps/tauri 3.8k Rust # 3 nickel-org/nickel 2.1k Rust # JSON output for agents runbrowser github trending --limit 3 --json
Write your own by dropping a .ts file into ~/.runbrowser/commands/. Full TypeScript, full IDE support, no YAML wrappers.
// ~/.runbrowser/commands/github/trending.ts export const description = 'GitHub trending repositories' export const columns = ['rank', 'name', 'stars', 'language'] export const args = { limit: { type: 'number', default: 20, description: 'Number of items' }, language: { type: 'string', description: 'Filter by language' }, } export async function run(ctx, args) { await ctx.navigate('https://github.com/trending') const data = await ctx.evaluate(` [...document.querySelectorAll('article.Box-row')].map(el => ({ name: el.querySelector('h2 a')?.textContent?.trim(), stars: el.querySelector('.octicon-star')?.parentElement?.textContent?.trim(), language: el.querySelector('[itemprop="programmingLanguage"]')?.textContent?.trim(), })) `) return data.slice(0, args.limit).map((item, i) => ({ rank: i + 1, ...item })) }
Commands are loaded by the relay server via jiti — the CLI is just an HTTP client. Output supports --json, --csv, table, markdown, and YAML formats.
TypeScript plugins. Structured data. One command instead of navigate → snapshot → parse.
Command extensions
Install community-maintained commands from the runbrowser/commands repo. No cloning, no build steps — just install and use.
# List available command extensions runbrowser commands list # Available command extensions: # # reddit # youtube # x ✓ installed # hackernews # producthunt # Install an extension runbrowser commands install reddit # ✓ Installed reddit/ # → ~/.runbrowser/commands/reddit/hot.ts # → ~/.runbrowser/commands/reddit/search.ts # Use it immediately runbrowser reddit hot --limit 5 runbrowser reddit search "browser automation" # Uninstall runbrowser commands uninstall reddit # ✓ Uninstalled reddit/
Community commands are downloaded as TypeScript files into ~/.runbrowser/commands/<site>/. They follow the same format as custom site commands — you can read, modify, or fork them.
How it works: The CLI fetches .ts files from the runbrowser/commands GitHub repo and saves them locally. The relay server loads them via jiti at runtime — no compilation needed. Each extension is a directory with one or more command files.
Contribute your own: Create a directory in the runbrowser/commands repo with your .ts files and submit a PR. Your commands become available to everyone via runbrowser commands install.
Consistent subgroup pattern. Community-maintained. Same TypeScript format as custom commands.
Sessions
Sessions are auto-created — just run a command and it works. For advanced use, manage sessions explicitly.
runbrowser session new # create session, outputs id runbrowser session list # show sessions + state keys runbrowser session delete 1 # delete a session
Run multiple agents at once without them stepping on each other. Each session is an isolated sandbox with its own state. Browser tabs are shared, but session state is not.
# Explicit session targeting runbrowser navigate https://a.com -s 1 runbrowser navigate https://b.com -s 2 # Or set a default session export RUNBROWSER_SESSION=1 runbrowser snapshot # uses session 1
Screen recording
Have the agent record what it's doing as MP4 video. The recording uses chrome.tabCapture and runs in the extension context, so it survives page navigation.
# Start recording runbrowser record start -o recording.mp4 # Navigate, interact — recording continues across pages runbrowser navigate https://example.com runbrowser click @e5 # Stop and save runbrowser record stop # Recording saved: recording.mp4 # duration: 12.3s, size: 2.45 MB
For automated recording without clicking the extension icon, restart Chrome with special flags:
runbrowser config set profile "Profile 11" open -a "Google Chrome" --args --auto-accept-this-tab-capture --profile-directory="Profile 11"
Native tab capture. Survives navigation. Auto-transcodes to H.264 MP4.
MCP integration
RunBrowser uses a two-tool MCP model — skill and run. No tool sprawl, no schema bloat.
{ "mcpServers": { "runbrowser": { "command": "npx", "args": ["-y", "@jiweiyuan/runbrowser-mcp@latest"] } } }
The agent calls skill to learn available commands, then run to execute them:
skill() → returns full CLI documentation + site commands
run({ command: "navigate https://example.com" })
run({ command: "snapshot" })
run({ command: "click @e1" })
run({ command: "eval document.title" })
run({ command: "github trending --limit 5" })
The run tool follows the same syntax as the CLI. The agent doesn't need to learn a separate API.
┌──────────────────────────────────────────────────────────────────────┐ │ MCP Client │ │ (Claude, Cursor, Windsurf...) │ └──────────────┬──────────────────────────────┬────────────────────────┘ │ │ skill() run(command) │ │ v v ┌──────────────────────────────────────────────────────────────────────┐ │ RunBrowser MCP Server │ │ │ │ ┌─────────────────┐ ┌──────────────────────────┐ │ │ │ skill tool │ │ run tool │ │ │ │ │ │ │ │ │ │ Returns CLI │ │ Parses command string │ │ │ │ docs + site │ │ Dispatches to relay │ │ │ │ command list │ │ Returns result │ │ │ └─────────────────┘ └────────────┬─────────────┘ │ │ │ │ └─────────────────────────────────────────────────┼───────────────────┘ │ HTTP / WS │ v ┌─────────────────────────┐ │ Relay :19988 │ │ CDPExecutor │ └────────────┬────────────┘ │ chrome.debugger │ v ┌─────────────────────────┐ │ Your Chrome Browser │ └─────────────────────────┘
Comparison
Why use this over the alternatives.
| Playwright MCP | RunBrowser | |
|---|---|---|
| Browser | Spawns new Chrome | Uses your Chrome |
| Extensions | None | Your existing ones |
| Login state | Fresh | Already logged in |
| Bot detection | Always detected | Can bypass |
| Collaboration | Separate window | Same browser as user |
| BrowserUse | RunBrowser | |
|---|---|---|
| Language | Python | TypeScript / Node.js |
| Browser | Spawns new Chrome (Playwright) | Uses your Chrome |
| Approach | AI agent framework (LLM decides) | CLI + MCP tools (agent sends commands) |
| Login state | Fresh | Already logged in |
| CLI | No | Yes — 50+ commands |
| Agent Browser | RunBrowser | |
|---|---|---|
| Browser | Spawns headless Chromium | Your running Chrome |
| Commands | ~90 (bloated) | ~50 (focused) |
| Command extensions | No | Yes — install community commands |
| CLI binary | Rust CLI → Node daemon | Node CLI → relay (simpler) |
| Bot detection | Always detected | Can bypass |
| Real tabs | No | Yes |
Remote access
Control Chrome on a remote machine — a headless Mac mini, a cloud VM, a devcontainer. Start the relay server with a public bind and a token.
# On the host machine — start relay server runbrowser serve --host 0.0.0.0 --token <secret> # From anywhere — set env vars and use normally export RUNBROWSER_HOST=192.168.1.10 export RUNBROWSER_TOKEN=<secret> runbrowser navigate https://example.com
Also works for devcontainers and Docker — use RUNBROWSER_HOST=host.docker.internal. Works for MCP too — set RUNBROWSER_HOST and RUNBROWSER_TOKEN in your MCP client env config.
Security
Everything runs on your machine. The relay binds to localhost:19988 and only accepts connections from the extension. No remote server, no account, no telemetry.
- Local only — WebSocket server binds to localhost. Nothing leaves your machine.
- Origin validation — only the RunBrowser extension origin is accepted. Browsers cannot spoof the Origin header, so malicious websites cannot connect.
- Explicit consent — only tabs where you clicked the extension icon are controlled. No background access.
- Visible automation — Chrome shows an automation banner on controlled tabs. Every agent action gets visual feedback (green highlight).