Browser Harness

Browser Harness: Direct, Unfiltered Access to a Real Browser for LLMs

Overview

Browser Harness is a thin, editable browser control layer built on top of the Chrome DevTools Protocol (CDP). It is designed for tasks that demand complete freedom and unmediated access to a real browser. There is a single websocket connection to Chrome and nothing else between the agent and the browser. The agent writes what’s missing during execution, so the harness improves with every run. In practice, this means you can deploy an agent that learns as it goes, refining skills, workflows, and edge-case handling on the fly.

This approach stands in contrast to traditional browser automation stacks that abstract away or constrain browser behavior. Here, the agent can inspect, click, type, navigate, and interact with every element in a live browser session, while the harness acts as the minimal and editable conduit that exposes CDP capabilities to the agent. The result is a powerful, self-improving system that grows smarter with each task.
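
To make the "one websocket, no middleman" idea concrete, here is a minimal sketch of what a CDP command looks like on the wire. CDP commands are JSON objects carrying an id, a method, and params, and Page.navigate and Runtime.evaluate are standard CDP methods. The CdpCommandBuilder class and its method names are hypothetical and are not part of Browser Harness. Sending the text also needs a websocket client, which is left out here.

```python
import itertools
import json


class CdpCommandBuilder:
    """Hypothetical helper (not part of Browser Harness): builds CDP command messages."""

    def __init__(self):
        # CDP pairs each response with its request through an integer id.
        self._ids = itertools.count(1)

    def command(self, method, **params):
        """Serialize one CDP command as the JSON text sent over the websocket."""
        return json.dumps({"id": next(self._ids), "method": method, "params": params})

    def navigate(self, url):
        # Page.navigate is a standard CDP method that takes a `url` parameter.
        return self.command("Page.navigate", url=url)

    def evaluate(self, expression):
        # Runtime.evaluate runs a JavaScript expression in the page.
        return self.command("Runtime.evaluate", expression=expression)


builder = CdpCommandBuilder()
print(builder.navigate("https://example.com"))
print(builder.evaluate("document.title"))
```

Because every browser action reduces to a message of this shape, a missing capability is usually a missing method name and parameter set rather than a missing abstraction.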

A Snapshot of the Experience

Real-time browser control: One websocket to Chrome, with no middleman.
Self-enhancing execution: The harness learns from each run and adds what’s missing.
Transparent collaboration: The agent suggests, the harness implements, and the workflow becomes smoother over time.
Progressive autonomy: The agent can upload missing helpers, create custom scripts, and refine domain skills as it works.

A Concrete Illustration

During a typical task, the agent may encounter a missing helper or a delicate edge case in a workflow. The harness records the interaction, and then the agent writes what’s needed. For example:

● agent: wants to upload a file
│
● agent-workspace/agent_helpers.py → helper missing
│
● agent writes it → agent_helpers.py (+ custom helper)
│
✓ file uploaded

This little sequence captures the essence of the system: the agent identifies a gap, writes the necessary helper code, and the result is a fuller, more capable runtime environment. Over time, the agent develops its own set of domain skills—reusable patterns for specific sites or tasks—that reduce the need to rediscover flows and selectors with each run.
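
As an illustration of the kind of helper the agent might write in the upload sequence above, here is a minimal sketch. DOM.setFileInputFiles is a standard CDP method, but the upload_file function and the send callable are assumptions made for this example. The real contents of agent_helpers.py and the way the harness transmits commands may look quite different.

```python
from pathlib import Path


def upload_file(send, node_id, paths):
    """Hypothetical helper: attach local files to an <input type="file"> element.

    `send(method, params)` is an assumed callable that transmits one CDP
    command; the real harness may expose something different.
    """
    # CDP expects file paths as strings; absolute paths avoid ambiguity.
    files = [str(Path(p).expanduser().resolve()) for p in paths]
    if not files:
        raise ValueError("at least one file path is required")
    # DOM.setFileInputFiles is a standard CDP method; nodeId identifies the input.
    return send("DOM.setFileInputFiles", {"files": files, "nodeId": node_id})


# Usage with a stand-in transport that only records what would be sent.
sent = []
upload_file(lambda method, params: sent.append((method, params)), 42, ["report.pdf"])
print(sent[0][0])
```

Passing the transport in as a parameter keeps the helper easy to exercise without a browser, which matters when the code is generated mid-task and needs a quick check.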

Getting Started: Setup, Prompts, and Quick Wins

Setup Prompt

A practical starting point is a setup prompt you can paste into Claude Code or Codex:

Set up https://github.com/browser-use/browser-harness for me. Read install.md and follow the steps to install browser-harness and connect it to my browser.

The setup process nudges the agent toward establishing a remote-debugging connection within the browser. The agent will open chrome://inspect/#remote-debugging and guide you through enabling remote debugging. To connect securely and effectively, you’ll need to perform a couple of steps in the browser UI:

Tick the "Remote debugging" checkbox to permit the agent to connect to your browser.
When the per-attach popup appears (Chrome 144+), click Allow to authorize the connection.
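
Once remote debugging is on, a client needs the browser's websocket URL. The sketch below assumes the classic DevTools HTTP endpoint /json/version, which Chrome serves when started with --remote-debugging-port and which reports a webSocketDebuggerUrl field. Whether the checkbox flow described above exposes the same endpoint is not stated in this post, so treat this as an illustration rather than the harness's actual discovery logic. The function name and sample payload are made up.

```python
import json
from urllib.parse import urlparse


def websocket_url_from_version(payload):
    """Extract the browser-level websocket URL from a /json/version response body."""
    info = json.loads(payload)
    ws_url = info.get("webSocketDebuggerUrl")
    if not ws_url:
        raise ValueError("response has no webSocketDebuggerUrl; is remote debugging on?")
    if urlparse(ws_url).scheme not in ("ws", "wss"):
        raise ValueError(f"unexpected scheme in {ws_url!r}")
    return ws_url


# Illustrative payload; the browser id at the end is made up.
sample = (
    '{"Browser": "Chrome/144.0.0.0", '
    '"webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/abc123"}'
)
print(websocket_url_from_version(sample))
```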

Visual cues in the setup process can help you confirm success. The provided banners and screenshots illustrate the prompts you’ll encounter:

Remote debugging setup: an image shows where to enable remote debugging.
Allow remote debugging popup: a follow-up confirmation prompt to approve the connection.

If you want to explore example tasks right away, check the agent workspace’s domain skills directory for ready-made flows. These samples demonstrate how to structure domain-specific automation and give you a sense of how the agent encodes site interactions, selectors, and edge cases for repeatable use.

Free Browser Use: Cloud Browsers and Costs

Browser Harness offers a cloud-based option called Free Browser Use, which includes:

Cloud browsers for testing and automation in a managed environment.
Sub-agents and headless deployment capabilities for scalable workflows.
A free tier with practical limits to get you started without financial friction.

Key benefits of the cloud option include access to multiple concurrent sessions, proxies, and captcha-solving capabilities—features that are often essential for robust automation or testing workloads. You can obtain a cloud API key from:

https://cloud.browser-use.com/new-api-key

Alternatively, you can let the agent sign up itself via:

https://docs.browser-use.com/llms.txt (setup flow + challenge context included)

This dual-path approach makes it easy to begin without friction, while also providing a documented signup flow for longer-term use.

Architecture: What Makes It Tick

The architecture is designed to be approachable yet powerful, with a clear separation between setup, everyday usage, and the code the agent can edit or extend. There are five core areas, totaling roughly a thousand lines of code across the main files:

install.md: First-time install and browser bootstrap. This file guides you through environment preparation, dependency installation, and the initial connection to a real browser.
SKILL.md: Day-to-day usage. This document outlines how to invoke tasks, manage sessions, and interpret results produced by the agent as it interacts with the browser.
src/browser_harness/: Protected core package. This is the heart of the harness, exposing the CDP interface to the agent while enforcing safety and consistency.
agent-workspace/agent_helpers.py: Helper code the agent edits. This is the editable layer where the agent writes new utilities, fixes gaps, and extends capabilities during execution.
agent-workspace/domain-skills/: Reusable site-specific skills the agent edits. These are domain-specific playbooks that encode knowledge about particular sites or tasks.

A Closer Look at Domain Skills

Domain skills are a core concept designed to enable community-driven, per-site automation patterns. Setting BHDOMAINSKILLS=1 turns on community-contributed per-site playbooks that the agent can surface on a given domain. These playbooks are surfaced through goto_url, so the agent arrives at a target page with domain-aware context. If you contribute a new domain skill, you’ll place it under:

agent-workspace/domain-skills/

And it will become part of the agent’s repertoire for that site or task.
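
The lookup described above can be sketched as follows. This is a guess at the general shape, not the harness's implementation: the one-folder-per-host layout and the function names are assumptions, and the environment variable is spelled exactly as it appears in this post.

```python
import os
from pathlib import Path
from urllib.parse import urlparse

SKILLS_ROOT = Path("agent-workspace/domain-skills")  # directory named in the post


def domain_skills_enabled(environ=os.environ):
    # The post says setting BHDOMAINSKILLS=1 turns the playbooks on.
    return environ.get("BHDOMAINSKILLS") == "1"


def skill_dir_for(url, root=SKILLS_ROOT, environ=os.environ):
    """Return the skill folder matching the URL's host, or None.

    Assumes one folder per host name (hypothetical layout); the real
    lookup performed inside goto_url may differ.
    """
    if not domain_skills_enabled(environ):
        return None
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    candidate = Path(root) / host
    return candidate if host and candidate.is_dir() else None


# With the variable unset or the folder absent, this prints None.
print(skill_dir_for("https://www.example.com/jobs"))
```

Gating on the environment variable first means that a disabled setup never touches the skills directory at all.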

Contributing: How to Help the Project Grow

Contributions and improvements are welcome. The best way to help is to contribute a new domain skill under agent-workspace/domain-skills/ for a site or task you use often (examples include LinkedIn outreach, ordering on Amazon, filing expenses, etc.). Each skill teaches the agent the selectors, flows, and edge cases it would otherwise have to rediscover.

Important notes about skills:

Skills are written by the harness, not by you. You run your task with the agent, and when it discovers something non-obvious, it files the skill itself (see SKILL.md).
Avoid hand-authoring skill files; prefer letting the agent generate skills that reflect what actually works in the browser.
Open a PR with the generated agent-workspace/domain-skills// folder. Small, focused changes tend to work best.

A sample workflow for contributing

Pick a site or task you automate often that has no domain skill yet, or whose skill is out of date.
Run a real task to allow the agent to discover reliable flows and edge cases.
The agent generates the skill folder and files reflecting the successful approach.
Submit a PR with the new or updated skill, along with notes on how to test it.

Domain skills are what make the approach scale: over time, you accumulate a library of tested patterns, so the agent spends less time re-investigating the same sites across sessions and tasks.

Domain Skills: How to Enable and Use

To enable domain skills, set BHDOMAINSKILLS=1. The agent then gets per-site playbooks, surfaced by domain through goto_url. For example, per-site skills can cover:

LinkedIn outreach sequences
Amazon product workflows (search, compare, purchase steps)
Expense reporting flows
Other sites you frequently automate

As you contribute more skills, the agent learns more reliable heuristics for each domain, reducing the cycle time between task initiation and successful completion.

The Bitter Lesson, and Web Agents That Learn

For those curious about the philosophical or practical underpinnings of agent learning in harnesses, two thought-provoking posts are worth a read:

The Bitter Lesson of Agent Harnesses
Web Agents That Actually Learn

These resources highlight the idea that the most powerful code is often the code that learns from its own interactions, rather than static, hand-authored rules. Browser Harness embodies that principle by letting the agent and harness co-evolve, with the agent proposing improvements and the harness incorporating them into future runs.

A Look at the Tools and Files You’ll Encounter

install.md: Step-by-step instructions for getting the browser harness up and running, including environment preparation and browser bootstrap.
SKILL.md: Practical guidelines for day-to-day usage, including how the agent should approach tasks, how to interpret results, and how to handle failures gracefully.
src/browser_harness/: The protected core package that enables safe, consistent CDP interactions with the browser.
agent-workspace/agent_helpers.py: The editable layer the agent writes into as it discovers new utilities or fixes gaps.
agent-workspace/domain-skills/: A repository of reusable, domain-specific skills that can be extended or used as a starting point for new tasks.

Usage Patterns and Practical Scenarios

Automating professional workflows: The agent can navigate enterprise sites, fill out forms, collect data, and submit tasks with minimal manual intervention.
Real-time debugging and data collection: If a page changes, the agent can adapt on the fly, writing new helpers or updating domain skills to cope with the change.
Learning-based automation: Each run produces insights, and the harness records improvements for subsequent tasks, creating a form of cumulative intelligence.

Security and Best Practices

The harness provides a narrow, well-defined surface to the browser via CDP. It is important to maintain careful boundaries around what the agent is allowed to do, especially in sensitive environments or with restricted data.
Regularly review the agent-generated code in agent-workspace/agent_helpers.py and the domain skills to ensure there are no unintended actions or data exposures.
Use cloud-based deployments cautiously. While the cloud tier offers convenience and scalability, it also distributes the execution environment. Ensure you’re compliant with organizational policies and data governance.

An Example of How a Task Unfolds

You initiate a task: you ask the agent to extract job postings from a site, fill out forms, and save the results.
The agent navigates to the target page, detects a missing helper or failed interaction, and writes a new helper in agent-workspace/agent_helpers.py.
The harness reuses the newly created helper in subsequent iterations, refining the interaction pattern.
The agent develops a domain skill under domain-skills for that site, encoding the selectors, flows, and edge cases it encountered, so future tasks on this site can proceed more efficiently.
You set BHDOMAINSKILLS=1 so that the newly created skill is surfaced as a per-site playbook on later runs.

Documentation and Community Resources

The project maintains documentation for install steps and usage patterns (install.md and SKILL.md).
For more ideas, examples, and community-contributed skills, explore the domain-skills directory in the repository and examine existing skills for platforms like LinkedIn, Amazon, and related domains.

Connecting and Exploring Further

The screenshots referenced in the setup section are embedded in the original documentation:
Banner: artwork illustrating Browser Harness, shown at the top of the original post
Remote debugging setup: docs/setup-remote-debugging.png
Allow remote debugging: docs/allow-remote-debugging.png

Closing Thoughts: Why Browser Harness Matters

Browser Harness represents a shift toward more autonomous, learning-capable browser automation. By linking an agent directly to a live browser through a lean CDP harness, you remove unnecessary layers that can obscure behavior and limit adaptability. The agent’s capacity to identify gaps, generate new helpers, and contribute domain skills on the fly creates a self-improving loop that compounds over time. This approach can accelerate complex browser tasks, reduce manual tedium, and enable a form of practical intelligence that gets sharper with every run.

If you are exploring ways to build smarter automation, or you want to empower an LLM with direct browser control while maintaining the ability to improve itself through experience, Browser Harness offers a compelling framework. It balances openness and control: the agent handles strategy and task planning, while the harness provides a minimal, editable, and robust channel to the real browser. This arrangement makes it possible to tailor automation to real-world workflows and continually refine the system as your needs evolve.
