Obscura: The open-source headless browser for AI agents and web scraping
Obscura is more than just a headless browser—it is a purpose-built engine crafted for automation at scale. Written in Rust, it runs real JavaScript through V8, speaks the Chrome DevTools Protocol, and serves as a drop-in replacement for headless Chrome via Puppeteer and Playwright. It’s lightweight, stealthy, and designed to power AI agents and robust web-scraping workflows without the typical overhead of traditional browsers.
As an open-source project under the Apache-2.0 license, Obscura embraces a philosophy of no feature gating and no hidden constraints. It’s the engine you can rely on, whether you’re building data pipelines, automating extraction tasks, or prototyping AI agents that need to interact with the web in realistic ways.
Why Obscura: A browser engine built for automation at scale
Obscura targets automation and data collection rather than desktop-like browsing. It focuses on speed, footprint, and reliability in headless environments. Here are the core reasons developers choose Obscura over traditional headless Chrome:
- Lightweight memory footprint: Obscura operates with around 30 MB of memory, compared with well over 200 MB for headless Chrome. This difference matters when you run thousands of concurrent sessions.
- Small binary size: A lean ~70 MB binary keeps deployment lightweight, while Chrome tends to exceed 300 MB.
- Built-in anti-detect capabilities: Obscura ships with anti-detection features, making automation more resilient in environments that try to identify automated tooling.
- Fast page loads: On many pages, Obscura achieves sub-100 ms loading times, enabling higher throughput for scraping pipelines.
- Instant startup: The engine starts quickly, reducing warm-up delays and enabling faster task orchestration in large-scale scrapes.
- Puppeteer and Playwright compatibility: It’s designed to be a drop-in replacement for headless Chrome, so you can reuse existing scripts and workflows with familiar tooling.
In short, Obscura is optimized for automation at scale, not personal browsing sessions. It’s the engine you deploy to run large-scale crawls, AI-driven interactions, and complex automation tasks reliably and efficiently.
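To make the memory comparison concrete, here is a back-of-the-envelope sketch. It assumes memory scales linearly with the number of sessions, which the figures above do not guarantee, and it uses 200 MB as a lower bound for headless Chrome. The function names are illustrative only.

```javascript
// Rough capacity estimate. The per-session figures come from the comparison
// above; linear scaling with session count is an assumption of this sketch.
const MB_PER_GIB = 1024;

// Total memory in GiB for a given number of concurrent sessions.
function estimateMemoryGiB(sessions, perSessionMB) {
  return (sessions * perSessionMB) / MB_PER_GIB;
}

// How many sessions fit into a fixed memory budget.
function maxSessions(budgetGiB, perSessionMB) {
  return Math.floor((budgetGiB * MB_PER_GIB) / perSessionMB);
}

const sessions = 1000;
console.log(`Obscura: ~${estimateMemoryGiB(sessions, 30).toFixed(1)} GiB`);
console.log(`Headless Chrome: ~${estimateMemoryGiB(sessions, 200).toFixed(1)} GiB`);
console.log(`Sessions in 16 GiB at 30 MB each: ${maxSessions(16, 30)}`);
console.log(`Sessions in 16 GiB at 200 MB each: ${maxSessions(16, 200)}`);
```

Real workloads will differ, since page content and scripts add to the baseline, so treat the result as an order-of-magnitude guide rather than a sizing guarantee.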
What’s next: Obscura Cloud and a continued open-source vision
The project has hit a notable milestone—thousands of stars mark growing community interest and adoption. Looking ahead, the team is building Obscura Cloud, a hosted version of the engine with managed infrastructure, residential proxies, and dedicated support. This is aimed at teams who want the engine’s power without managing the infrastructure themselves. The core open-source engine remains Apache-2.0 licensed, with no gating of features for users who prefer to self-host.
If you’re curious about the hosted option, you can join the waitlist and be among the first to get access when it launches: Get on the waitlist → https://tally.so/r/gDWzdD
Install: Getting Obscura onto your machine
Obscura ships as prebuilt binaries for Linux, macOS, and Windows, and it can also be built from source. No Chrome, no Node.js, and no other dependencies are required to run the engine.
Download the latest binary from Releases:
Linux x86_64
macOS (Apple Silicon and Intel)
Windows (zip archive)
Release archives include both obscura and obscura-worker binaries. For parallel scraping, keep them in the same directory.
Linux releases target Ubuntu 22.04 (glibc 2.35+) to ensure broad compatibility on common LTS servers.
Code snippets for grabbing the binaries (examples shown are representative; adapt URLs as needed):
- Linux x86_64
# Linux x86_64
curl -LO https://github.com/h4ckf0r0day/obscura/releases/latest/download/obscura-x86_64-linux.tar.gz
tar xzf obscura-x86_64-linux.tar.gz
./obscura fetch https://example.com --eval "document.title"
- macOS Apple Silicon
curl -LO https://github.com/h4ckf0r0day/obscura/releases/latest/download/obscura-aarch64-macos.tar.gz
tar xzf obscura-aarch64-macos.tar.gz
- macOS Intel
curl -LO https://github.com/h4ckf0r0day/obscura/releases/latest/download/obscura-x86_64-macos.tar.gz
tar xzf obscura-x86_64-macos.tar.gz
- Windows
- Download the .zip from the releases page and extract it manually. The archive contains both obscura and obscura-worker.
Important notes:
- No Chrome, no Node.js, no dependencies needed to run.
- Linux release builds target Ubuntu 22.04 to work smoothly on common LTS servers.
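If you script the download step, a small lookup can pick the right archive for the current machine. The sketch below maps Node-style platform and architecture identifiers (as reported by `process.platform` and `process.arch`) to the archive names used in the commands above. The helper name `releaseUrl` is hypothetical, and Windows is deliberately left unmapped because the .zip file name is not given here.

```javascript
// Map Node-style platform/arch identifiers to the release archive names
// used in the download commands above. Windows is left out because the
// article does not name the .zip file.
const RELEASE_BASE =
  'https://github.com/h4ckf0r0day/obscura/releases/latest/download';

const ARCHIVES = new Map([
  ['linux-x64', 'obscura-x86_64-linux.tar.gz'],
  ['darwin-arm64', 'obscura-aarch64-macos.tar.gz'],
  ['darwin-x64', 'obscura-x86_64-macos.tar.gz'],
]);

// Returns the download URL, or null when no archive name is known.
function releaseUrl(platform, arch) {
  const name = ARCHIVES.get(`${platform}-${arch}`);
  return name ? `${RELEASE_BASE}/${name}` : null;
}

console.log(releaseUrl('linux', 'x64'));
console.log(releaseUrl('win32', 'x64')); // null: download the .zip manually
```

In a Node script you would call `releaseUrl(process.platform, process.arch)` and fall back to the releases page when it returns null.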
Build from source (for those who want to customize or contribute):
git clone https://github.com/h4ckf0r0day/obscura.git
cd obscura
cargo build --release
# With stealth mode (anti-detection + tracker blocking)
cargo build --release --features stealth
Prerequisites:
- Rust 1.75+ (recommended via rustup)
- The first build takes about five minutes because V8 compiles from source; the result is cached for subsequent builds.
Quick Start: First steps with Obscura
Here’s a practical path to starting with Obscura and getting familiar with its workflow:
- Fetch a page and extract the title
obscura fetch https://example.com --eval "document.title"
- Extract all links
obscura fetch https://example.com --dump links
- Render JavaScript and dump the resulting HTML
obscura fetch https://news.ycombinator.com --dump html
- Wait for dynamic content to load
obscura fetch https://example.com --wait-until networkidle0
- Bound navigation time for slow or broken pages
obscura fetch https://example.com --timeout 10
Starting a CDP server to drive automation:
obscura serve --port 9222
# With stealth mode (anti-detection + tracker blocking)
obscura serve --port 9222 --stealth
Scraping multiple URLs in parallel:
obscura scrape url1 url2 url3 ... \
--concurrency 25 \
--eval "document.querySelector('h1').textContent" \
--format json
This workflow demonstrates how Obscura fits into typical scraping pipelines and automation tasks: you launch a server, connect via the Chrome DevTools Protocol, and drive multiple pages in parallel with scripting and evaluation hooks.
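One detail worth handling when scraping many URLs: the `--eval` expression in the scrape command above throws a TypeError on any page that has no `h1`. How `obscura scrape` reports such an exception is not described here, so a defensive expression may be the safer choice. The sketch below checks a null-safe variant against stand-in `document` objects. It does not involve Obscura at all, and it assumes the bundled V8 supports optional chaining.

```javascript
// Expression intended for --eval; optional chaining avoids a TypeError when
// a page has no <h1>. Returning null for that case is a choice made for
// this sketch, not documented Obscura behavior.
const H1_EXPR = "document.querySelector('h1')?.textContent?.trim() ?? null";

// Evaluate the expression against a stand-in `document` to check its shape
// locally before handing it to the CLI.
function evalWithDocument(expr, document) {
  return new Function('document', `return (${expr});`)(document);
}

// Minimal fakes: one page with a heading, one without.
const withH1 = { querySelector: () => ({ textContent: '  Hello  ' }) };
const withoutH1 = { querySelector: () => null };

console.log(evalWithDocument(H1_EXPR, withH1));
console.log(evalWithDocument(H1_EXPR, withoutH1));
```

Because the expression contains single quotes, wrap it in double quotes on the command line, as the original command does.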
Puppeteer and Playwright: Using Obscura as a drop-in engine
Obscura is designed to be compatible with existing Puppeteer and Playwright workflows, enabling you to reuse familiar code paths while taking advantage of Obscura’s lean footprint and stealth capabilities.
- Puppeteer usage
- Install Puppeteer Core
npm install puppeteer-core
- Connect to the Obscura CDP server
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://127.0.0.1:9222/devtools/browser'
});
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com');
const stories = await page.evaluate(() =>
  Array.from(document.querySelectorAll('.titleline > a'))
    .map(a => ({ title: a.textContent, url: a.href }))
);
console.log(stories);
await browser.disconnect();
- Playwright usage
- Install Playwright Core
npm install playwright-core
- Connect to the Obscura endpoint
import { chromium } from 'playwright-core';
const browser = await chromium.connectOverCDP({ endpointURL: 'ws://127.0.0.1:9222' });
const page = await browser.newContext().then(ctx => ctx.newPage());
await page.goto('https://en.wikipedia.org/wiki/Web_scraping');
console.log(await page.title());
await browser.close();
Form submission and login examples:
- The following snippet exercises POST handling and redirects, including how Obscura manages cookies and sessions:
await page.goto('https://quotes.toscrape.com/login');
await page.evaluate(() => {
document.querySelector('#username').value = 'admin';
document.querySelector('#password').value = 'admin';
document.querySelector('form').submit();
});
// Obscura handles the POST, follows the 302 redirect, maintains cookies
The key takeaway is that Obscura’s CDP compatibility keeps your existing tooling intact while delivering the engine’s performance and stealth advantages.
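The login snippet above assumes all three elements exist on the page. A slightly more defensive variant is sketched below. The function name `submitLogin` and its return shape are hypothetical. The `doc` parameter defaults to the page's `document`, so the same function can be handed to `page.evaluate()` or exercised outside a browser with a stand-in object.

```javascript
// Fill and submit a login form. `doc` defaults to the page's `document`, so
// the function can be passed to page.evaluate(); passing a stand-in object
// lets it be exercised outside a browser.
function submitLogin(creds, doc = document) {
  const user = doc.querySelector('#username');
  const pass = doc.querySelector('#password');
  const form = doc.querySelector('form');
  if (!user || !pass || !form) {
    return { ok: false, reason: 'login form elements not found' };
  }
  user.value = creds.username;
  pass.value = creds.password;
  form.submit();
  return { ok: true };
}

// Stand-in document for a local check; not a real DOM.
const fields = { '#username': { value: '' }, '#password': { value: '' } };
let submitted = false;
const fakeDoc = {
  querySelector: (sel) =>
    (sel === 'form' ? { submit: () => { submitted = true; } } : (fields[sel] ?? null)),
};

const outcome = submitLogin({ username: 'admin', password: 'admin' }, fakeDoc);
console.log(outcome, submitted);
```

With Puppeteer, the call would look like `page.evaluate(submitLogin, { username: 'admin', password: 'admin' })`, typically paired with `page.waitForNavigation()` so the script waits for the redirect. Whether Obscura emits the events that wait relies on is something to verify against your version.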
Benchmarks
Obscura is designed to deliver fast, predictable performance in automation-heavy scenarios. While exact figures vary by page and workload, a few general observations stand out:
- Static HTML delivery is notably quick with Obscura, often in the tens of milliseconds range on the right hardware.
- When JavaScript execution, XHR, and fetch calls are involved, Obscura remains competitive, enabling rapid data collection across many pages.
- Dynamic script-heavy pages, which can slow standard browsers, tend to resolve more rapidly under Obscura’s optimized pipeline, helping you keep throughput high in scraping tasks.
These observations reflect the engine’s emphasis on speed and efficiency, making it attractive for large-scale crawls and AI-driven web interactions where throughput matters as much as correctness.
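Since exact figures vary by page and workload, it is worth measuring on your own targets. The sketch below is a generic timing harness, not part of Obscura: it runs an async task several times and reports the median. The helper names are hypothetical, and the stand-in task simply sleeps for a few milliseconds.

```javascript
// Median of a list of numbers (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run an async task `runs` times and collect wall-clock durations in ms.
async function timeRuns(task, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = performance.now();
    await task();
    samples.push(performance.now() - start);
  }
  return { medianMs: median(samples), samples };
}

// Stand-in task: in practice this would be something like () => page.goto(url).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
timeRuns(() => sleep(5), 3).then(({ medianMs }) => {
  console.log(`median: ${medianMs.toFixed(1)} ms`);
});
```

The median is used rather than the mean so that a single slow run, such as a cold cache or a network hiccup, does not dominate the result.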
Stealth mode: Anti-fingerprinting and tracker blocking
Stealth mode is a key feature that helps automation blend more naturally into typical user traffic, reducing detectability and improving stability of automated runs.
Anti-fingerprinting features:
Per-session fingerprint randomization for GPU, screen, canvas, audio, and battery attributes
Realistic navigator.userAgentData values (Chrome 145, with high-entropy data)
Event.isTrusted = true for dispatched events to mirror real user interactions
Internal properties hidden from enumeration (for example, Object.keys(window)) so they do not leak
Native function masking, so Function.prototype.toString() returns [native code]
navigator.webdriver is undefined, aligning with real Chrome behavior
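A quick way to see what a page can observe is to collect a few of these signals yourself. The sketch below is an illustrative probe, not an Obscura API: `probeSignals` is a hypothetical helper that reads `navigator.webdriver`, the `userAgentData` brands, and whether `fetch` stringifies as native code. It accepts a window-like object so it can be checked locally with a fake.

```javascript
// Collect a few signals that the list above says stealth mode adjusts.
// `win` defaults to the global object, which is the page's window when the
// function is run through page.evaluate().
function probeSignals(win = globalThis) {
  const nav = win.navigator ?? {};
  const brands = nav.userAgentData?.brands ?? [];
  return {
    // typeof survives JSON serialization, unlike a raw `undefined` value.
    webdriverType: typeof nav.webdriver,
    brands: brands.map((b) => `${b.brand} ${b.version}`),
    fetchLooksNative: typeof win.fetch === 'function'
      && Function.prototype.toString.call(win.fetch).includes('[native code]'),
  };
}

// Stand-in window for a local check. Math.max is used only because it is a
// genuinely native function; the brand entry is made up for illustration.
const fakeWindow = {
  navigator: { userAgentData: { brands: [{ brand: 'Chromium', version: '145' }] } },
  fetch: Math.max,
};
console.log(probeSignals(fakeWindow));
```

Run through `page.evaluate(probeSignals)` with and without `--stealth` to compare what a site would see. The probe covers only a small subset of the signals listed above.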
Tracker blocking:
Blocks thousands of known domains (3,520 domains in the current dataset)
Prevents analytics, ads, telemetry, and fingerprinting scripts from loading
Stops trackers from loading altogether, reducing noise and potential blocking behavior during automation
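How Obscura matches requests against its dataset is not described here. As a generic illustration of domain-based blocking, the sketch below checks a hostname and each of its parent domains against a set. The two blocklist entries are placeholders, not entries from the real dataset, and `isBlocked` is a hypothetical helper.

```javascript
// Domain-suffix matching against a blocklist. The two entries are
// placeholders, not taken from Obscura's 3,520-domain dataset.
const BLOCKLIST = new Set(['tracker.example', 'ads.example']);

function isBlocked(hostname, blocklist = BLOCKLIST) {
  const labels = hostname.toLowerCase().split('.');
  // Check the full hostname, then each parent domain; the bare top-level
  // label is never checked on its own.
  for (let i = 0; i < labels.length - 1; i += 1) {
    if (blocklist.has(labels.slice(i).join('.'))) return true;
  }
  return false;
}

console.log(isBlocked('cdn.tracker.example'));
console.log(isBlocked('nottracker.example'));
```

Matching on whole labels rather than raw string suffixes avoids false positives such as `nottracker.example` being treated as a subdomain of `tracker.example`.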
Stealth mode can be enabled with a simple flag: --stealth
These features help you run automated tasks with fewer anti-bot mitigations triggering and with cleaner data streams.
CDP API: A familiar control surface for Puppeteer/Playwright users
Obscura implements the Chrome DevTools Protocol (CDP), making it easy to connect with Puppeteer and Playwright in a familiar way. This compatibility covers several domains and methods:
- Target domain: creating and closing targets, attaching to targets, creating browser contexts, and disposing of contexts
- Page domain: navigation, frame trees, injecting scripts, and monitoring lifecycle events
- Runtime domain: evaluating code, calling functions on objects, and obtaining properties
- DOM domain: querying and retrieving DOM nodes and HTML
- Network domain: enabling network tracking, cookies management, and request manipulation
- Fetch domain: live interception for enabling, continuing, fulfilling, or failing requests
- Storage domain: cookie management
- Input domain: simulating mouse and keyboard events
- LP domain: converting the DOM to Markdown
This CDP compatibility means you can port existing Puppeteer/Playwright scripts to Obscura with minimal changes, taking advantage of the engine’s smaller footprint and stealth capabilities while retaining familiar APIs and workflows.
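Underneath Puppeteer and Playwright, CDP is a JSON message exchange over a WebSocket: each request carries an `id`, a `method`, and `params`, and the reply echoes the `id` with either a `result` or an `error`. The sketch below shows that bookkeeping in isolation. `createCdpSession` is a hypothetical helper, the transport is a plain callback rather than a real socket, and the reply is simulated. Session routing via `sessionId` is omitted.

```javascript
// Build CDP request frames and match responses to requests by id.
// `send` is any function that accepts a JSON string (e.g. a WebSocket's send).
function createCdpSession(send) {
  let nextId = 1;
  const pending = new Map();
  return {
    call(method, params = {}) {
      const id = nextId;
      nextId += 1;
      send(JSON.stringify({ id, method, params }));
      return new Promise((resolve, reject) => pending.set(id, { resolve, reject }));
    },
    // Returns true when the message answered a pending request.
    handleMessage(raw) {
      const msg = JSON.parse(raw);
      const entry = pending.get(msg.id);
      if (!entry) return false; // an event, or an id we are not waiting on
      pending.delete(msg.id);
      if (msg.error) entry.reject(new Error(msg.error.message));
      else entry.resolve(msg.result);
      return true;
    },
  };
}

const sent = [];
const session = createCdpSession((frame) => sent.push(frame));
session
  .call('Runtime.evaluate', { expression: 'document.title', returnByValue: true })
  .then((result) => console.log(result.result.value));

// Simulated server reply to request 1 (not real Obscura output).
const handled = session.handleMessage(JSON.stringify({
  id: 1,
  result: { result: { type: 'string', value: 'Example Domain' } },
}));
```

In most projects Puppeteer or Playwright handles this layer for you; the sketch is only meant to show what travels over the wire.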
CLI Reference: Core commands and options (high-level overview)
Obscura provides a compact command-line interface that mirrors the familiar concepts of Puppeteer/Playwright workflows, but tuned for automation at scale. Here are the essential commands and their primary options:
obscura serve
Starts a CDP WebSocket server that client tools can connect to
Key flags:
- --port: WebSocket port (default 9222)
- --proxy: optional HTTP/SOCKS5 proxy URL
- --stealth: enable stealth mode (anti-detection + tracker blocking)
- --workers: number of parallel worker processes
- --obey-robots: respect robots.txt when crawling (off by default)
obscura fetch
Fetch and render a single page
Key options:
- --dump: output format (html, text, or links)
- --eval: JavaScript expression to evaluate
- --wait-until: navigation wait condition (load, domcontentloaded, networkidle0)
- --timeout: maximum navigation time in seconds
- --selector: wait for a CSS selector to appear
- --stealth: enable anti-detection mode
- --quiet: suppress the banner
obscura scrape
Scrape multiple URLs in parallel with worker processes
Key options:
- --concurrency: number of parallel workers (default 10)
- --eval: JavaScript expression to run per page
- --format: output format (json or text)
These commands offer a straightforward way to perform page evaluation, scraping, and automated testing with a scalable, scriptable interface.
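When driving the CLI from a script, it helps to build the argument list from an options object and reject flags the reference does not list. The sketch below does that using only the flags named above. `buildArgs` is a hypothetical helper, and treating `--stealth`, `--quiet`, and `--obey-robots` as value-less switches is an assumption based on the examples in this article.

```javascript
// Flags copied from the CLI reference above. 'bool' flags are assumed to
// take no value; 'value' flags are followed by one argument.
const FLAGS = {
  serve: { port: 'value', proxy: 'value', stealth: 'bool', workers: 'value', 'obey-robots': 'bool' },
  fetch: {
    dump: 'value', eval: 'value', 'wait-until': 'value', timeout: 'value',
    selector: 'value', stealth: 'bool', quiet: 'bool',
  },
  scrape: { concurrency: 'value', eval: 'value', format: 'value' },
};

const own = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns an argv array such as ['fetch', 'https://example.com', '--timeout', '10'].
function buildArgs(command, positional = [], options = {}) {
  if (!own(FLAGS, command)) throw new Error(`unknown command: ${command}`);
  const known = FLAGS[command];
  const args = [command, ...positional];
  for (const [name, value] of Object.entries(options)) {
    if (!own(known, name)) throw new Error(`unknown flag for ${command}: --${name}`);
    if (known[name] === 'bool') {
      if (value) args.push(`--${name}`);
    } else {
      args.push(`--${name}`, String(value));
    }
  }
  return args;
}

console.log(buildArgs('fetch', ['https://example.com'], {
  eval: 'document.title',
  timeout: 10,
  stealth: true,
}));
```

The resulting array can be passed to Node's `child_process.spawn('obscura', args)`, which avoids shell quoting issues with `--eval` expressions.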
Form submission, login, and session handling
A practical scenario for Obscura involves submitting forms, handling logins, and maintaining session state across redirects. Obscura’s CDP-based engine handles POST requests, follows redirects, and preserves cookies, enabling automation to simulate real user sessions accurately. Whether you’re testing a login flow, performing data extraction behind a login, or navigating a funnel that requires authentication, Obscura provides a robust foundation for maintaining state and reproducing realistic user journeys.
Licensing and project philosophy
Obscura is released under the Apache-2.0 license. The project emphasizes an open, permissive approach to feature development, with no gating of capabilities in the open-source engine. If you rely on the engine for critical automation tasks, you can trust that the core is stable, battle-tested, and transparent. The roadmap includes expanding cloud-hosted offerings (Obscura Cloud) while preserving the open-source engine for self-hosted deployments.
Final thoughts: Why you should consider Obscura
- If your work hinges on automation at scale, Obscura’s low memory footprint, compact binary size, and fast startup give you a meaningful throughput advantage.
- If you need realistic automation that resists detection and minimizes data noise from trackers, the stealth features provide practical benefits in production environments.
- If you already rely on Puppeteer or Playwright, Obscura offers a smooth path to migration, letting you leverage your existing scripts with a more efficient engine.
- If you want a robust ecosystem for AI agents that interact with the web, Obscura’s architecture is designed to support these use cases with reliable, low-overhead tooling.
The project’s vision remains rooted in openness and performance, with a clear path toward hosted infrastructure through Obscura Cloud, should you prefer managed services and dedicated support. Whether you’re scraping data, training AI agents, or automating complex web interactions, Obscura aims to be the dependable backbone of your automation toolkit.
For updates, community engagement, and early access to the cloud offering, consider joining the waitlist: Get on the waitlist → https://tally.so/r/gDWzdD
Apache-2.0 licensed. Obscura continues to evolve as a lean, fast, and stealthy headless browser engine designed for AI agents and scalable web scraping.
Repository: https://github.com/h4ckf0r0day/obscura