Install
$ agentstack add mcp-mingye-lu-agenticcrawler
Open-source listing, not yet scanned by AgentStack. Follow the source repository for install instructions.
About
█████╗ ██████╗██████╗ █████╗ ██╗ ██╗██╗ ██╔══██╗██╔════╝██╔══██╗██╔══██╗██║ ██║██║ ███████║██║ ██████╔╝███████║██║ █╗ ██║██║ ██╔══██║██║ ██╔══██╗██╔══██║██║███╗██║██║ ██║ ██║╚██████╗██║ ██║██║ ██║╚███╔███╔╝███████╗ ╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚══╝╚══╝ ╚══════╝
Browser automation agent that acts and observes. Navigate, click, and extract — or inspect network traffic, debug console errors, profile performance, and audit accessibility. One Rust binary.
Single Rust binary. Full DevTools observability. 42 tools. 25 LLM providers. MCP server built-in.
Why acrawl?
Most browser agents stop at navigate and click. acrawl goes further: the agent can also inspect network requests, analyze console errors, measure page performance, audit accessibility, and intercept network calls — the full DevTools surface, available as first-class agent tools.
It ships as a single Rust binary. No Python runtime, no Node runtime, no Docker. Drop it into any server or CI pipeline and describe a goal; the agent figures out what to visit, what to click, what to inspect, and when it's done.
- One binary, zero runtimes. `cargo build --release` produces a self-contained executable. No Python, no Node runtime: just Rust and a Chromium download for browser automation.
- Acts and observes. The agent has the full DevTools surface as first-class tools: inspect network requests with timing, analyze and deduplicate console logs, stream WebSocket messages, measure page performance (TTFB, resource breakdown), audit cookies and browser storage, measure JS/CSS coverage, run axe-core WCAG accessibility audits, and intercept or mock network calls. No other agent framework exposes this.
- Deterministic where you can, AI where you must. Define loops, conditionals, and parallel branches as JSON scripts, executed without any LLM calls. Fall back to the agent when pages behave unexpectedly. Best of both worlds.
- No code required. Describe the goal in plain English. The agent plans, navigates, and extracts.
- Smart fetching. Static pages are served over HTTP (fast). When JavaScript or interaction is needed, acrawl detects JS framework markers (`__next_data__`, `__nuxt`, `__vue`, `ng-app`, React roots), auth redirects, and short page bodies, then transparently escalates to a headless browser.
- Sub-agent parallelism. Fork child agents onto separate browser tabs with independent state and step budgets. A URL-claiming registry prevents siblings from crawling the same page twice.
- MCP client and server. Extend the agent with custom tools via Model Context Protocol servers. Or flip it: `acrawl mcp` exposes 38 browser and DevTools tools plus `run_goal` to Claude Code, Cursor, VS Code, Zed, and 13 other clients.
- 25 LLM providers. Anthropic, OpenAI, Google Gemini, DeepSeek, AWS Bedrock, Azure OpenAI, Vertex AI, GitHub Copilot, Groq, Mistral, xAI, Cohere, Alibaba DashScope, OpenRouter, and more. Or bring your own via any OpenAI-compatible endpoint.
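The Smart fetching bullet above describes a decision: stay on plain HTTP or escalate to a browser. The sketch below shows one way such a check could look. It is an illustration only, not acrawl's actual implementation: the function name `needs_browser`, the `data-reactroot` stand-in for "React roots", the login path hints, and the 200-character threshold are all assumptions.

```python
import re
from typing import Optional

# Markers from the list above; "data-reactroot" is an assumed stand-in for "React roots".
JS_MARKERS = ("__next_data__", "__nuxt", "__vue", "ng-app", "data-reactroot")
# Hypothetical path fragments that suggest a redirect to a login page.
AUTH_HINTS = ("/login", "/signin", "/auth")
# Hypothetical threshold: fewer visible characters than this counts as a "short" body.
MIN_BODY_CHARS = 200


def needs_browser(status: int, location: Optional[str], html: str) -> bool:
    """Return True when a plain HTTP response looks like it needs a real browser."""
    # Auth redirect: a 3xx response whose Location header points at a login-like URL.
    if 300 <= status < 400 and location:
        if any(hint in location.lower() for hint in AUTH_HINTS):
            return True

    lowered = html.lower()
    # JS framework markers anywhere in the document.
    if any(marker in lowered for marker in JS_MARKERS):
        return True

    # Short body: drop scripts, styles, and tags, then count the remaining text.
    match = re.search(r"<body[^>]*>(.*?)</body>", lowered, re.DOTALL)
    body = match.group(1) if match else lowered
    body = re.sub(r"<(script|style)[^>]*>.*?</\1>", "", body, flags=re.DOTALL)
    text = re.sub(r"<[^>]+>", "", body)
    return len(text.strip()) < MIN_BODY_CHARS
```

A page with plenty of server-rendered text and no markers stays on HTTP; an empty application shell such as `<div id='app'></div>` trips the short-body check.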
How does it compare?
vs. AI web agents and scraping tools
| | acrawl | browser-use | Stagehand | Skyvern | Firecrawl | Playwright MCP | Scrapy | Playwright scripts |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No code needed | Yes | No | No | Partial | No | No | No | No |
| Single binary | Yes | No | No | No | No | No | No | No |
| JS rendering | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| LLM-powered navigation | Yes | Yes | Yes | Yes | Limited | No | No | No |
| No Python / Node needed | Yes | No | No | No | No | No | No | No |
| Form filling / interaction | Yes | Yes | Yes | Yes | No | Yes | No | Yes |
| Sub-agent parallelism | Yes | No | No | Partial | Partial | No | Partial | No |
| 25 LLM providers | Yes | Via LiteLLM | Partial | Partial | N/A | N/A | N/A | N/A |
| MCP client (use tools) | Yes | No | No | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | Yes | Yes | No | No |
| Stealth browser built-in | Yes | Cloud only | Via Browserbase | Cloud only | No | No | No | No |
| DevTools observability (network, console, perf, a11y) | Yes | No | No | No | No | No | No | No |
| Deterministic script layer (zero LLM calls) | Yes | No | Partial | No | No | No | Yes | Yes |
| Open source | Yes | Yes (MIT) | Yes (MIT) | Yes (Apache) | Engine only | Yes (MIT) | Yes (BSD) | Yes (Apache) |
Notes:
- browser-use (85k+ GitHub stars): Python + Playwright, DOM + screenshots, supports GPT/Claude/Gemini/Ollama via LiteLLM, 89.1% WebVoyager. No single binary; requires Python and `pip install`. Every action calls an LLM: 2-5s/step, ~$0.02-0.30/task. Cloud tier adds stealth; self-hosted is bare Playwright.
- Stagehand (Browserbase, 21k+ stars): TypeScript + CDP (v3), mixes deterministic Playwright with AI primitives (`act()`, `extract()`, `observe()`). Action caching reuses successful clicks without re-calling the LLM. Requires Node and, for production, Browserbase cloud hosting.
- Skyvern (21k+ stars, Apache 2.0): vision-first (screenshot-only, no DOM), handles legacy portals and government forms that DOM tools struggle with. No-code cloud UI available. Each step costs vision-model tokens, ~$0.10-0.50/task. 85.85% WebVoyager.
- Firecrawl (82k+ stars): managed scraping API. Returns LLM-ready Markdown, JSON extraction, site-wide crawl. Not an agentic tool — minimal multi-step interaction. Ships an official MCP server. Per-page pricing from $19/month.
- Playwright MCP (Microsoft, 29k+ stars): MCP server that exposes browser control via the accessibility tree. Sub-100ms actions, zero vision tokens. Drives an LLM client's browser rather than having its own reasoning — no autonomous goal navigation. Used in GitHub Copilot Agent.
vs. native LLM provider browsing
Most AI providers offer some form of browsing, but it is designed for conversational information retrieval, not programmatic web automation. Key constraints:
| | acrawl | ChatGPT Agent | Claude Computer Use | Claude in Chrome | Gemini Deep Research | Copilot / Edge |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| Real JS-rendered browser | Yes | Yes (sandboxed cloud VM) | Indirect (dev provides env) | Yes (your Chrome) | No (search API only) | Limited (Bing retrieval) |
| Click / fill forms | Yes | Yes (requires user confirmation) | Yes | Yes | No | Limited |
| Programmable / scriptable | Yes | No | Yes (API beta) | No | No | No |
| Sub-agent parallelism | Yes | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | No | No |
| Returns structured data | Yes | No (text summaries) | No (screenshots) | No | No | No |
| Stealth / anti-bot | Yes | No | No | No | No | No |
| No vendor lock-in | Yes (25 providers) | OpenAI only | Anthropic only | Anthropic only | Google only | OpenAI / Bing only |
| Runs without paid subscription | Yes (OSS; LLM key needed) | No (Plus/Pro/Business) | No (API cost) | No (Max plan) | Partial | Yes (free tier) |
Notes:
- ChatGPT Agent (OpenAI, July 2025): runs in a sandboxed cloud virtual machine with its own Chromium instance. Can browse, click, and fill forms but pauses for user confirmation on sensitive actions (purchases, logins). Uses two modes: a fast text browser for research queries and a visual browser for interaction. Cannot run code in the browser, install extensions, or access your local file system. Susceptible to prompt injection. Available to Plus/Pro/Business subscribers.
- ChatGPT Atlas (OpenAI, October 2025): a full Chromium browser with ChatGPT integrated as a sidebar + agent. Agent mode drives the same sandboxed cloud VM as ChatGPT Agent; core limitations are identical.
- Claude Computer Use (Anthropic API, beta since October 2024): screenshot + mouse/keyboard API for any desktop application, not just browsers. Vision-only — no DOM access. Developers must provide and manage the entire computing environment (typically a Docker container with Xvfb + Firefox). Not a ready-to-use binary. Requires significant infrastructure to operate in production.
- Claude in Chrome (Anthropic Chrome extension, beta November 2025+): lets Claude operate within your existing Chrome session using your real cookies and logins. Available to Max plan subscribers. Not an open API — no programmatic control. Good for interactive personal tasks; not suitable for batch automation.
- Gemini / Deep Research (Google): browsing is grounded via Google Search API calls, not a live browser session. Deep Research synthesizes across many searches but cannot interact with pages (click, fill forms, navigate dynamically). Project Mariner (experimental computer use) is a separate, limited research preview.
- Copilot / Edge (Microsoft): Edge's Copilot Mode uses Bing retrieval with some ability to navigate pages. Real-world tests show high latency (6+ minutes for multi-page comparison tasks) and frequent interruptions for user confirmation. Not a developer API.
Quick Start
Install
Linux / macOS (x64 / ARM64):
curl -fsSL https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.sh | bash
Windows (x64, PowerShell):
irm https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.ps1 | iex
This downloads the latest binary, verifies its SHA256 checksum, and sets up CloakBrowser for stealth browser automation. Requires Node.js 20+ for browser features.
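For readers who download a release manually, the checksum step mentioned above can be reproduced by hand. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are hypothetical, and where the expected digest comes from (for example a checksum file published next to the release) is an assumption, not something this README specifies.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries are never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If `verify` returns `False`, the safe reaction is to discard the download rather than run it.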
acrawl checks for updates on startup and shows a notification when a new version is available.
Build from source
git clone https://github.com/Mingye-Lu/AgenticCrawler.git
cd AgenticCrawler
cargo build --release
# Install CloakBrowser (required for browser automation — binary auto-downloads on first use)
npm install
Browser Extension (optional)
The acrawl Bridge extension lets acrawl control your real browser (with your sessions, cookies, and existing extensions) instead of a headless CloakBrowser instance. Download acrawl-extension.zip from the latest release, unzip it, then load it into your browser:
| Browser | Extensions page | Developer mode toggle |
|---------|----------------|----------------------|
| Chrome | chrome://extensions | Top-right |
| Edge | edge://extensions | Bottom-left |
| Brave | brave://extensions | Top-right |
| Arc / Vivaldi / Opera | ://extensions | Varies |
Enable Developer mode, click Load unpacked, and select the unzipped folder. Then run /extension in the acrawl REPL to connect. See [extension/README.md](extension/README.md) for full setup details.
Configure
# Set up your LLM provider (interactive prompt)
./target/release/acrawl auth anthropic # or: openai, other
Credentials are stored in ~/.acrawl/credentials.json. Override the config directory with ACRAWL_CONFIG_HOME.
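Scripts that need to locate the credentials file (for example to check that a CI runner has been provisioned) can mirror the lookup described above. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the file keeps the name credentials.json when ACRAWL_CONFIG_HOME points elsewhere, which the text above implies but does not state outright.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def config_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the config directory: ACRAWL_CONFIG_HOME if set, else ~/.acrawl."""
    env = os.environ if env is None else env
    override = env.get("ACRAWL_CONFIG_HOME")
    return Path(override) if override else Path.home() / ".acrawl"


def credentials_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Assumed location of the credentials file inside the config directory."""
    return config_home(env) / "credentials.json"
```

Passing an explicit mapping instead of reading `os.environ` keeps the lookup easy to test.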
Run
# Interactive REPL
./target/release/acrawl
# One-shot mode
./target/release/acrawl prompt "scrape all book titles and prices from books.toscrape.com"
# Resume a saved session
./target/release/acrawl --resume session.json /status /compact
Examples
Scrape a product catalog:
acrawl > scrape all book titles, prices, and ratings from books.toscrape.com
The agent navigates to the site, reads the page, extracts the data, paginates through all 50 pages, and returns structured JSON.
Fill and submit a form:
acrawl > go to example.com/contact, fill in name "Jane Doe", email "jane@example.com",
message "Hello", and submit the form
The agent locates form fields, fills them in, clicks submit, and confirms the result.
Monitor a price:
acrawl > check the current price of "Rust in Action" on books.toscrape.com
Single-page extraction — the agent fetches, reads, and returns the price without unnecessary navigation.
Extract from JS-rendered pages:
acrawl > get all repository names and star counts from github.com/trending
Static HTTP won't work here. acrawl detects React/Next.js markers and automatically escalates to a headless browser to render the JavaScript.
Parallel multi-page crawl:
acrawl > scrape the title, author, and price of every book across all 50 pages on books.toscrape.com.
Fork a sub-agent for each page to speed this up.
The agent spawns up to 5 concurrent sub-agents, each on its own browser tab, to crawl pages in parallel. Results are merged when all sub-agents finish.
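The combination of a concurrency cap and a URL-claiming registry can be pictured with a small thread-based model. This is a conceptual sketch, not acrawl's code: `UrlRegistry`, `crawl_page`, and `crawl_all` are hypothetical names, threads stand in for browser tabs, and the trailing-slash normalization is an assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_SUB_AGENTS = 5  # concurrency cap mentioned above


class UrlRegistry:
    """Thread-safe set of claimed URLs; the first caller to claim a URL wins."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claimed = set()

    def claim(self, url: str) -> bool:
        key = url.rstrip("/")  # assumed normalization: ignore a trailing slash
        with self._lock:
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True


def crawl_page(url: str, registry: UrlRegistry):
    """Stand-in for a sub-agent: skip the page if a sibling already claimed it."""
    if not registry.claim(url):
        return None
    return {"url": url, "items": []}  # placeholder for extracted data


def crawl_all(urls):
    """Run at most MAX_SUB_AGENTS workers and merge results once all have finished."""
    registry = UrlRegistry()
    with ThreadPoolExecutor(max_workers=MAX_SUB_AGENTS) as pool:
        results = list(pool.map(lambda u: crawl_page(u, registry), urls))
    return [r for r in results if r is not None]
```

Because the membership check and the insert happen under one lock, two workers can never both claim the same page, whichever reaches it first.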
Features
42-Tool Toolbox
Navigation
| Tool | Description |
|------|-------------|
| navigate | Go to a URL (supports format: markdown/text/html/fit_markdown). Uses HTTP first, auto-escalates to browser when JS is detected. Returns structured content with a page_map. fit_markdown prunes boilerplate DOM nodes before conversion, saving tokens. |
| go_back | Browser back button. Returns page_state with the resulting page structure. |
| scroll | Scroll up or down by pixel amount (pixels, default: 500). Returns page_state after scrolling. |
| switch_tab | Switch to a different browser tab by index. Returns page_state of the new tab. |
| wait | Wait for a CSS selector to reach a given state (visible, hidden, attached, detached) or a fixed timeout (up to 300s). Returns page_state after the condition is met. |
| refresh | Reload the current page. Returns page_state after reload. Use after setting intercept rules to replay the page load with rules active. Seq counter increments for temporal observation queries. |
Content Formats
The navigate tool's format parameter controls how the page is returned:
| Format | Description |
|--------|-------------|
| markdown | Full HTML → markdown conversion. All content preserved. |
| fit_markdown | Recommended. Prunes boilerplate before conversion, saving 30-60% tokens on typical pages. |
| text | Plain text, no markdown. |
| html | Raw HTML. |
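When acrawl runs as an MCP server, a client would select the format through the tool arguments. The sketch below builds a JSON-RPC `tools/call` request in the shape defined by the Model Context Protocol. The tool name `navigate` and the `format` values come from the table above; the argument name `url` and the helper `navigate_request` are assumptions made for illustration.

```python
import json

ALLOWED_FORMATS = {"markdown", "fit_markdown", "text", "html"}


def navigate_request(url: str, fmt: str = "fit_markdown", request_id: int = 1) -> str:
    """Build an MCP tools/call request for the navigate tool as a JSON string."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "navigate",
            # "url" is an assumed argument name; "format" is documented above.
            "arguments": {"url": url, "format": fmt},
        },
    })
```

Validating the format on the client side gives an immediate error instead of a round trip to the server.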
fit_markdown works in two passes:
- Hard-block removal: elements whose `class` or `id` attribute contains any of these strings are removed immediately: `nav`, `footer`, `header`, `sidebar`, `ads`, `comment`, `promo`, `advert`, `social`, `share`.
- Score-based pruning: remaining elements are scored; anything below 0.48 is removed. The score is:

```
0.4 × text_density
+ 0.2 × (1 − link_density)
+ 0.2 × tag_weight
+ 0.1 × class_id_score
+ 0.1 × ln(text_length + 1)
```
Tag weights: article = 1.5 · h1 = 1.2 · h2 = 1.1 · h3/p/section = 1.0 · h4 = 0.9 · h5/table = 0.8 · h6 = 0.7 · span = 0.3 · div/li/ul/ol = 0.5.
Use markdown instead of fit_markdown when the page keeps important content inside elements named sidebar, nav, or similar: for example, metadata panels, related-article links, or author info stored in a sidebar div.
If fit_markdown prunes all content (empty result), the tool automatically falls back to plain text.
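To see how the two passes interact, the weighted sum and tag weights above can be written out directly. This is a sketch, not the pruner acrawl ships: the README does not say how text_density, link_density, or class_id_score are computed, so they are treated as precomputed inputs here. The `Element` type, the 0.5 default weight for unlisted tags, and the plain substring match on class/id are assumptions.

```python
import math
from dataclasses import dataclass

HARD_BLOCK = ("nav", "footer", "header", "sidebar", "ads",
              "comment", "promo", "advert", "social", "share")
TAG_WEIGHTS = {
    "article": 1.5, "h1": 1.2, "h2": 1.1, "h3": 1.0, "p": 1.0, "section": 1.0,
    "h4": 0.9, "h5": 0.8, "table": 0.8, "h6": 0.7, "span": 0.3,
    "div": 0.5, "li": 0.5, "ul": 0.5, "ol": 0.5,
}
THRESHOLD = 0.48


@dataclass
class Element:
    tag: str
    class_id: str          # class and id attribute values joined into one string
    text_density: float    # precomputed input (definition not given in the README)
    link_density: float    # precomputed input
    class_id_score: float  # precomputed input
    text_length: int


def is_hard_blocked(el: Element) -> bool:
    """Pass 1: substring match of class/id against the block list (assumed semantics)."""
    name = el.class_id.lower()
    return any(token in name for token in HARD_BLOCK)


def score(el: Element) -> float:
    """Pass 2: the weighted sum given above."""
    tag_weight = TAG_WEIGHTS.get(el.tag, 0.5)  # hypothetical default for unlisted tags
    return (0.4 * el.text_density
            + 0.2 * (1 - el.link_density)
            + 0.2 * tag_weight
            + 0.1 * el.class_id_score
            + 0.1 * math.log(el.text_length + 1))


def keep(el: Element) -> bool:
    return not is_hard_blocked(el) and score(el) >= THRESHOLD
```

Two consequences follow from the formula as written. If the other terms are non-negative, the logarithmic term alone reaches 0.48 at roughly 121 characters of text, so score-based pruning mostly affects short fragments. And if hard-block matching really is a plain substring test, a class such as `downloads` would be removed because it contains `ads`, which is another case where markdown is the safer format.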
Interaction
| Tool | Description |
|------|-------------|
| click | Click an element by CSS selector, @eN ref, or visible label text. Use text (with optional role/region) to activate a button, tab, or link by its label; handy for SPA admin UIs and modals where CSS paths are fragile. Returns page_state after the click. |
| click_at | Click at specific viewport coordinates (x, y). Use for canvas, maps, or SVGs. Returns page_state. |
| fill_form | Fill form fields by selector, name, @eN ref, or visible label text. Labels resolve page-wide, so fields in modals and div-based UIs without a `<form>` boundary work too. Optional auto-submit. Returns page_state. |
| select_option | Select a dropdown option by value, label, or index. Works on native and custom ARIA/portal dropdowns; omit value/label/index to list the available options without selecting. Returns page_state. |
| hover | Hover over an element to reveal tooltips o
…
Source & license
This open-source MCP server is cataloged on AgentStack and links to its original source — we do not rehost the code.
- Author: Mingye-Lu
- Source: Mingye-Lu/AgenticCrawler
- License: MIT
Install and usage instructions live in the source repository linked above.
Versions
- v0.10.0 Imported from the upstream source.