This extension enables you to control a real Chrome browser instance to inspect, debug, and automate web interactions via the Model Context Protocol (MCP).
You have access to a suite of tools categorized by function. Use these tools to perform actions in the browser.
new_page({ url }): Open a new tab/window with the specified URL.navigate_page({ url }): Navigate the currently selected page to a new URL.list_pages(): List all open tabs/pages. Always check this first if you are unsure of the current browser state or which page is active.select_page({ pageIdx }): Switch the active context to a different tab by its index (fromlist_pages).close_page({ pageIdx }): Close a specific tab.
Note: Most interaction tools require a uid (Unique Identifier) for the target element.
click({ uid }): Click an element identified by itsuid.fill({ uid, value }): Type text into an input field identified by itsuid.press_key({ key }): Send keyboard events (e.g., 'Enter', 'Tab', 'ArrowDown'). Useful when specific UI elements are hard to target or for global shortcuts.hover({ uid }): Hover the mouse cursor over an element.drag({ from_uid, to_uid }): Drag an element to another location.
take_snapshot(): CRITICAL TOOL. Returns a structural representation (Accessibility Tree) of the current page withuids for every interactable element.- Usage: Call this before trying to click or fill anything to find the correct
uid.
- Usage: Call this before trying to click or fill anything to find the correct
take_screenshot(): Captures a visual screenshot of the current viewport or specific element. Use this when visual verification is needed.list_console_messages(): specific debugging of JavaScript errors or logs.evaluate_script({ function }): Execute custom JavaScript in the page context.- Usage: Best for complex data extraction (scraping) or interactions that standard tools cannot handle.
list_network_requests(): View recent network activity. Useful for debugging API calls or resource loading issues.performance_start_trace()/performance_stop_trace(): Record a performance profile to analyze loading speed or runtime performance.
-
The "Snapshot-First" Workflow:
-
You cannot interact with elements using CSS selectors or XPaths directly in tool calls.
-
Step 1: Call
take_snapshot()to get the current page structure. -
Step 2: Analyze the returned tree to find the
uidof the element you want (e.g., a button with text "Submit"). -
Step 3: Call
click({ uid: "..." })using that specific ID. -
Do not guess UIDs. They are dynamic and generated per snapshot.
-
Browser Context:
-
This extension controls a specific Chrome instance with its own dedicated profile.
-
By default, this profile persists between sessions (in your cache) but is separate from your main personal Chrome profile.
-
Do not assume you have access to the user's personal cookies, sessions, or history unless explicitly stated.
-
-
Handling Dynamic Content:
- After an action that loads new content (like clicking "Next"), the page might take a moment to update.
- Use
wait_for({ text: "..." })if you need to ensure specific content is visible before proceeding. - If an action fails with "element not found", the page likely updated. Run
take_snapshot()again to get fresh UIDs.
-
Debugging vs. Visuals:
- Use
take_snapshotfor functional understanding (what can I click?). - Use
take_screenshotfor visual understanding (does this look right?).
- Use
- If
take_snapshotreturns too much data: The output might be truncated. Focus on specific sections if possible, or useevaluate_scriptto query specific properties. - If the browser seems stuck: Use
navigate_page({ type: "reload" })to refresh. - If tool calls fail repeatedly: The
uids might be stale. Always refresh your knowledge of the page with a newtake_snapshotafter significant interactions.