Skip to content

Commit 5eb080b

Browse files
LyalinDotComclaude
andcommitted
docs: add Gemini CLI extension context for better tool usage
Add gemini-extension.md as a system instruction file for Gemini CLI. When users install this extension, Gemini CLI will use this context to understand how to properly use the Chrome DevTools MCP tools. This enables Gemini CLI to respond intelligently to user requests like "test my web page" by knowing to: - Use take_snapshot() first to get element UIDs - Follow the snapshot-first workflow for interactions - Handle dynamic content and stale UIDs appropriately Updated gemini-extension.json with contextFileName property to reference the new context file. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 28b8ff8 commit 5eb080b

2 files changed

Lines changed: 65 additions & 0 deletions

File tree

gemini-extension.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
{
22
"name": "chrome-devtools-mcp",
33
"version": "latest",
4+
"contextFileName": "gemini-extension.md",
45
"mcpServers": {
56
"chrome-devtools": {
67
"command": "npx",

gemini-extension.md

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
# Chrome DevTools MCP Extension
2+
3+
This extension enables you to control a real Chrome browser instance to inspect, debug, and automate web interactions via the Model Context Protocol (MCP).
4+
5+
## Core Capabilities & Tools
6+
7+
You have access to a suite of tools categorized by function. Use these tools to perform actions in the browser.
8+
9+
### 1. Navigation & Page Management
10+
- **`new_page({ url })`**: Open a new tab/window with the specified URL.
11+
- **`navigate_page({ url })`**: Navigate the *currently selected* page to a new URL.
12+
- **`list_pages()`**: List all open tabs/pages. **Always check this first** if you are unsure of the current browser state or which page is active.
13+
- **`select_page({ pageIdx })`**: Switch the active context to a different tab by its index (from `list_pages`).
14+
- **`close_page({ pageIdx })`**: Close a specific tab.
15+
16+
### 2. Interaction (Input)
17+
*Note: Most interaction tools require a `uid` (Unique Identifier) for the target element.*
18+
19+
- **`click({ uid })`**: Click an element identified by its `uid`.
20+
- **`fill({ uid, value })`**: Type text into an input field identified by its `uid`.
21+
- **`press_key({ key })`**: Send keyboard events (e.g., 'Enter', 'Tab', 'ArrowDown'). Useful when specific UI elements are hard to target or for global shortcuts.
22+
- **`hover({ uid })`**: Hover the mouse cursor over an element.
23+
- **`drag({ from_uid, to_uid })`**: Drag an element to another location.
24+
25+
### 3. Inspection & Debugging
26+
- **`take_snapshot()`**: **CRITICAL TOOL.** Returns a structural representation (Accessibility Tree) of the current page with `uid`s for every interactable element.
27+
- **Usage**: Call this *before* trying to click or fill anything to find the correct `uid`.
28+
- **`take_screenshot()`**: Captures a visual screenshot of the current viewport or specific element. Use this when visual verification is needed.
29+
- **`list_console_messages()`**: specific debugging of JavaScript errors or logs.
30+
- **`evaluate_script({ function })`**: Execute custom JavaScript in the page context.
31+
- **Usage**: Best for complex data extraction (scraping) or interactions that standard tools cannot handle.
32+
33+
### 4. Network & Performance
34+
- **`list_network_requests()`**: View recent network activity. Useful for debugging API calls or resource loading issues.
35+
- **`performance_start_trace()` / `performance_stop_trace()`**: Record a performance profile to analyze loading speed or runtime performance.
36+
37+
## Operational Guidelines
38+
39+
1. **The "Snapshot-First" Workflow**:
40+
- You cannot interact with elements using CSS selectors or XPaths directly in tool calls.
41+
- **Step 1**: Call `take_snapshot()` to get the current page structure.
42+
- **Step 2**: Analyze the returned tree to find the `uid` of the element you want (e.g., a button with text "Submit").
43+
- **Step 3**: Call `click({ uid: "..." })` using that specific ID.
44+
- *Do not guess UIDs.* They are dynamic and generated per snapshot.
45+
46+
- **Browser Context**:
47+
- This extension controls a *specific* Chrome instance with its own dedicated profile.
48+
- By default, this profile **persists** between sessions (in your cache) but is **separate** from your main personal Chrome profile.
49+
- Do not assume you have access to the user's personal cookies, sessions, or history unless explicitly stated.
50+
51+
3. **Handling Dynamic Content**:
52+
- After an action that loads new content (like clicking "Next"), the page might take a moment to update.
53+
- Use `wait_for({ text: "..." })` if you need to ensure specific content is visible before proceeding.
54+
- If an action fails with "element not found", the page likely updated. Run `take_snapshot()` again to get fresh UIDs.
55+
56+
4. **Debugging vs. Visuals**:
57+
- Use `take_snapshot` for *functional* understanding (what can I click?).
58+
- Use `take_screenshot` for *visual* understanding (does this look right?).
59+
60+
## Common Troubleshooting
61+
62+
- **If `take_snapshot` returns too much data**: The output might be truncated. Focus on specific sections if possible, or use `evaluate_script` to query specific properties.
63+
- **If the browser seems stuck**: Use `navigate_page({ type: "reload" })` to refresh.
64+
- **If tool calls fail repeatedly**: The `uid`s might be stale. Always refresh your knowledge of the page with a new `take_snapshot` after significant interactions.

0 commit comments

Comments
 (0)