Commit fab3315

Merge pull request #58 from angiejones/bidi-support
Add WebDriver BiDi support for real-time browser diagnostics
2 parents 74caf79 + f52c27d commit fab3315

6 files changed, with 403 additions and 6 deletions
AGENTS.md

Lines changed: 22 additions & 2 deletions
@@ -37,12 +37,14 @@ mcp-selenium/
 ├── browser.test.mjs ← start_browser, close_session, take_screenshot, multi-session
 ├── navigation.test.mjs ← navigate, all 6 locator strategies
 ├── interactions.test.mjs ← click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file
+├── bidi.test.mjs ← BiDi enablement, console/error/network capture, session isolation
 └── fixtures/ ← HTML files loaded via file:// URLs
     ├── locators.html
     ├── interactions.html
     ├── mouse-actions.html
     ├── drag-drop.html
-    └── upload.html
+    ├── upload.html
+    └── bidi.html
 ```
 
 ### Key Files in Detail
@@ -82,20 +84,37 @@ All browser state is held in a module-level `state` object:
 ```js
 const state = {
     drivers: new Map(), // sessionId → WebDriver instance
-    currentSession: null // string | null — the active session ID
+    currentSession: null, // string | null — the active session ID
+    bidi: new Map() // sessionId → { available, consoleLogs, pageErrors, networkLogs }
 };
 ```
 
 - **Session IDs** are formatted as `{browser}_{Date.now()}` (e.g., `chrome_1708531200000`)
 - Only one session is "current" at a time (set by `start_browser`, cleared by `close_session`)
 - Multiple sessions can exist in the `drivers` Map, but tools always operate on `currentSession`
+- **BiDi state** is a single Map of per-session objects — cleanup is one `state.bidi.delete(sessionId)` call
 
 ### Helper Functions
 
 | Function | Purpose |
 |----------|---------|
 | `getDriver()` | Returns the WebDriver for `state.currentSession`. Throws if no active session. |
 | `getLocator(by, value)` | Converts a locator strategy string (`"id"`, `"css"`, `"xpath"`, `"name"`, `"tag"`, `"class"`) to a Selenium `By` object. |
+| `newBidiState()` | Returns a fresh `{ available, consoleLogs, pageErrors, networkLogs }` object for a new session. |
+| `setupBidi(driver, sessionId)` | Wires up BiDi event listeners (console, JS errors, network) for a session. Called from `start_browser`. |
+| `registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage)` | Factory that registers a diagnostic tool. All three BiDi tools (`get_console_logs`, `get_page_errors`, `get_network_logs`) use this — don't copy-paste a new handler, call this instead. |
+
+### Diagnostics (WebDriver BiDi)
+
+The server automatically enables [WebDriver BiDi](https://w3c.github.io/webdriver-bidi/) when starting a browser session. BiDi provides real-time, passive capture of browser diagnostics — console messages, JavaScript errors, and network activity are collected in the background without any extra configuration.
+
+This is especially useful for AI agents: when something goes wrong on a page, the agent can check `get_console_logs` and `get_page_errors` to understand *why*, rather than relying solely on screenshots.
+
+- **Automatic**: BiDi is enabled by default when the browser supports it
+- **Graceful fallback**: If the browser or driver doesn't support BiDi, the session starts normally and the diagnostic tools return a helpful message
+- **No performance impact**: Logs are passively captured via event listeners — no polling or extra requests
+- **Per-session**: Each browser session has its own log buffers, cleaned up automatically on session close
+- **BiDi modules are dynamically imported** at the top of `server.js` — if the selenium-webdriver version doesn't include them, `LogInspector` and `Network` are set to `null` and all BiDi code is skipped
 
 ### Cleanup
 
@@ -232,6 +251,7 @@ Tests talk to the real MCP server over stdio using JSON-RPC 2.0. No mocking.
 | `browser.test.mjs` | start_browser, close_session, take_screenshot, multi-session |
 | `navigation.test.mjs` | navigate, all 6 locator strategies (id, css, xpath, name, tag, class) |
 | `interactions.test.mjs` | click, send_keys, get_element_text, hover, double_click, right_click, press_key, drag_and_drop, upload_file |
+| `bidi.test.mjs` | BiDi enablement, console log capture, page error capture, network log capture, session isolation |
 
 ### When Adding a New Tool

README.md

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,10 @@ A Model Context Protocol (MCP) server implementation for Selenium WebDriver, ena
 - Upload files
 - Support for headless mode
 - Manage browser cookies (add, get, delete)
+- **Real-time diagnostics** via WebDriver BiDi:
+  - Console log capture (info, warn, error)
+  - JavaScript error detection with stack traces
+  - Network request monitoring (successes and failures)
 
 ## Supported Browsers
 
@@ -791,6 +795,54 @@ Deletes cookies from the current browser session. Deletes a specific cookie by n
 }
 ```
 
+### get_console_logs
+Retrieves captured browser console messages (log, warn, error, etc.). Console logs are automatically captured in the background via WebDriver BiDi when the browser supports it — no configuration needed.
+
+**Parameters:**
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| clear | boolean | No | Clear the captured logs after retrieving them (default: false) |
+
+**Example:**
+```json
+{
+  "tool": "get_console_logs",
+  "parameters": {}
+}
+```
+
+### get_page_errors
+Retrieves captured JavaScript errors and uncaught exceptions with full stack traces. Errors are automatically captured in the background via WebDriver BiDi.
+
+**Parameters:**
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| clear | boolean | No | Clear the captured errors after retrieving them (default: false) |
+
+**Example:**
+```json
+{
+  "tool": "get_page_errors",
+  "parameters": {}
+}
+```
+
+### get_network_logs
+Retrieves captured network activity including successful responses and failed requests. Network logs are automatically captured in the background via WebDriver BiDi.
+
+**Parameters:**
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| clear | boolean | No | Clear the captured logs after retrieving them (default: false) |
+
+**Example:**
+```json
+{
+  "tool": "get_network_logs",
+  "parameters": {}
+}
+```
+
 ## License
 
 MIT
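
Going by the `registerBidiTool` handler in `src/lib/server.js`, each of the three tools documented above replies with either a JSON array or a plain-text message such as "No network activity captured". The sketch below shows one way a client might pick out failures from a `get_network_logs` reply. It is an illustration only: `failedRequests` is a hypothetical helper, the sample entries are made up, and only the field names are taken from the objects pushed in `setupBidi`.

```javascript
// Made-up sample of the text a client might receive from get_network_logs;
// the field names follow the objects pushed in setupBidi.
const toolText = JSON.stringify([
    { type: 'response', url: 'https://example.com/', status: 200, method: 'GET', mimeType: 'text/html', timestamp: 1 },
    { type: 'response', url: 'https://example.com/api/user', status: 500, method: 'GET', mimeType: 'application/json', timestamp: 2 },
    { type: 'error', url: 'https://example.com/missing.js', method: 'GET', errorText: 'net::ERR_FAILED', timestamp: 3 }
]);

// Hypothetical client-side helper: keep fetch errors and HTTP 4xx/5xx responses.
function failedRequests(text) {
    let entries;
    try {
        entries = JSON.parse(text);
    } catch (_) {
        return []; // plain-text replies such as "No network activity captured"
    }
    if (!Array.isArray(entries)) return [];
    return entries.filter(
        (e) => e.type === 'error' || (typeof e.status === 'number' && e.status >= 400)
    );
}

console.log(failedRequests(toolText).map((e) => e.url));
console.log(failedRequests('No network activity captured'));
```

Treating an unparseable reply as "no failures" keeps the helper usable against the empty and unavailable messages without special-casing their wording.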

src/lib/server.js

Lines changed: 142 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,18 @@ import { Options as FirefoxOptions } from 'selenium-webdriver/firefox.js';
 import { Options as EdgeOptions } from 'selenium-webdriver/edge.js';
 import { Options as SafariOptions } from 'selenium-webdriver/safari.js';
 
+// BiDi imports — loaded dynamically to avoid hard failures if not available
+let LogInspector, Network;
+try {
+    LogInspector = (await import('selenium-webdriver/bidi/logInspector.js')).default;
+    const networkModule = await import('selenium-webdriver/bidi/network.js');
+    Network = networkModule.Network;
+} catch (_) {
+    // BiDi modules not available in this selenium-webdriver version
+    LogInspector = null;
+    Network = null;
+}
+
 
 // Create an MCP server
 const server = new McpServer({
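
The optional-import fallback in this hunk can be tried without Selenium installed. What follows is a minimal sketch rather than the server's code: `loadOptional` is a hypothetical helper, and `node:path` plus a deliberately missing module name stand in for the two BiDi modules.

```javascript
// Hypothetical helper: resolve to a module's default export (or its namespace),
// or to null when the module cannot be loaded. This mirrors the try/catch
// fallback used for the BiDi imports.
async function loadOptional(specifier) {
    try {
        const mod = await import(specifier);
        return mod.default ?? mod;
    } catch (_) {
        return null; // callers treat null as "feature unavailable"
    }
}

// Stand-ins for the real BiDi module paths (assumptions for illustration only).
const present = loadOptional('node:path');
const missing = loadOptional('this-module-does-not-exist-xyz');

Promise.all([present, missing]).then(([pathModule, absent]) => {
    console.log(typeof pathModule.join); // a Node builtin always loads
    console.log(absent);                 // the missing module falls back to null
});
```

Later code can then guard every BiDi call with a simple truthiness check, as the commit does with `if (LogInspector && Network)`.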
@@ -20,7 +32,8 @@ const server = new McpServer({
 // Server state
 const state = {
     drivers: new Map(),
-    currentSession: null
+    currentSession: null,
+    bidi: new Map()
 };
 
 // Helper functions
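
The new `bidi` Map keeps one buffer object per session. The sketch below exercises that layout in isolation to show why sessions cannot see each other's logs and why cleanup is a single `delete`. The session IDs and the log entry are made up; `newBidiState` has the same shape as the helper added in this commit.

```javascript
// Same shape as the objects stored in state.bidi in this commit.
const newBidiState = () => ({
    available: false,
    consoleLogs: [],
    pageErrors: [],
    networkLogs: []
});

const state = { drivers: new Map(), currentSession: null, bidi: new Map() };

// Two sessions with made-up IDs.
state.bidi.set('chrome_1', newBidiState());
state.bidi.set('firefox_2', newBidiState());

// Writing to one session's buffer leaves the other untouched, because
// newBidiState() allocates fresh arrays on every call.
state.bidi.get('chrome_1').consoleLogs.push({ level: 'info', text: 'hello' });

console.log(state.bidi.get('chrome_1').consoleLogs.length);
console.log(state.bidi.get('firefox_2').consoleLogs.length);

// Closing a session drops all three buffers with a single delete.
state.bidi.delete('chrome_1');
console.log(state.bidi.has('chrome_1'));
```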
@@ -44,6 +57,80 @@ const getLocator = (by, value) => {
     }
 };
 
+// BiDi helpers
+const newBidiState = () => ({
+    available: false,
+    consoleLogs: [],
+    pageErrors: [],
+    networkLogs: []
+});
+
+async function setupBidi(driver, sessionId) {
+    const bidi = newBidiState();
+
+    const logInspector = await LogInspector(driver);
+    await logInspector.onConsoleEntry((entry) => {
+        try {
+            bidi.consoleLogs.push({
+                level: entry.level, text: entry.text, timestamp: entry.timestamp,
+                type: entry.type, method: entry.method, args: entry.args
+            });
+        } catch (_) { /* ignore malformed entry */ }
+    });
+    await logInspector.onJavascriptLog((entry) => {
+        try {
+            bidi.pageErrors.push({
+                level: entry.level, text: entry.text, timestamp: entry.timestamp,
+                type: entry.type, stackTrace: entry.stackTrace
+            });
+        } catch (_) { /* ignore malformed entry */ }
+    });
+
+    const network = await Network(driver);
+    await network.responseCompleted((event) => {
+        try {
+            bidi.networkLogs.push({
+                type: 'response', url: event.request?.url, status: event.response?.status,
+                method: event.request?.method, mimeType: event.response?.mimeType, timestamp: Date.now()
+            });
+        } catch (_) { /* ignore malformed event */ }
+    });
+    await network.fetchError((event) => {
+        try {
+            bidi.networkLogs.push({
+                type: 'error', url: event.request?.url, method: event.request?.method,
+                errorText: event.errorText, timestamp: Date.now()
+            });
+        } catch (_) { /* ignore malformed event */ }
+    });
+
+    bidi.available = true;
+    state.bidi.set(sessionId, bidi);
+}
+
+function registerBidiTool(name, description, logKey, emptyMessage, unavailableMessage) {
+    server.tool(
+        name,
+        description,
+        { clear: z.boolean().optional().describe("Clear after returning (default: false)") },
+        async ({ clear = false }) => {
+            try {
+                getDriver();
+                const bidi = state.bidi.get(state.currentSession);
+                if (!bidi?.available) {
+                    return { content: [{ type: 'text', text: unavailableMessage }] };
+                }
+                const logs = bidi[logKey];
+                const result = logs.length === 0 ? emptyMessage : JSON.stringify(logs, null, 2);
+                if (clear) bidi[logKey] = [];
+                return { content: [{ type: 'text', text: result }] };
+            } catch (e) {
+                return { content: [{ type: 'text', text: `Error: ${e.message}` }], isError: true };
+            }
+        }
+    );
+}
+
 // Common schemas
 const browserOptionsSchema = z.object({
     headless: z.boolean().optional().describe("Run browser in headless mode"),
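
The factory in this hunk serializes the buffer before it clears it, so a call with `clear: true` still returns the entries it is about to drop. The sketch below reproduces that ordering with stand-ins: the `server` object is a fake that only records handlers, `registerLogTool` is a simplified hypothetical variant of `registerBidiTool` (no zod schema, no `getDriver()` check, synchronous, returning bare strings), and the log entry is made up.

```javascript
// Stand-in for McpServer: records handlers instead of exposing them over MCP.
const handlers = new Map();
const server = { tool: (name, _description, handler) => handlers.set(name, handler) };

const state = {
    currentSession: 'chrome_1',
    bidi: new Map([
        ['chrome_1', {
            available: true,
            consoleLogs: [{ level: 'warn', text: 'deprecated' }],
            pageErrors: []
        }]
    ])
};

// Simplified factory: same read-then-optionally-clear order as registerBidiTool.
function registerLogTool(name, description, logKey, emptyMessage, unavailableMessage) {
    server.tool(name, description, ({ clear = false } = {}) => {
        const bidi = state.bidi.get(state.currentSession);
        if (!bidi?.available) return unavailableMessage;
        const logs = bidi[logKey];
        const result = logs.length === 0 ? emptyMessage : JSON.stringify(logs, null, 2);
        if (clear) bidi[logKey] = []; // result was serialized first, so it survives the reset
        return result;
    });
}

registerLogTool('get_console_logs', 'console messages', 'consoleLogs', 'No console logs captured', 'unavailable');
registerLogTool('get_page_errors', 'page errors', 'pageErrors', 'No page errors captured', 'unavailable');

const first = handlers.get('get_console_logs')({ clear: true });
const second = handlers.get('get_console_logs')({});
console.log(first);
console.log(second);
```

Because the handler reassigns `bidi[logKey]` rather than truncating the old array, event listeners that push through `bidi.consoleLogs` pick up the new array on their next write.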
@@ -69,6 +156,14 @@ server.tool(
         let builder = new Builder();
         let driver;
         let warnings = [];
+
+        // Enable BiDi websocket if the modules are available
+        if (LogInspector && Network) {
+            // 'ignore' prevents BiDi from auto-dismissing alert/confirm/prompt dialogs,
+            // allowing accept_alert, dismiss_alert, and get_alert_text to work as expected.
+            builder = builder.withCapabilities({ 'webSocketUrl': true, 'unhandledPromptBehavior': 'ignore' });
+        }
+
         switch (browser) {
             case 'chrome': {
                 const chromeOptions = new ChromeOptions();
@@ -134,7 +229,19 @@ server.tool(
         state.drivers.set(sessionId, driver);
         state.currentSession = sessionId;
 
+        // Attempt to enable BiDi for real-time log capture
+        if (LogInspector && Network) {
+            try {
+                await setupBidi(driver, sessionId);
+            } catch (_) {
+                // BiDi not supported by this browser/driver — continue without it
+            }
+        }
+
         let message = `Browser started with session_id: ${sessionId}`;
+        if (state.bidi.get(sessionId)?.available) {
+            message += ' (BiDi enabled: console logs, JS errors, and network activity are being captured)';
+        }
         if (warnings.length > 0) {
             message += `\nWarnings: ${warnings.join(' ')}`;
         }
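
The fallback in this hunk means a failed BiDi setup never fails `start_browser`; it only changes the status message. A minimal sketch of that flow follows. `fakeSetupBidi` and `startMessage` are hypothetical stand-ins, the `supported` flag simulates whether the browser accepts BiDi, and the message suffix is shortened.

```javascript
const bidiState = new Map();

// Hypothetical stand-in for setupBidi: succeeds or throws depending on `supported`.
async function fakeSetupBidi(sessionId, supported) {
    if (!supported) throw new Error('BiDi not supported');
    bidiState.set(sessionId, { available: true });
}

async function startMessage(sessionId, supported) {
    try {
        await fakeSetupBidi(sessionId, supported);
    } catch (_) {
        // Swallowed on purpose: the session stays usable without diagnostics.
    }
    let message = `Browser started with session_id: ${sessionId}`;
    if (bidiState.get(sessionId)?.available) message += ' (BiDi enabled)';
    return message;
}

startMessage('chrome_1', true).then(console.log);
startMessage('safari_2', false).then(console.log);
```

Since the state entry is written only after setup succeeds, the absence of an entry doubles as the "unavailable" signal that the diagnostic tools check.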
@@ -473,10 +580,14 @@ server.tool(
     async () => {
         try {
             const driver = getDriver();
-            await driver.quit();
-            state.drivers.delete(state.currentSession);
             const sessionId = state.currentSession;
-            state.currentSession = null;
+            try {
+                await driver.quit();
+            } finally {
+                state.drivers.delete(sessionId);
+                state.bidi.delete(sessionId);
+                state.currentSession = null;
+            }
             return {
                 content: [{ type: 'text', text: `Browser session ${sessionId} closed` }]
             };
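
Moving the cleanup into `finally` changes what happens when `driver.quit()` rejects: previously the session entry would have stayed in `state.drivers`, whereas now both Maps are cleared either way. The sketch below demonstrates this with a hypothetical driver whose `quit()` always fails; `closeSession` is a simplified stand-in for the tool handler.

```javascript
const state = { drivers: new Map(), bidi: new Map(), currentSession: null };

// Hypothetical driver whose quit() always fails, e.g. because the browser crashed.
const brokenDriver = { quit: async () => { throw new Error('browser already gone'); } };
state.drivers.set('chrome_1', brokenDriver);
state.bidi.set('chrome_1', { available: true, consoleLogs: [], pageErrors: [], networkLogs: [] });
state.currentSession = 'chrome_1';

async function closeSession() {
    const sessionId = state.currentSession;
    const driver = state.drivers.get(sessionId);
    try {
        await driver.quit();
    } finally {
        // Runs whether quit() resolved or rejected, so no stale entries remain.
        state.drivers.delete(sessionId);
        state.bidi.delete(sessionId);
        state.currentSession = null;
    }
}

// The rejection still propagates; here it is converted to its message for display.
const closed = closeSession().catch((e) => e.message);
closed.then((outcome) => {
    console.log(outcome);
    console.log(state.drivers.size, state.bidi.size, state.currentSession);
});
```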
@@ -681,6 +792,7 @@ server.tool(
                 console.error(`Error quitting driver for session ${sessionId}:`, quitError);
             }
             state.drivers.delete(sessionId);
+            state.bidi.delete(sessionId);
             state.currentSession = null;
             return {
                 content: [{ type: 'text', text: 'Last window closed. Session ended.' }]
@@ -957,6 +1069,31 @@ server.tool(
     }
 );
 
+// BiDi Diagnostic Tools
+registerBidiTool(
+    'get_console_logs',
+    'returns browser console messages (log, warn, info, debug) captured via WebDriver BiDi. Useful for debugging page behavior, seeing application output, and catching warnings.',
+    'consoleLogs',
+    'No console logs captured',
+    'Console log capture is not available (BiDi not supported by this browser/driver)'
+);
+
+registerBidiTool(
+    'get_page_errors',
+    'returns JavaScript errors and exceptions captured via WebDriver BiDi. Includes stack traces when available. Essential for diagnosing why a page is broken or a feature isn\'t working.',
+    'pageErrors',
+    'No page errors captured',
+    'Page error capture is not available (BiDi not supported by this browser/driver)'
+);
+
+registerBidiTool(
+    'get_network_logs',
+    'returns network activity (completed responses and failed requests) captured via WebDriver BiDi. Shows HTTP status codes, URLs, methods, and error details. Useful for diagnosing failed API calls and broken resources.',
+    'networkLogs',
+    'No network activity captured',
+    'Network log capture is not available (BiDi not supported by this browser/driver)'
+);
+
 // Resources
 server.resource(
     "browser-status",
@@ -986,6 +1123,7 @@ async function cleanup() {
         }
     }
     state.drivers.clear();
+    state.bidi.clear();
     state.currentSession = null;
     process.exit(0);
 }
