feat(codex): experimental Codex agent mode via the Agent Client Protocol by pjdoland · Pull Request #380 · plmbr/notebook-intelligence

pjdoland · 2026-06-22T18:57:25Z

Draft / request for feedback. I'm opening this early to start a conversation about the approach before polishing it, not to merge as-is. Feedback on the architecture, scope, and UX is very welcome.

Summary

This adds an opt-in, experimental Codex (OpenAI) agent mode that drives the chat panel over the Agent Client Protocol (ACP), the way Claude mode drives it today. It is off by default and gated behind an admin policy that defaults to force-off, so it has no effect unless explicitly enabled.

The goal of this first pass is to prove that an external coding agent can run the chat loop end to end (streaming replies, tool-call cards with diffs, per-tool approval) and invoke an NBI MCP tool, using ACP as the integration layer.
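To make the "ACP as the integration layer" idea concrete, here is a minimal sketch of how one ACP session update might be mapped onto a chat surface. It is not the code in acp_agent.py. The `sessionUpdate`, `toolCallId`, and `diff` field names reflect my reading of the ACP schema and should be checked against the pinned SDK; `map_update` and the `surface` dictionaries are hypothetical names invented for illustration.

```python
from typing import Any, Dict, Optional


def map_update(update: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map one ACP-style session update (as a plain dict) to a chat surface.

    Field names are assumptions based on the ACP schema; the returned
    dictionaries are a hypothetical stand-in for NBI's chat responses.
    """
    kind = update.get("sessionUpdate")

    if kind == "agent_message_chunk":
        # Streamed reply text becomes a markdown chunk.
        content = update.get("content") or {}
        if content.get("type") == "text":
            return {"surface": "markdown", "text": content.get("text", "")}
        return None

    if kind in ("tool_call", "tool_call_update"):
        # Tool activity becomes a tool-call card; diff items ride along.
        items = update.get("content") or []
        diffs = [item for item in items if item.get("type") == "diff"]
        return {
            "surface": "tool_card",
            "id": update.get("toolCallId"),
            "title": update.get("title"),
            "status": update.get("status"),
            "diffs": diffs,
        }

    # Anything unrecognised is ignored rather than rendered.
    return None
```

Ignoring unknown update kinds keeps the chat panel stable if the agent emits something the client does not yet understand.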

What's included

ACP client backend (acp_agent.py): a persistent worker thread that launches the Codex ACP agent and maps ACP events onto NBI's existing chat surfaces (markdown, tool-call cards, diffs, approvals). Turns are single-flight and the subprocess lifecycle is hardened. A small stdio MCP server (acp_mcp_server.py) exposes an NBI tool to the agent so the end-to-end tool path is exercised.
Settings + policy: a codex_settings block and a codex_mode admin policy that defaults to force-off, plus OPENAI_API_KEY / OPENAI_BASE_URL / NBI_CODEX_CHAT_MODEL overrides, following the existing policy and settings-override conventions.
Frontend: a Codex settings tab (enable, model, key, base URL) mirroring the Claude tab, and chat-panel awareness so the input and routing follow the active agent.
In-chat agent picker: shown only when more than one agent mode is enabled. It resolves a single active agent from the enabled modes plus a persisted preference (no preference keeps the historical "Claude wins" default). It is a custom icon dropdown, and the agent footer is unified so the Ask/Agent toggle and tool picker stay on the native model path while agent modes show only their brand badge.
Admin gating parity with Claude: a codex_full_access policy (default force-off) pins Codex to approval_policy=untrusted so it asks before anything beyond trusted read-only commands; the value is clamped server-side on both read and write. Setting the policy to user-choice exposes a "Full access" toggle that flips the launch to approval_policy=never. Documented in the README admin table and a new admin-guide section.
Branding: the OpenAI mark for the Codex participant and settings tab.
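The active-agent resolution described for the in-chat picker can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the agent identifiers `"claude"` and `"codex"` and the function names are hypothetical, and the fallback to the first enabled mode is my own choice for the single-agent case.

```python
from typing import Optional, Sequence

# Hypothetical identifiers; the PR does not spell out the real ones.
CLAUDE = "claude"
CODEX = "codex"


def resolve_active_agent(
    enabled: Sequence[str], preference: Optional[str]
) -> Optional[str]:
    """Pick the single active agent from the enabled modes.

    Returns None when no agent mode is enabled (native model path).
    """
    if not enabled:
        return None
    # A persisted preference wins, but only if that mode is still enabled.
    if preference in enabled:
        return preference
    # No usable preference: keep the historical "Claude wins" default.
    if CLAUDE in enabled:
        return CLAUDE
    return enabled[0]


def should_show_picker(enabled: Sequence[str]) -> bool:
    """The picker is shown only when more than one agent mode is enabled."""
    return len(enabled) > 1
```

A stale preference (for a mode that has since been disabled) falls through to the default instead of selecting an unavailable agent.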

Testing

Unit tests for the ACP-to-NBI mapping, approval (including fail-closed), the codex_mode / codex_full_access policy clamps, the active-agent resolution, the approval-arg mapping, and the credential scrub.
Verified live in a running JupyterLab: enabling Codex, a prompt that creates a file and calls the NBI MCP tool (tool-call card, diff, per-tool approval, MCP result), switching agents via the picker, the consolidated footer, and the full-access policy (force-off locks the toggle and pins untrusted; user-choice flips the launch flag to never).
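The policy clamp and the approval-arg mapping exercised by these tests can be sketched as below. Function names are hypothetical, only the two policy values named in this PR are modelled, and treating any other value as force-off is an assumption of the sketch. The `-c approval_policy=<value>` argument shape follows the description under Risks / follow-ups.

```python
from typing import List

FORCE_OFF = "force-off"
USER_CHOICE = "user-choice"


def clamp_full_access(policy: str, requested: bool) -> bool:
    """Server-side clamp: the user's toggle only counts under user-choice.

    Any other policy value is treated as force-off (assumption: fail closed).
    """
    return requested if policy == USER_CHOICE else False


def approval_args(full_access: bool) -> List[str]:
    """Map the clamped toggle onto the launch override for the agent."""
    value = "never" if full_access else "untrusted"
    return ["-c", f"approval_policy={value}"]
```

Applying the clamp on both read and write means a stored "Full access" value from an earlier, more permissive policy cannot leak into a launch once the admin switches back to force-off.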

Design questions I'd love feedback on

Default + scope. Force-off-by-default and Codex-first seem right for an experiment; happy to adjust.
The active-agent picker UX when multiple agents are enabled.

Risks / follow-ups

The agent picker updates optimistically and does not roll back if the config POST fails (unlikely, since the request is local).
In the rare both-agents-enabled config, a few native-path settings sections key on "Claude not active" and now show when Codex is active.
During testing the Codex agent occasionally returned an ACP "Internal error" that appears to be OpenAI-backend side (Claude runs fine through the same path); worth confirming on other setups.
Approval-pin precedence. The full-access gate pins the posture via codex-acp's -c approval_policy command-line override. In the API-key path NBI also isolates CODEX_HOME, so the config base is NBI-controlled; in the ChatGPT-auth path codex uses the user's own ~/.codex. The pin relies on codex honoring the -c override above any config it loads. A possible defense-in-depth follow-up is to also pin sandbox_mode (pending confirmation that codex honors it as a top-level override). On shared deployments, prefer API-key auth and keep NBI_CODEX_MODE_POLICY=force-off unless needed.
The Python ACP SDK trails the core schema in places; pinned to a known-good version here.
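The CODEX_HOME isolation mentioned under approval-pin precedence could look roughly like this when building the subprocess environment. This is a sketch under assumptions: `build_codex_env` and `isolated_home` are hypothetical names, and the real code may set or scrub additional variables.

```python
from typing import Dict, Mapping, Optional


def build_codex_env(
    base_env: Mapping[str, str],
    api_key: Optional[str],
    isolated_home: str,
) -> Dict[str, str]:
    """Build the environment for the Codex ACP subprocess.

    API-key path: point CODEX_HOME at an NBI-controlled directory so the
    config base is not the user's own ~/.codex.
    ChatGPT-auth path (no key): leave the environment untouched, so codex
    falls back to the user's own configuration.
    """
    env = dict(base_env)  # copy so the caller's mapping is never mutated
    if api_key:
        env["OPENAI_API_KEY"] = api_key
        env["CODEX_HOME"] = isolated_home
    return env
```

In the ChatGPT-auth branch nothing constrains what the user's ~/.codex contains, which is why the pin depends on the -c override taking precedence.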

…col (plmbr#378)

Add an opt-in Codex (OpenAI) agent mode that drives the chat panel over the Agent Client Protocol (ACP), the way Claude mode drives it today: streaming replies, tool-call cards with diffs, and per-tool approval. An NBI MCP tool is invoked end to end to prove the path.

- ACP client backend (acp_agent.py) on a persistent worker thread, mapping ACP events onto NBI's chat surfaces, with single-flight turns and a hardened subprocess lifecycle. A small stdio MCP server (acp_mcp_server.py) exposes an NBI tool to the agent.
- codex_settings + a codex_mode admin policy that defaults to force-off, plus OPENAI_API_KEY / OPENAI_BASE_URL / NBI_CODEX_CHAT_MODEL overrides, following the existing seven-place policy and settings-override conventions.
- Frontend: a Codex settings tab (enable, model, key, base URL) mirroring the Claude tab, and chat-panel awareness so the input and routing follow Codex.
- An in-chat agent picker (shown when more than one agent mode is enabled) that resolves a single active agent from the enabled modes and a persisted preference; with no preference the historical Claude-wins default holds. The picker is a custom icon dropdown, and the agent footer is unified (Ask/Agent and tools stay on the native model path; agent modes show only their badge).
- Consistent branding: the OpenAI mark for the Codex participant and tab.
- Admin gating parity with Claude: a codex_full_access policy (default force-off) pins Codex to approval_policy=untrusted so it asks before risky actions, clamped server-side; user-choice exposes a "Full access" toggle that flips the launch to approval_policy=never. README and admin-guide document the controls, the CODEX_HOME isolation, and the residual precedence caveat.

Unit tests cover the ACP-to-NBI mapping, approval, the policy clamps, the active-agent resolution, the approval-arg mapping, and the credential scrub. The live path was verified in a running JupyterLab.

pjdoland added the enhancement New feature or request label Jun 22, 2026

pjdoland force-pushed the feat/378-codex-acp-mode branch from 6c12439 to f842fdf June 22, 2026 19:37

pjdoland requested review from CCDevelopForFun and mbektas June 22, 2026 20:52

pjdoland assigned mbektas and CCDevelopForFun Jun 22, 2026

pjdoland wants to merge 1 commit into plmbr:main from pjdoland:feat/378-codex-acp-mode


3 participants
