Add live golden-path (Tier 2) pipeline for azd ai agent extension by v1212 · Pull Request #8758 · Azure/azure-dev

v1212 · 2026-06-22T11:39:16Z

Summary

Adds eng/pipelines/ext-azure-ai-agents-live.yml: an on-demand / weekly Azure DevOps pipeline that runs the Tier 2 live golden path (init → provision → deploy → invoke → down) for the azure.ai.agents extension against a real Azure (TME) subscription, for both code and container deploy modes.
Migrates the Tier 2 tmux driver (test_full_e2e.py, test_tier2.py) from the #8692 prototype ("test: add static E2E tests for azure.ai.agents extension") into cli/azd/extensions/azure.ai.agents/tests/e2e-live/, adapting CI auth detection for Azure DevOps (TF_BUILD / E2E_USE_AZ_CLI_AUTH).
Adds a README.md documenting how to run (CI + local) and the one-time ADO setup.
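
The CI auth detection described above could look roughly like the sketch below. This is an illustration, not the driver's actual code: the function name `should_use_az_cli_auth` and the set of accepted override values are assumptions. The only facts taken from the PR are the two variable names, plus the fact that Azure DevOps agents set `TF_BUILD=True`.

```python
import os
from typing import Mapping


def should_use_az_cli_auth(env: Mapping[str, str] = os.environ) -> bool:
    """Decide whether the run should rely on az CLI auth (hypothetical helper)."""
    # Explicit override wins. The accepted truthy spellings are an assumption.
    override = env.get("E2E_USE_AZ_CLI_AUTH", "").strip().lower()
    if override in {"1", "true", "yes"}:
        return True
    # Azure DevOps agents export TF_BUILD=True for every pipeline job.
    return env.get("TF_BUILD", "").strip().lower() == "true"
```

Passing the environment as a mapping keeps the check testable without mutating `os.environ`.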

Why a separate live pipeline

Per Azure SDK EngSys / SFI guidance, live Azure access must stay out of the automatic PR pipeline. This pipeline sets trigger: none and pr: none, so it runs only:

on demand: comment /azp run ext-azure-ai-agents-live (requires write permission), or
weekly: Monday 07:00 UTC.

Together with the PR-gate tests in #8754 (Tier 0 offline + Tier 1 recording/playback), this covers Tier 0/1/2 from the original prototype (#8692).

Next steps to land this (admin / EngSys — cannot be done in PR code)

These are required to actually exercise the pipeline; the PR itself is inert in GitHub CI by design.

Register the ADO pipeline as ext-azure-ai-agents-live (the exact name /azp run uses) against this repo + YAML path.
Confirm the azure-sdk-tests service connection (the serviceConnection parameter default) maps to the TME subscription with RBAC to create Foundry projects and deploy models (Contributor + Azure AI Developer + Cognitive Services Contributor, or equivalent).
Kick off the first /azp run ext-azure-ai-agents-live. It will be the first live validation of the keepAzSessionActive auth path (test and cleanup run inside AzureCLI@2 so the WIF az session survives the full multi-minute run).

The GitHub token for init is already wired via the ambient azuresdk-github-pat org secret ($(azuresdk-github-pat)), so no extra secret setup is needed.

Testing

The Tier 2 flow was validated end-to-end in the #8692 fork run (code ~669s, container ~711s).
This PR: the pipeline YAML parses; both Python files compile cleanly with py_compile; README.md passes cspell; the streaming / timeout / cleanup logic and the invoke-assertion regex in the Tier 2 driver were validated with standalone simulations.
Addressed multiple Copilot review passes, including: ambient-PAT secret handling, set -euo pipefail + arch-derived binary name, shell-escaping (shlex.quote) of all interpolated values, a watchdog-enforced hard timeout with live-streamed child output, tmux-server teardown on timeout, per-mode AZD_CONFIG_DIR cleanup, a destructive-rm -rf guardrail, bounded setup() subprocess calls, and a robust standalone-token check for the live invoke result (accepts 4 or four, ignores incidental 4s such as gpt-4o-mini / 4.1 / 404).
The ADO pipeline itself only runs once registered in ADO (inert in GitHub CI by design).
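
The standalone-token check for the invoke result can be illustrated with a small regex. This is a sketch under assumptions: the driver's real pattern is not shown in this PR text, and `reply_contains_four` is a hypothetical name. It only demonstrates the stated behaviour: accept `4` or `four` as a standalone token and ignore incidental 4s such as `gpt-4o-mini`, `4.1`, or `404`.

```python
import re

# A "4" or "four" that is not glued to word characters, hyphens, or a decimal part.
# The lookbehind rejects "gpt-4o", "404", and "1.4"; the lookahead rejects "4o", "40", and "4.1".
_FOUR_TOKEN = re.compile(r"(?<![\w.\-])(?:4|four)(?![\w\-]|\.\d)", re.IGNORECASE)


def reply_contains_four(text: str) -> bool:
    """Return True when the reply contains a standalone 4/four token."""
    return _FOUR_TOKEN.search(text) is not None
```

A trailing sentence period (as in "the answer is 4.") still matches, because the lookahead only rejects a dot that is followed by a digit.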

Adds eng/pipelines/ext-azure-ai-agents-live.yml, an on-demand/weekly Azure DevOps pipeline that drives the real 'azd ai agent' CLI through tmux against live Azure (TME), exercising init -> provision -> deploy -> invoke -> down for both code and container deploy modes. This is the live counterpart to the PR-gate checks (Tier 0 offline + Tier 1 recording/playback in #8754). Per Azure SDK EngSys / SFI guidance, live access stays out of the automatic PR pipeline (trigger: none) and runs only via '/azp run ext-azure-ai-agents-live' or the weekly schedule. The Tier 2 tmux driver (test_full_e2e.py, test_tier2.py) is migrated from the #8692 prototype; CI auth detection is extended to recognize Azure DevOps (TF_BUILD) and an explicit E2E_USE_AZ_CLI_AUTH override.

…zSessionActive

The azure-sdk-tests service connection uses Workload Identity Federation, whose az session is isolated to the task's private AZURE_CONFIG_DIR and expires after ~10 min. Running the ~50 min golden-path test (and the cleanup) as plain bash steps after a separate login step would fail auth on both counts. Run them inside AzureCLI@2 with keepAzSessionActive:true (matching build-cli.yml) so the session stays refreshed and reaches azd (auth.useAzCliAuth) through tmux, which inherits AZURE_CONFIG_DIR. Subscription/tenant are now read in-script via az account show instead of cross-step pipeline variables.

test_tier2.py always ran sequentially, but kept a tautological if-condition (len==1 or len>1), an unused concurrent.futures import, a no-op --serial flag, and a docstring/print claiming parallel execution. Simplify to an explicit sequential loop and update the docstring to match. Also fix test_full_e2e.py's module docstring to point at README.md (LOCAL-TEST-GUIDE.md does not exist).

Copilot

Pull request overview

This PR adds a dedicated Azure DevOps pipeline and supporting Python drivers/docs to run the Tier 2 live golden-path E2E for the azure.ai.agents azd extension (init → provision → deploy → invoke → down) against real Azure resources, outside of the GitHub PR gate.

Changes:

Added an on-demand + weekly ADO pipeline (ext-azure-ai-agents-live) to run Tier 2 live E2E for both code and container deploy modes.
Added a tmux-driven Python Tier 2 runner (test_tier2.py) and a full golden-path driver (test_full_e2e.py) adapted for ADO CI auth detection.
Added documentation for running the live Tier 2 tests locally and in CI.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File	Description
eng/pipelines/ext-azure-ai-agents-live.yml	New ADO pipeline wiring to build azd + extension, run Tier 2 live E2E under AzureCLI@2, and publish logs/cleanup.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py	Tier 2 orchestrator: runs code + container golden paths sequentially with isolation and timeout handling.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py	tmux-driven end-to-end golden path driver (init/provision/deploy/invoke/down) with CI/local auth switching.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/README.md	Docs for CI setup (ADO registration/service connection/secrets) and local WSL execution.

- Use the ambient azure-sdk org secret `azuresdk-github-pat` for GH_TOKEN instead of an empty `GitHubPat` placeholder variable (mirrors eval-waza.yml); removes a misleading masked variable and the need for admin PAT setup.
- Harden the AzureCLI@2 inline script: `set -euo pipefail` and assign-then-verify subscription/tenant so an `az account show` failure fails fast (a plain `export X=$(...)` would have masked the error from set -e).
- Reword the extension-install comment to be self-contained (it no longer inaccurately claims to mirror lint-ext-azure-ai-agents.yml).
- Clarify the test_full_e2e.py auth prerequisite: only local WSL runs leave auth.useAzCliAuth unset; CI auto-enables az CLI auth.
- Clear tmux scrollback after env setup so the exported GH token cannot leak into capture() output on failures/timeouts.
- _cleanup_leaked_resources now checks azd down's return code and reports failures instead of always printing "Cleanup complete".

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

- Stream child E2E output live with a watchdog-enforced hard timeout instead of buffering everything via capture_output
- Shell-escape the GitHub token (shlex.quote) before exporting in tmux
- Clean up the per-mode AZD_CONFIG_DIR temp copy unless E2E_KEEP_ARTIFACTS
- Use sha256 instead of md5 for the agent-name uniqueness suffix
- Derive the agent binary arch from uname -m instead of hard-coding amd64
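
To illustrate the first item, here is a minimal sketch of streaming a child's output while a watchdog enforces a hard timeout. It is not the code from test_tier2.py; `run_with_watchdog` and its return shape are hypothetical. It also assumes the child does not leave grandchildren holding the output pipe open, which would keep the read loop blocked after the kill.

```python
import subprocess
import sys
import threading


def run_with_watchdog(cmd, timeout_s):
    """Run cmd, echo its output line by line, and kill it after timeout_s seconds."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    timed_out = threading.Event()

    def _kill():
        # Only flag a timeout if the child is still running when the timer fires.
        if proc.poll() is None:
            timed_out.set()
            proc.kill()

    watchdog = threading.Timer(timeout_s, _kill)
    watchdog.daemon = True
    watchdog.start()
    lines = []
    try:
        # Iterating the pipe yields lines as the child writes them,
        # unlike capture_output, which returns only after the child exits.
        for line in proc.stdout:
            sys.stdout.write(line)
            sys.stdout.flush()
            lines.append(line)
        proc.wait()
    finally:
        watchdog.cancel()
    return proc.returncode, timed_out.is_set(), lines
```

Because the timer runs on a separate thread, the timeout fires even when the child produces no output at all, which a read-loop-only timeout check would miss.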

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

- Shell-escape HOME/PATH/TENANT, the cd target, and the agent name with shlex.quote() (consistent with the earlier token fix)
- On Tier 2 timeout, kill the child's detached tmux server so reused CI agents do not accumulate orphaned tmux sockets
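
The escaping in the first item matters because the driver types shell commands into a tmux pane, so any interpolated value is re-parsed by the shell. The sketch below shows the general idea with `shlex.quote`; the helper names `export_line` and `cd_line` are hypothetical and not taken from test_full_e2e.py.

```python
import shlex


def export_line(name: str, value: str) -> str:
    """Build an `export NAME=value` line whose value survives shell parsing."""
    return f"export {name}={shlex.quote(value)}"


def cd_line(path: str) -> str:
    """Build a `cd` line that is safe for paths with spaces or metacharacters."""
    return f"cd {shlex.quote(path)}"
```

For example, `cd_line("/tmp/my dir; rm -rf ~")` yields a single quoted argument, so the shell treats the whole string as one path rather than running a second command.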

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

… (Copilot round 7)

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

github-actions · 2026-06-22T14:41:25Z

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Review may take a bit longer — reach out to @rajeshkamal5050 or @kristenwomack if you'd like to discuss prioritization.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

…out pipe (Copilot round 9)

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

…um (Copilot round 10)

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

…ned python in CI (Copilot round 11)

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

… macro (Copilot round 12)

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

v1212 · 2026-06-22T16:27:22Z

/check-enforcer evaluate

v1212 added the ext-agents (azure.ai.agents extension) and ext-foundry (azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes}, microsoft.foundry) labels Jun 22, 2026

microsoft-github-policy-service Bot assigned v1212 Jun 22, 2026

Jian Wu added 2 commits June 22, 2026 19:54

v1212 marked this pull request as ready for review June 22, 2026 12:08

Copilot AI review requested due to automatic review settings June 22, 2026 12:08

v1212 requested review from JeffreyCA, RickWinter, danieljurek, glharper, tg-msft, therealjohn, trangevi, trrwilson and vhvb1989 as code owners June 22, 2026 12:08

Copilot started reviewing on behalf of v1212 June 22, 2026 12:09