feat(uipath-governance): path to green — operate tests, diagnose structure, description verbs by t-hsia · Pull Request #1651 · UiPath/skills

t-hsia · 2026-06-23T18:53:18Z

Summary

Addresses coverage gaps for Governance (uip gov) on the Coding Agents Scorecard, following the same pattern as the admin path-to-green (#1578).

Current scores: Build 50%, Operate 30%, Diagnose 15%

Changes

SKILL.md updates:

Added Operate: and Diagnose: verb phrases to description (856 chars, under 1024 limit)
New ### Troubleshoot section in "When to Use" with 6 trigger patterns
3 new Task Navigation rows for diagnose references

references/diagnose/ directory (3 files, follows maestro-bpmn/admin pattern):

CAPABILITY.md — structured capability index for diagnose mode (8 triggers, 6 rules, 8 common tasks)
references/failure-modes.md — 6 named failure patterns: policy not taking effect, wrong policy applied, access policy too broad/narrow, deployed policy empty, policy create rejected
references/troubleshooting-guide.md — 5-step diagnostic priority ladder

New mode:operate smoke tests (3 files — previously zero!):

aops_deploy_tenant_smoke — deploy policy to tenant
aops_deployed_policy_query_smoke — query effective deployed policy
access_evaluate_smoke — evaluate access policy rules

New mode:diagnose smoke tests (2 files):

aops_diagnose_policy_not_applied_smoke — investigate policy not taking effect
access_diagnose_blocked_invocation_smoke — diagnose blocked tool invocation

Test distribution after changes: 17 total — 5 build, 3 operate, 9 diagnose

Scorecard impact (expected)

| Metric | Before | After (expected) |
| --- | --- | --- |
| Operate tests | 0 | 3 |
| Diagnose tests | 7 | 9 |
| Operate Product Coverage | 30% | ~45% (deploy, query deployed, evaluate) |
| Diagnose Product Coverage | 15% | ~35% (diagnose/ structure + failure modes) |

Not in scope

Source Control, CI/CD Pipelines, Feed Management — no uip CLI surface exists
These product capabilities create a structural ceiling regardless of skill maturity

Test plan

All 17 governance test YAMLs parse as valid YAML
All tests have uipath-governance skill tag + mode:* tag
New smoke tests validate correct CLI command shapes
hooks/validate-skill-descriptions.sh passes
diagnose/ reference links resolve to existing files
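The tag check in the test plan can be sketched as follows. This is a minimal illustration, not the repo's actual validation hook: the function name `tag_problems`, the sample task `broken_example`, and the assumption that `mode:build` is the third valid mode tag are all hypothetical. It works on tag lists that have already been parsed out of the task YAMLs.

```python
import re

REQUIRED_SKILL_TAG = "uipath-governance"
# Assumed set of modes: operate and diagnose appear in this PR, build is inferred.
MODE_TAG = re.compile(r"^mode:(build|operate|diagnose)$")


def tag_problems(tags):
    """Return a list of human-readable problems for one task's tag list."""
    problems = []
    if REQUIRED_SKILL_TAG not in tags:
        problems.append(f"missing skill tag {REQUIRED_SKILL_TAG!r}")
    if not any(MODE_TAG.match(tag) for tag in tags):
        problems.append("missing mode:* tag")
    return problems


# Hypothetical tag lists, as they might look after parsing two task YAMLs.
tasks = {
    "aops_deploy_tenant_smoke": ["uipath-governance", "smoke", "mode:operate"],
    "broken_example": ["smoke"],
}

for name, tags in tasks.items():
    print(name, tag_problems(tags) or "OK")
```

An empty list means the task carries both the skill tag and a mode tag; anything else names what is missing.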

🤖 Generated with Claude Code

…green

Add operate and diagnose verbs to SKILL.md description, a Troubleshoot subsection under "When to Use", diagnose reference navigation rows, and the full references/diagnose/ directory (CAPABILITY.md, failure-modes.md, troubleshooting-guide.md). Add 3 mode:operate smoke tests (deploy tenant, query deployed policy, evaluate access policy) and 2 mode:diagnose smoke tests (policy not applied, blocked invocation). Targets Build/Operate/Diagnose scorecard improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-23T18:53:35Z

Claude finished @t-hsia's task in 3m 10s

PR Review: feat(uipath-governance): path to green

Read review criteria and project conventions
Read full diff and all changed files
Validate skill structure and SKILL.md changes
Validate diagnose reference files
Validate test task YAMLs
Check links, CODEOWNERS, cross-references
Post final review

Summary

Adds operate and diagnose coverage for the governance skill: 3 new operate smoke tests, 2 new diagnose smoke tests, a references/diagnose/ directory with capability index + failure modes + troubleshooting guide, description verb additions, and a new Troubleshoot section in SKILL.md. Well-structured PR that follows the established path-to-green pattern.

Change-by-Change Review

1. `skills/uipath-governance/SKILL.md` — description update + Troubleshoot section

Severity: OK

Description updated from ~690 chars to ~833 chars — well under the 1024 limit. The Operate: and Diagnose: verb phrases are compact and front-load key actions. The new ### Troubleshoot section under "When to Use" adds 6 clear trigger patterns that cover the diagnostic surface. Three new Reference Navigation rows point to the new diagnose files with correct relative paths (from SKILL.md's location in the skill root, ./references/diagnose/... resolves correctly).

2. `skills/uipath-governance/references/diagnose/CAPABILITY.md` — capability index

Severity: OK

Clean capability index: 8 "When to use" triggers, 6 critical rules, 8 common tasks with deep-links, strong anti-patterns section. Cross-capability links (../aops-policy/..., ../access-policy/..., ../disambiguation-guide.md) all resolve correctly from references/diagnose/. The ../../SKILL.md back-link is correct. Follows the pattern established by other capability indexes in the repo.

3. `skills/uipath-governance/references/diagnose/references/failure-modes.md` — 6 failure patterns

Severity: High — broken cross-capability links

The failure mode content is excellent — 6 well-structured patterns with symptom/cause/investigation/fix. Every CLI command includes --output json. However, all 6 cross-capability links are broken. The file is at references/diagnose/references/failure-modes.md (two levels below references/), but the links use ../ (one level up = diagnose/) instead of ../../ (two levels up = references/).

Broken links (all need ../ → ../../):

Line 65: ../aops-policy/aops-policy-deploy-guide.md → resolves to diagnose/aops-policy/... (doesn't exist)
Line 102: ../access-policy/plugins/actor/impl.md → resolves to diagnose/access-policy/... (doesn't exist)
Line 135: ../access-policy/plugins/tags/planning.md → same issue
Line 135: ../access-policy/plugins/selector/impl.md → same issue
Line 167: ../aops-policy/aops-policy-deploy-guide.md → same issue
Line 200: ../aops-policy/configure-aops-policy-data-guide.md → same issue
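The off-by-one in directory depth can be reproduced with a short path-resolution sketch. It uses Python's standard `posixpath` module; the helper name `resolve` is hypothetical, and paths are taken relative to the skill root rather than the repo root.

```python
import posixpath

# Path of the file that contains the links, relative to the skill root.
SOURCE = "references/diagnose/references/failure-modes.md"


def resolve(link, source=SOURCE):
    """Resolve a relative Markdown link against the directory of `source`."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(source), link))


# One level up lands inside diagnose/, which has no aops-policy/ directory.
print(resolve("../aops-policy/aops-policy-deploy-guide.md"))
# Two levels up lands in references/, next to the sibling capability folders.
print(resolve("../../aops-policy/aops-policy-deploy-guide.md"))
```

The first call resolves to `references/diagnose/aops-policy/aops-policy-deploy-guide.md`, the second to `references/aops-policy/aops-policy-deploy-guide.md`.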


4. `skills/uipath-governance/references/diagnose/references/troubleshooting-guide.md` — diagnostic ladder

Severity: OK

5-step sequential triage workflow. Cross-references at lines 19 and 100 correctly use ../../ (two levels up). The step-by-step structure with interpret/verify sections is exactly what an agent needs. Clean and complete.

5. `tests/tasks/uipath-governance/aops-policy/aops_deploy_tenant_smoke.yaml`

Severity: Low — complex regex, otherwise good

Tags correct (uipath-governance first, smoke, mode:operate). Prompt is goal-oriented with clear auth-error guidance. The command_pattern on line 43 uses multiple lookaheads ((?=.*--tenant-name\s)(?=.*--input\s)(?=.*--output\s+json)) which is correct but fragile — if the agent puts flags before the tenant ID, the .*00000000-0000-0000-0000-000000000099.* prefix won't match. This is acceptable for a smoke test since the natural ordering puts the positional arg first, but worth noting.
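The ordering concern can be illustrated with two hypothetical patterns. Neither is the regex from the task YAML: `strict` and `relaxed` are simplified stand-ins, and the flag values (`Demo`, `policy.json`) are made up. The sketch only shows that lookaheads are order-independent while a literal positional prefix is not.

```python
import re

GUID = "00000000-0000-0000-0000-000000000099"

# Position-sensitive: the GUID must directly follow the verb.
strict = re.compile(
    rf"deployment tenant configure\s+{GUID}"
    r"(?=.*--tenant-name\s)(?=.*--input\s)(?=.*--output\s+json)"
)
# Order-independent: the GUID is checked by a lookahead like the flags.
relaxed = re.compile(
    rf"deployment tenant configure\b(?=.*{GUID})"
    r"(?=.*--tenant-name\s)(?=.*--input\s)(?=.*--output\s+json)"
)

# Illustrative command strings; flag values are invented for the example.
positional_first = (
    f"uip gov aops-policy deployment tenant configure {GUID} "
    "--tenant-name Demo --input policy.json --output json"
)
flags_first = (
    "uip gov aops-policy deployment tenant configure "
    f"--tenant-name Demo --input policy.json --output json {GUID}"
)

for label, command in [("positional first", positional_first), ("flags first", flags_first)]:
    print(label, bool(strict.search(command)), bool(relaxed.search(command)))
```

With these inputs, `strict` accepts only the positional-first command, while `relaxed` accepts both orderings.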

6. `tests/tasks/uipath-governance/aops-policy/aops_deployed_policy_query_smoke.yaml`

Severity: OK

Clean smoke test. command_pattern validates the three positional args in order (NoLicense.*AITrustLayer.*00000000-...) plus --output json. Single primary criterion with correct weight (2.5). Tags correct.

7. `tests/tasks/uipath-governance/aops-policy/aops_diagnose_policy_not_applied_smoke.yaml`

Severity: OK

Good diagnostic smoke test. Checks two commands: deployed-policy get|list (weight 2.5) and deployment group get|list (weight 2.0). The | alternation in patterns is a nice touch — allows flexibility in how the agent approaches diagnosis. Tags correctly include mode:diagnose.
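A minimal sketch of how alternation lets one criterion accept either subcommand. The pattern below is a hypothetical simplification, not the one in the task file, and the `delete` command string is deliberately invalid to show a rejection.

```python
import re

# Hypothetical criterion: accept `get` or `list`, and require JSON output.
pattern = re.compile(r"deployed-policy\s+(get|list)\b.*--output\s+json")

commands = [
    "uip gov aops-policy deployed-policy get NoLicense AITrustLayer "
    "00000000-0000-0000-0000-000000000099 --output json",
    "uip gov aops-policy deployed-policy list --output json",
    # Invented subcommand, used only to show a non-matching case.
    "uip gov aops-policy deployed-policy delete --output json",
]

for command in commands:
    print(bool(pattern.search(command)), command)
```

Either diagnostic route satisfies the criterion, so the test does not force the agent down one specific path.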

8. `tests/tasks/uipath-governance/access-policy/access_evaluate_smoke.yaml`

Severity: OK

Validates access-policy evaluate with both --resource-type and --actor-process-type. Weights well-distributed (2.5 primary, 1.5 supporting, 1.0 file). Tags correct.

9. `tests/tasks/uipath-governance/access-policy/access_diagnose_blocked_invocation_smoke.yaml`

Severity: OK

Validates the diagnose workflow: evaluate + list. Good that it tests two commands (evaluate for the symptom, list for candidate policies). Tags correct.

What's Missing

No sandbox: or agent: overrides in any test YAML — correct; all inherit from tests/experiments/default.yaml.
No env_packages — correct per test-writing rules (CLI installed globally by runner).
Missing passing-run claims — the test-writing rules say to "run the task with coder-eval and add a passing-run claim to the PR." None of the 5 new tasks have passing-run claims. This is flagged by lint as High, though the PR description notes this is about scorecard coverage and the tests validate command shapes (which can't pass without a live tenant). Medium — add passing-run claims or note why they're deferred.

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | 833 chars, under 1024 limit. Operate/Diagnose verbs added cleanly. |
| E2E Tests | OK | Pre-existing e2e tests cover build lifecycle. New tests are all smoke (correct for operate/diagnose command-shape validation). |
| Skill Body | OK | Troubleshoot section well-placed. Reference Navigation updated. |
| References & Assets | Issue | 6 broken relative links in `failure-modes.md` (`../` should be `../../`). |
| Repo Hygiene | OK | CODEOWNERS covers both `/skills/uipath-governance/` and `/tests/tasks/uipath-governance/`. No secrets. Changes scoped to governance skill. |

Issues for Manual Review

CLI command accuracy: The deployment tenant configure verb (in aops_deploy_tenant_smoke.yaml) and the --tenant-name / --input flags should be verified against the actual CLI. The reviewer should confirm these flags exist on the current @uipath/cli release.
access-policy evaluate flag names: Verify --resource-type, --resource-id, --actor-process-type, --actor-process-id are the actual flag names (not --resourceType etc.).
Failure mode fix references: Lines 102 and 135 of failure-modes.md reference plugins/actor/impl.md, plugins/tags/planning.md, and plugins/selector/impl.md — these exist in the repo but the content should be verified to contain the referenced anchors (#deny-to-allow-flip, Robot intent section).

Conclusion

Solid path-to-green PR that follows established patterns well. One High issue blocks merge: the 6 broken relative links in failure-modes.md where ../ needs to be ../../. This is a straightforward fix (all instances are the same off-by-one in directory depth). Everything else — SKILL.md changes, CAPABILITY.md, troubleshooting guide, and all 5 test YAMLs — looks clean and well-structured.

Recommendation: Request changes — fix the broken links in failure-modes.md, then this is ready to merge.

github-actions · 2026-06-23T18:53:35Z

Claude finished @t-hsia's task in 3m 42s

Coder-eval task lint (advisory)

5 task YAMLs changed; per-task verdicts: 0 Critical, 0 High, 1 Medium, 0 Low, 4 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

❌ High — PR body does not claim the changed tasks have been run and passed. The test plan section contains unchecked - [ ] items describing what to verify, but no assertion that any task was actually executed. Please edit the PR description to add a line like:

Ran <task-id> locally and it passed.
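As a rough illustration of the distinction the lint draws, the sketch below counts unchecked checklist items and looks for a claim line in a PR body. The regexes and the function name `passing_run_status` are assumptions made for this example; the real rubric in `.claude/commands/lint-task.md` may work differently.

```python
import re

# An unchecked Markdown task-list item, e.g. "- [ ] something to verify".
UNCHECKED = re.compile(r"^\s*- \[ \] ", re.MULTILINE)
# A line shaped like the suggested claim; the exact wording is assumed.
CLAIM = re.compile(r"^Ran \S+ locally and it passed\.?$", re.MULTILINE | re.IGNORECASE)


def passing_run_status(pr_body):
    """Summarize whether a PR body asserts a passing run."""
    return {
        "unchecked_items": len(UNCHECKED.findall(pr_body)),
        "has_claim": bool(CLAIM.search(pr_body)),
    }


body = (
    "Test plan\n"
    "- [ ] All 17 governance test YAMLs parse as valid YAML\n"
    "- [ ] hooks/validate-skill-descriptions.sh passes\n"
)
print(passing_run_status(body))
print(passing_run_status(body + "Ran access_evaluate_smoke locally and it passed.\n"))
```

Unchecked items describe what should be verified; only the added claim line asserts that a run actually happened.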

Per-task lint

`tests/tasks/uipath-governance/access-policy/access_diagnose_blocked_invocation_smoke.yaml` — verdict: OK

No issues. Goal-oriented prompt ("diagnose why an agent is being blocked") without prescribing access-policy evaluate or access-policy list. Two distinct command_executed criteria validate a multi-step diagnostic workflow.

`tests/tasks/uipath-governance/access-policy/access_evaluate_smoke.yaml` — verdict: OK

No issues. Single-command operate test with two complementary command_executed checks (--resource-type + --actor-process-type). Prompt states the goal without naming the CLI subcommand.

`tests/tasks/uipath-governance/aops-policy/aops_deploy_tenant_smoke.yaml` — verdict: OK

No issues. Primary criterion uses lookaheads to verify tenant ID, --tenant-name, --input, and --output json in a single pattern — strong shape validation for a smoke test.

`tests/tasks/uipath-governance/aops-policy/aops_deployed_policy_query_smoke.yaml` — verdict: Medium

Issues:

[Medium] Near-duplicate: existing deployed_policy_smoke.yaml (same folder) already tests deployed-policy get with three positional args + --output json. This task uses different fixture values (NoLicense/AITrustLayer vs Attended/StudioX) and tags mode:operate instead of mode:diagnose, but the CLI operation under test is structurally identical. The existing task additionally validates deployed-policy list, making it a superset.

Suggested fixes:

Consider whether the mode:operate tag alone justifies a separate task. If yes, consider strengthening this task by adding a criterion the existing diagnose task doesn't cover (e.g., checking that the agent does NOT also run deployed-policy list — since an operate query should be targeted, not exploratory). Otherwise, consider adding mode:operate as a second tag to the existing deployed_policy_smoke.yaml and dropping this file.

`tests/tasks/uipath-governance/aops-policy/aops_diagnose_policy_not_applied_smoke.yaml` — verdict: OK

No issues. Tests a two-step diagnostic workflow (deployed-policy get|list + deployment group get|list) that is materially distinct from all existing siblings. Good use of alternation in command_pattern to accept both get and list.

Within-PR duplicates

[Medium] Cluster: access_evaluate_smoke.yaml + access_diagnose_blocked_invocation_smoke.yaml — both share an identical primary command_executed criterion (uip gov access-policy evaluate ... --resource-type ... --output json, weight 2.5, same regex). Each adds marginal coverage: the operate task checks --actor-process-type, the diagnose task checks access-policy list. Not interchangeable, but an agent that passes one is likely to pass the shared criterion of the other. Consider differentiating the evaluate regex (e.g., the diagnose task could drop the --resource-type assertion on evaluate and rely on it only for the list step).
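One way to surface such clusters is to group tasks by identical (pattern, weight) criteria. This is a hedged sketch, not the lint's implementation: the regexes are simplified, the 2.0 weight on the list criterion is invented, and `shared_criteria` is a hypothetical helper.

```python
from collections import defaultdict


def shared_criteria(tasks):
    """Map each (pattern, weight) criterion to the tasks that use it,
    keeping only criteria that appear in more than one task."""
    users = defaultdict(list)
    for task_name, criteria in tasks.items():
        for pattern, weight in criteria:
            users[(pattern, weight)].append(task_name)
    return {key: names for key, names in users.items() if len(names) > 1}


# Hypothetical, simplified criteria: (regex, weight) pairs per task.
EVALUATE = r"access-policy\s+evaluate.*--resource-type.*--output\s+json"
tasks = {
    "access_evaluate_smoke": [(EVALUATE, 2.5), (r"--actor-process-type", 1.5)],
    "access_diagnose_blocked_invocation_smoke": [
        (EVALUATE, 2.5),
        (r"access-policy\s+list", 2.0),
    ],
}

for (pattern, weight), names in shared_criteria(tasks).items():
    print(f"weight {weight}: {pattern!r} shared by {names}")
```

Only the evaluate criterion is reported, which matches the observation that each task still adds one criterion the other lacks.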

Conclusion

⚠ 1 task has a per-task issue (max severity Medium), plus 1 within-PR cluster (Medium) and a missing passing-run claim (High). Advisory only — not blocking merge. The biggest action item is adding a passing-run claim to the PR description.

Note: CLI verb reachability checks (scripts/check-cli-verbs.py) could not be run due to sandbox permissions. Manually inspected all command_pattern values — all reference uip gov access-policy or uip gov aops-policy subcommands consistent with the existing test suite.

…prompt consistency in path-to-green PR

- failure-modes.md: fix `template get "<TEMPLATE_ID>"` placeholder to `"<PRODUCT_NAME>"` matching the actual CLI positional argument
- aops_deployed_policy_query_smoke: change lifecycle:setup to lifecycle:discover (querying deployed policy is read-only)
- access_evaluate_smoke: change lifecycle:setup to lifecycle:discover (evaluate is a PDP query, not a mutation)
- aops_diagnose_policy_not_applied_smoke: fix prompt saying "Studio policy" when the scenario uses product AITrustLayer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ID requirement

Agent may not place the tenant GUID as a positional arg before the flags. Relax regex to check for `deployment tenant configure` with --input and --output json, without requiring exact GUID position.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

grant-tannert · 2026-06-23T22:21:53Z

+
+**Symptom:** Admin deployed a policy to a tenant (or group), but a specific user still sees the old behavior or no policy at all.
+
+**Causes:**


I would add a scenario where the user has multiple group policies and the one with the lower priority number wins instead of the intended one.

The solution is to lower the priority number of the new policy (which makes it more important).

grant-tannert · 2026-06-23T22:30:53Z

+Compare the assignments at each level for the target product:
+- If a user override exists (including explicit `null`), it wins over group and tenant.
+- If a group override exists, it wins over tenant for members of that group.
+- A `null` override means explicit "No Policy" — it is NOT the same as "no assignment" (which would inherit from the next level).


Similar to the above, I would add a step above this line to also compare group policy priorities, where the lower priority number wins.
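The precedence rules quoted above, together with the suggested group-priority comparison, can be sketched as a small resolver. This is an illustrative model only, not the product's implementation: the `UNSET` sentinel, the function name, the policy names, and the shape of the `groups` argument are all assumptions.

```python
UNSET = object()  # sentinel: no assignment at this level, so inherit


def effective_policy(user=UNSET, groups=(), tenant=UNSET):
    """Resolve the policy for one product.

    `user` and `tenant` are a policy name, None (explicit "No Policy"),
    or UNSET. `groups` is a sequence of (priority, policy) pairs for the
    groups the user belongs to; the lowest priority number wins.
    """
    if user is not UNSET:
        return user  # user override wins, even when it is None
    if groups:
        _, policy = min(groups, key=lambda pair: pair[0])
        return policy
    if tenant is not UNSET:
        return tenant
    return None  # simplification: nothing assigned anywhere


# Two group policies: priority 10 beats priority 20.
print(effective_policy(groups=[(20, "new-policy"), (10, "old-policy")], tenant="tenant-default"))
# Explicit null at user level overrides group and tenant.
print(effective_policy(user=None, groups=[(10, "old-policy")], tenant="tenant-default"))
# No user or group assignment: inherit from the tenant.
print(effective_policy(tenant="tenant-default"))
```

The first call shows the reviewer's scenario: the newly deployed policy loses to an older group policy with a lower priority number until its own number is lowered.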

grant-tannert

approved with comments

IsabellaCapriottiUIPath · 2026-06-23T22:50:15Z

+1. Selector rule too broad — matches resources beyond the intended scope
+2. Actor rule excludes the calling user/group
+3. Executable rule blocks the actor process type
+4. Tags on the resource trigger a deny-intent policy


Nitpick: "Tags on the resource or executable trigger a deny-intent policy"

…back

Addresses grant-tannert's review comments:

- failure-modes.md: add cause #3 for multi-group scenario where lower priority number wins; update fix guidance
- troubleshooting-guide.md: add group priority comparison step to precedence checking (Step 3)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

t-hsia requested review from a team, IsabellaCapriottiUIPath, ZuerWang99, bansal-anushree, jianjunwang2, litheon and sriramva-uipath as code owners June 23, 2026 18:53

t-hsia and others added 2 commits June 23, 2026 12:04

grant-tannert reviewed Jun 23, 2026

grant-tannert approved these changes Jun 23, 2026

IsabellaCapriottiUIPath reviewed Jun 23, 2026

Comment thread skills/uipath-governance/references/diagnose/references/failure-modes.md

IsabellaCapriottiUIPath reviewed Jun 23, 2026

Comment thread skills/uipath-governance/references/diagnose/references/failure-modes.md

IsabellaCapriottiUIPath reviewed Jun 23, 2026

t-hsia and others added 2 commits June 23, 2026 15:55

fix(uipath-governance): tags on resource or executable per review

62d1abf

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

t-hsia merged commit dff6585 into main Jun 23, 2026
14 checks passed

t-hsia deleted the feat/governance-path-to-green branch June 23, 2026 23:09


t-hsia merged 5 commits into main from feat/governance-path-to-green

3 participants

