docs: address residual NITs from ADR 0004 round 2 review

jstvz · jstvz · commit 7d43d6725da3 · 2026-04-28T01:26:14.000-07:00
Round 2 review APPROVED with 3 residual NITs and 3 optional improvements.
Address them all in one tight cleanup:

NITs:
- Align findings.md to use "statements" not "lines" for the ~14 figure
  (matches body of ADR after round 1 wording change)
- Add "(per single-element measurement; see body)" caveat to TL;DR's 14x
  verbosity claim
- Sketch the 3-5x calibration: lower bound assumes harder keywords are
  Python-control-flow-dominated and roughly equal cost in either lib;
  upper bound assumes Playwright still wins on selector strategy and
  auto-wait

Optional:
- Phase M2 commits to a follow-up ADR for realized numbers, addressing
  the round 1 immutability concern more durably
- Track A page object row clarifies "(34 cases across 4 page-object
  suites)"
- Pull no-E2E-test-passed into its own Risk bullet (was nested inside
  sample-size-bias risk) for parity with top-of-evidence callout
- Tighten findings.md cost-ratio paragraph to honour the single-sample
  caveat (was "anything depth-bound pays this 14x verbosity tax";
  now scoped to the measured element with explicit
  did-not-measure-distribution disclosure)
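
Illustration of the ~14 figure from the first NIT (a sketch, not code from
the repo). It walks the six shadow hosts listed in findings.md with the
Selenium 4 `shadow_root` property and tallies the accesses against a stub.
`click_through_shadow_hosts`, `CountingNode`, `count_accesses`, and the
`"button"` leaf selector are hypothetical names chosen for this example; the
real leaf selector is not stated in the ADR.

```python
# Host chain copied from findings.md, reordered outermost first.
HOST_CHAIN = [
    "lst-object-home",
    "lst-list-view-manager",
    "lst-common-list-internal",
    "lst-list-view-manager-header",
    "lst-list-view-manager-settings-menu",
    "lightning-button-menu",
]
CSS = "css selector"  # string value of selenium's By.CSS_SELECTOR


def click_through_shadow_hosts(driver, hosts=HOST_CHAIN, leaf_css="button"):
    """Enter each shadow host in turn, then click the leaf element.

    `leaf_css` is an assumed selector for illustration only.
    """
    scope = driver
    for host in hosts:
        # One find_element plus one shadow_root access per boundary.
        scope = scope.find_element(CSS, host).shadow_root
    # Final lookup inside the innermost shadow root, then the action.
    scope.find_element(CSS, leaf_css).click()


class CountingNode:
    """Stand-in for WebDriver/WebElement/ShadowRoot that only tallies calls."""

    def __init__(self, tally=None):
        if tally is None:
            tally = {"find_element": 0, "shadow_root": 0, "click": 0}
        self.tally = tally

    def find_element(self, by, value):
        self.tally["find_element"] += 1
        return CountingNode(self.tally)

    @property
    def shadow_root(self):
        self.tally["shadow_root"] += 1
        return CountingNode(self.tally)

    def click(self):
        self.tally["click"] += 1


def count_accesses():
    """Run the traversal against the stub and return the call tally."""
    stub = CountingNode()
    click_through_shadow_hosts(stub)
    return stub.tally
```

The loop folds the statements into fewer lines, but the traversal still
performs 7 `find_element` calls, 6 `shadow_root` accesses, and 1 action, and
each host name in `HOST_CHAIN` remains an LWC-internal detail. Against a real
browser, the same function would take a Selenium 4 `WebDriver` in place of
the stub.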
diff --git a/docs/adrs/0004-evidence/findings.md b/docs/adrs/0004-evidence/findings.md
@@ -20,7 +20,7 @@
 | Shadow boundary depth from button to `<body>`                  | **6 hops**                                                                                                                                                                  |
 | Host chain (button → outer)                                    | `lightning-button-menu` → `lst-list-view-manager-settings-menu` → `lst-list-view-manager-header` → `lst-common-list-internal` → `lst-list-view-manager` → `lst-object-home` |
 | Outermost host (`lst-object-home`) findable in light DOM?      | **Yes (1 match)**                                                                                                                                                           |
-| Selenium 4 chained `shadow_root` traversal feasible?           | Yes, but requires **7 `find_element` + 6 `shadow_root` accesses** (~14 lines of Python per element)                                                                         |
+| Selenium 4 chained `shadow_root` traversal feasible?           | Yes, but requires **7 `find_element` + 6 `shadow_root` accesses + 1 final action = ~14 statements** of Python per element                                                   |
 | Playwright equivalent                                          | `page.get_by_role("button", name="List View Controls").click()` — 1 line, auto-pierces all 6 boundaries                                                                     |
 
 ## Implications
@@ -32,13 +32,13 @@
 
 ## Cost ratio for shadow-DOM-bound elements
 
-| Path                                                | Lines per element | Stability of selectors                  | Per-version maintenance        |
-| --------------------------------------------------- | ----------------- | --------------------------------------- | ------------------------------ |
-| Selenium 3 (current)                                | N/A — unfixable   | N/A                                     | Growing failures every release |
-| Selenium 4 chained traversal                        | ~14               | LWC-internal names (brittle, like Aura) | Per-release rewrites likely    |
-| Playwright (`get_by_role`, `text=`, `data-testid=`) | 1                 | ARIA / SLDS public contract             | Near-zero                      |
+| Path                                                | Statements per element | Stability of selectors                  | Per-version maintenance        |
+| --------------------------------------------------- | ---------------------- | --------------------------------------- | ------------------------------ |
+| Selenium 3 (current)                                | N/A — unfixable        | N/A                                     | Growing failures every release |
+| Selenium 4 chained traversal                        | ~14                    | LWC-internal names (brittle, like Aura) | Per-release rewrites likely    |
+| Playwright (`get_by_role`, `text=`, `data-testid=`) | 1                      | ARIA / SLDS public contract             | Near-zero                      |
 
-For the Account list view alone, with 452 shadow roots, anything that depends on a shadow-DOM-bound element pays this 14× verbosity tax under Selenium 4.
+For this single measured element, the Selenium 4 path is ~14× more verbose than the Playwright path. The 452 shadow-host count above describes how heavily Lightning is LWC-componentized on this page; it is a count of shadow hosts, not of unreachable elements. Other shadow-DOM-bound elements in the test suite may be 1–2 hops shallow or 6+ hops deep — we did not measure the distribution.
 
 ## Reproducibility
 
diff --git a/docs/adrs/0004-robot-framework-selenium-vs-playwright.md b/docs/adrs/0004-robot-framework-selenium-vs-playwright.md
@@ -8,7 +8,7 @@ author: "@jstvz"
 
 ## TL;DR
 
-Migrate CumulusCI's Robot Framework browser-test infrastructure from Selenium 3 to `robotframework-browser` (Playwright) over a time-bounded deprecation period (Phases M1–M5, expected 6–12 months for the downstream-coordination tail). Selenium 4 is no longer required. The `sf:` locator prefix and `Salesforce.robot` resource remain available during deprecation; downstream consumers (NPSP, EDA, OFM, V4S) migrate on their own schedule with tooling support. Selenium per-release maintenance is tractable today (4 locator overrides bridged 10 API versions) but its shadow-DOM trajectory is structurally bad and Selenium 4 only shifts the brittleness from Aura to LWC at ~14× the verbosity.
+Migrate CumulusCI's Robot Framework browser-test infrastructure from Selenium 3 to `robotframework-browser` (Playwright) over a time-bounded deprecation period (Phases M1–M5, expected 6–12 months for the downstream-coordination tail). Selenium 4 is no longer required. The `sf:` locator prefix and `Salesforce.robot` resource remain available during deprecation; downstream consumers (NPSP, EDA, OFM, V4S) migrate on their own schedule with tooling support. Selenium per-release maintenance is tractable today (4 locator overrides bridged 10 API versions) but its shadow-DOM trajectory is structurally bad and Selenium 4 only shifts the brittleness from Aura to LWC at ~14× the verbosity (per single-element measurement; see body).
 
 ## Context and Problem Statement
 
@@ -49,7 +49,7 @@ We need to decide CumulusCI's Robot Framework path: continue Selenium with agent
 | Versioned locator overrides added ([`locators_66.py`](../../cumulusci/robotframework/locators_66.py)) | **4** (`actions`, `app_launcher.current_app`, `list_view_menu.button`, `record.related.count`)                                                                                                                                                                   |
 | Selenium test pass rate (11 suites)                                                                   | **101 / 102**                                                                                                                                                                                                                                                    |
 | Unfixable failure                                                                                     | **1** — `forms.robot::radiobutton` (List View Controls in shadow DOM)                                                                                                                                                                                            |
-| Page object pass rate (4 suites)                                                                      | **29 / 34**                                                                                                                                                                                                                                                      |
+| Page object pass rate                                                                                 | **29 / 34** (34 cases across 4 page-object suites)                                                                                                                                                                                                               |
 | Page object failures                                                                                  | **5** — inline locators in [`ObjectManagerPageObject.py`](../../cumulusci/robotframework/pageobjects/ObjectManagerPageObject.py) (Save button changed `<input>`→`<button>`, sidebar link text changed). These are inline locators outside the versioning system. |
 | Locator durability audit                                                                              | 14 / 41 (34%) reference Aura internals (`uiModal`, `oneActionsRibbon`, `forceFormPageError`, `force_relatedListContainer`); 14 / 41 (34%) use SLDS-stable references; 13 / 41 (32%) ARIA.                                                                        |
 | Surface area                                                                                          | 41 versioned locators, ~37 keywords in `Salesforce.py`, page objects, form_handlers dispatch — full system to maintain.                                                                                                                                          |
@@ -98,7 +98,7 @@ All numbers below are from direct observation; they give an accurate sense of sc
 1. **The two rates are not directly comparable.** Track A's lines are mostly XPath fragments and test fixtures debugged against a live org. Track B's lines are mostly keyword bodies and docstrings written from scratch. Lines-per-minute is a rough engineering proxy, not a precise metric.
 2. **Sample-size bias.** The 10 ported keywords are the easier surface (modals, app launcher, simple form fill). The harder surface — page object model, [`form_handlers.py`](../../cumulusci/robotframework/form_handlers.py) dispatch table, label-locator strategy, related-list popups — was not ported. Those keywords contain non-trivial Python logic that doesn't get cheaper just because the underlying browser library changed.
 
-A defensible engineering estimate for the **full port** is **3–5× faster than equivalent Selenium maintenance**, not 20×. This is not a measured number; it is a calibrated estimate based on the complexity of the unported surface. Phase M2 will produce real per-keyword data that should refine this estimate (and we will publish a follow-up ADR with realized numbers — see Consequences).
+A defensible engineering estimate for the **full port** is **3–5× faster than equivalent Selenium maintenance**, not 20×. This is not a measured number; it is a calibrated estimate based on the complexity of the unported surface. The lower bound (3×) assumes the harder keywords are dominated by Python control flow (page object model, `form_handlers` dispatch, label strategy) that costs roughly the same in either library; the upper bound (5×) assumes Playwright still wins on selector strategy and auto-wait even where the Python logic is comparable. Phase M2 will produce real per-keyword data that should refine this estimate (and we will publish a follow-up ADR with realized numbers — see Consequences).
 
 ### Per-release maintenance trajectory
 
@@ -200,6 +200,7 @@ Phased plan with realistic durations.
 -   Port the remaining ~27 keywords from `Salesforce.py` (the harder surface: page object model, `form_handlers` dispatch, label strategy, related-list popups, performance keywords)
 -   Each PR ports a coherent group, includes a Playwright-side test, and is reviewed independently
 -   Track per-keyword effort to refine the 3–5× estimate
+-   Outcomes (realized full-port cost, any unforeseen complexity) will be published as a follow-up ADR rather than amending this one — ADRs are durable records by design
 
 **Phase M3 — Compatibility surface for downstream (~2–4 weeks for shim; runtime translator deferred unless data warrants)**
 
@@ -235,7 +236,8 @@ Phased plan with realistic durations.
 -   **Negative:** Significant one-time migration cost — ~27 remaining keywords, downstream coordination, compatibility shim or migration tooling.
 -   **Negative:** Breaking change for downstream consumers, even with a deprecation window. Coordination effort with NPSP, EDA, OFM, V4S maintainers measured in months, not weeks.
 -   **Negative:** Adds Node.js + Playwright binary dependency for users who run browser tests.
--   **Risk:** Sample-size bias — the 10-keyword PoC tested the easy surface and no Playwright E2E test passed end-to-end. The harder keywords (page object model, form_handlers) and the runtime infrastructure may surface unforeseen complexity. **Mitigation:** Phase M1 establishes runtime validity before Phase M2 commits to the full port; Phase M2 is structured as ~5 independent PRs, each with its own validation, so the cost gets revealed incrementally rather than as a big-bang surprise.
+-   **Risk:** No Playwright end-to-end test passed during the PoC. The 10 ported keywords were validated by static review and selector-strategy analysis only; runtime correctness rests on Phase M1's regex bug fix unlocking E2E execution. **Mitigation:** Phase M1 is small, well-scoped, and runs first; if it surfaces deeper infrastructure problems, the migration plan can be revisited before committing to Phase M2.
+-   **Risk:** Sample-size bias — the 10-keyword PoC tested the easy surface. The harder keywords (page object model, form_handlers, label strategy, related-list popups) may surface unforeseen complexity in runtime behaviour, not just selector strategy. **Mitigation:** Phase M2 is structured as ~5 independent PRs, each with its own validation, so the cost gets revealed incrementally rather than as a big-bang surprise.
 -   **Risk:** The 3–5× full-port efficiency estimate is calibrated, not measured. **Mitigation:** Phase M2 produces real per-keyword data; we will publish a follow-up ADR (or a supersedes-style update) with the realized numbers once the port is complete.
 -   **Risk:** Downstream maintainers may not have capacity to migrate within the deprecation window. **Mitigation:** long deprecation window (≥1 major version + the realized M4 tail), `SalesforceCompat.robot` shim, automated migration tooling, willingness to extend the EOL version if needed.
 -   **Risk:** Selenium 4 measurement was a single sample on a single element. **Mitigation:** the qualitative architectural claim (LWC host-chain names are implementation details and as brittle as Aura) generalizes from the host-chain composition, not from the specific 14-statement cost; the decision does not hinge on the precise number.