Robot Framework Selenium vs Playwright PoC#3974

Closed
jstvz wants to merge 12 commits into dev from worktree/robot-poc-comparison

jstvz commented Apr 28, 2026

Summary

  • Add locators_66.py and focused test updates for the API v66 Selenium locator refresh PoC.
  • Expand SalesforcePlaywright.py with 10 Playwright keywords plus an E2E comparison suite.
  • Add ADR 0004 and reproducible Selenium 4 shadow-DOM evidence recommending a time-bounded Playwright migration path.

Test plan

  • uv run pytest cumulusci/robotframework/tests/test_salesforce_locators.py -v
  • uv run cci task run robot --org robot-poc -o suites cumulusci/robotframework/tests/salesforce/locators.robot -o vars BROWSER:headlesschrome
  • uv run cci task run robot --org robot-poc -o suites cumulusci/robotframework/tests/salesforce/ui.robot -o test "Get Related List Count" -o vars BROWSER:headlesschrome
  • Selenium 11-suite battery: 101/102 passed; known remaining failure is Selenium 3 shadow-DOM access to List View Controls.
  • Page object suites: 29/34 passed; known failures are inline Setup/Object Manager locators outside the versioned locator system.
  • Playwright E2E execution is blocked by a pre-existing wait_until_salesforce_is_ready URL regex bug; ADR 0004 calls this out as Phase M1.

Notes

This is intentionally a draft PR for review of the PoC evidence and ADR direction before treating the migration recommendation as accepted.

jstvz added 12 commits April 27, 2026 10:58
Formal spec for the dual-track PoC comparing agent-driven Selenium
locator refresh vs. Playwright keyword port, recovering the Superpowers
loop after partial implementation without spec/review gates.

Bite-sized task plan for the dual-track PoC, organized into 9 tasks
covering Selenium locator refresh, page object fixes, durability audit,
skill spec, Playwright keyword port, E2E test, comparison, ADR, and
roadmap update.

Four locator overrides for API v66 DOM changes:
- actions: added slds-page-header fallback for LWC migration
- app_launcher.current_app: broadened to match h1.appName
- list_view_menu.button: CSS fallback with aria-label (shadow DOM limitation documented)
- record.related.count: multi-strategy OR for LWC related list containers
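
The override pattern in the list above can be sketched roughly as follows. This is a minimal illustration, not the contents of locators_66.py: the `BASE_LOCATORS` dictionary, the `override` helper, and every locator string are hypothetical stand-ins. Only the key names come from the commit message.

```python
import copy

# Hypothetical stand-in for the previous release's locator dictionary.
BASE_LOCATORS = {
    "app_launcher": {"current_app": "//div[contains(@class,'appName')]//span"},
    "record": {"related": {"count": "//article//span[@title='{}']/following-sibling::span"}},
}


def override(base, updates):
    """Return a deep copy of base with the nested keys in updates replaced."""
    merged = copy.deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = override(merged[key], value)
        else:
            merged[key] = value
    return merged


# A "multi-strategy OR" can be expressed as an XPath union of alternatives.
# Both XPath strings are illustrative, not the PR's actual locators.
related_count = " | ".join(
    [
        "//article//span[@title='{}']/following-sibling::span",
        "//hypothetical-lwc-related-list//span[@title='{}']/following-sibling::span",
    ]
)

LOCATORS_66 = override(
    BASE_LOCATORS,
    {
        "app_launcher": {"current_app": "//h1[contains(@class,'appName')]"},
        "record": {"related": {"count": related_count}},
    },
)
```

Because `override` works on a deep copy, the base dictionary stays untouched and the new version keeps every key the old one had, which is the property a superset test would check.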

Test assertion updates:
- TestLibraryA.py: breadcrumb locator uses generic link match
- locators.robot: Object Manager text instead of Mobile Publisher
- test_salesforce_locators.py: added test_locators_66 superset test

Results: 101/102 Selenium tests pass. 1 known failure (forms.robot
radiobutton) due to shadow DOM — Selenium 3 cannot pierce LWC shadow
DOM, documenting as structural ceiling.

Add app launcher, form, modal, and related list keywords to
SalesforcePlaywright.py using Playwright accessibility selectors.

End-to-end test exercising app launcher, form, modal, and record
verification via SalesforcePlaywright keywords.

Evidence-backed ADR from dual-track comparison PoC comparing
agent-maintained Selenium locator refresh against Playwright port.
Recommends dual-track approach (Option 3).

Cast viewport width/height to int in SalesforcePlaywright.open_test_browser.
Update ADR 0004 to note pre-existing wait_until_salesforce_is_ready bug
that prevents Playwright E2E execution.
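
The viewport cast mentioned above might look something like this. It is a sketch only: the helper name `viewport_from_size` and the `WIDTHxHEIGHT` input format are assumptions made for illustration, not the code in SalesforcePlaywright.open_test_browser.

```python
def viewport_from_size(size):
    """Parse a 'WIDTHxHEIGHT' string into a dict of integer dimensions."""
    width, height = size.lower().split("x", 1)
    # Values split out of a string stay strings until cast explicitly.
    return {"width": int(width), "height": int(height)}
```

For example, `viewport_from_size("1280x1024")` yields integer values under the `width` and `height` keys rather than the strings `"1280"` and `"1024"`.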
The test_locators_in_robot_context test hardcoded the expected locator
module name. When not running in a Robot context, the library falls
back to the highest-numbered locator file. Use dynamic discovery
(same pattern as test_locators_outside_robot_context) instead.
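
The "highest-numbered locator file" fallback described above could be discovered dynamically along these lines. This is a hedged sketch: the function name `latest_locator_module` and the choice to pass in a list of file names are assumptions, and the real test may discover modules differently.

```python
import re

# Matches versioned locator files such as "locators_66.py".
LOCATOR_FILE = re.compile(r"^locators_(\d+)\.py$")


def latest_locator_module(filenames):
    """Return the module name of the highest-numbered locator file."""
    versions = []
    for name in filenames:
        match = LOCATOR_FILE.match(name)
        if match:
            # Compare as integers so that 66 sorts above 9.
            versions.append(int(match.group(1)))
    if not versions:
        raise ValueError("no versioned locator files found")
    return f"locators_{max(versions)}"
```

A test written this way keeps passing when a new `locators_NN.py` file is added, because the expected name is computed instead of hardcoded.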
…ation

Replace dual-track recommendation with full Playwright migration plus
time-bounded Selenium deprecation period. Adds:

- Empirical Selenium 4 shadow DOM verification: 6-hop nesting, 452 shadow
  roots per page, 14-line traversal vs 1-line Playwright equivalent
- Effort comparison from PoC (~25x lines/min on tested surface, with
  sample-size caveat that harder keywords likely 3-5x)
- Per-release maintenance trajectory comparison (Selenium worsening,
  Playwright effectively flat)
- 5-phase migration path: unblock Playwright, port keywords, compat
  shim, downstream coordination, deprecation/removal
- Selenium 4 added as Option 4 with empirical evidence showing it does
  not solve the shadow DOM problem at scale
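
To make the hop-by-hop traversal cost concrete, here is a minimal sketch of the kind of loop a Selenium 4 test needs to reach an element behind nested shadow hosts. It is not the PR's verification script: `pierce_shadow`, `FakeNode`, and the host selectors are hypothetical, and `FakeNode` exists only so the example runs without a browser. The only Selenium 4 behavior relied on is that a WebElement exposes a `shadow_root` property whose result supports `find_element`.

```python
CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def pierce_shadow(root, host_selectors, target_selector):
    """Walk nested shadow hosts, then find the target in the innermost root."""
    context = root
    for selector in host_selectors:
        host = context.find_element(CSS, selector)
        context = host.shadow_root  # one explicit hop per shadow boundary
    return context.find_element(CSS, target_selector)


class FakeNode:
    """Offline stand-in for a driver, element, or shadow root (not Selenium)."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or {}
        self.shadow_root = self  # simplification: the fake is its own shadow root

    def find_element(self, by, value):
        return self.children[value]


# Two hypothetical hops; the ADR reports six on the element it measured.
TARGET = "button[aria-label='List View Controls']"
button = FakeNode("button")
inner = FakeNode("inner", {TARGET: button})
outer = FakeNode("outer", {"hypothetical-inner-host": inner})
page = FakeNode("page", {"hypothetical-outer-host": outer})

found = pierce_shadow(
    page, ["hypothetical-outer-host", "hypothetical-inner-host"], TARGET
)
```

Every host selector in the list is a name the test must keep in sync with the page. Playwright's CSS locators pierce open shadow roots by default, so the equivalent lookup there needs only the final selector.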

Verification scripts and findings stored under
docs/superpowers/evidence/2026-04-27-robot-poc/ for reproducibility.

Address all 4 [MAJOR] and 4 [MINOR] reviewer issues:

MAJOR:
- Move evidence files from gitignored docs/superpowers/evidence/ to
  docs/adrs/0004-evidence/ for external verifiability
- Surface "no Playwright E2E test passed during PoC" as Important caveat
  callout at the top of the Evidence section, plus reflected in Track B
  table and Risk consequences
- Tighten 25x framing: explicit rate-vs-ratio language, ~20x ratio on
  tested surface, defensible 3-5x estimate for full port with stated
  methodology
- Add single-sample caveat to Selenium 4 section: one element measured;
  qualitative architectural claim (LWC host names brittle as Aura)
  generalizes, specific 14-statement cost is a single data point

MINOR:
- Quantify Phase M4 downstream PR burden honestly: 6-12 month tail across
  NPSP/EDA/OFM/V4S, capacity is binding constraint
- Surface compatibility-shim vs runtime-translator cost asymmetry:
  shim is days/weeks, runtime translator is weeks; defer until usage data
- Trim methodology prose to one paragraph in active voice with named
  artifacts and links
- Fix frontmatter author syntax to "@jstvz" matching ADRs 0002, 0003
- Add Option 5 (reduce surface) and Option 6 (hybrid: new Playwright,
  freeze Selenium) with dismissal rationale

NIT:
- Spell out 14-statement math decomposition explicitly
- Cite selenium pin location (pyproject.toml lines 50, 54)
- Beef up References with source-file links (locators_66.py,
  SalesforcePlaywright.py, Salesforce.py, e2e_comparison.robot,
  pyproject.toml)
- Clarify "452 shadow roots" framing: count of hosts, not unreachable
  elements

Plus: TL;DR at top per reviewer's optional improvement.

Round 2 review APPROVED with 3 residual NITs and 3 optional improvements.
Address all six in one tight cleanup:

NITs:
- Align findings.md to use "statements" not "lines" for the ~14 figure
  (matches body of ADR after round 1 wording change)
- Add (per single-element measurement; see body) caveat to TL;DR's 14x
  verbosity claim
- Sketch the 3-5x calibration: lower bound assumes harder keywords are
  Python-control-flow-dominated and roughly equal cost in either lib;
  upper bound assumes Playwright still wins on selector strategy and
  auto-wait

Optional:
- Phase M2 commits to a follow-up ADR for realized numbers, addressing
  the round 1 immutability concern more durably
- Track A page object row clarifies "(34 cases across 4 page-object
  suites)"
- Pull no-E2E-test-passed into its own Risk bullet (was nested inside
  sample-size-bias risk) for parity with top-of-evidence callout
- Tighten findings.md cost-ratio paragraph to honour the single-sample
  caveat (was "anything depth-bound pays this 14x verbosity tax";
  now scoped to the measured element with explicit
  did-not-measure-distribution disclosure)

These spec/plan artifacts are internal working notes and should not be
included in the public PR. ADR 0004 now carries the durable evidence and
links to committed ADR evidence files instead.

jstvz commented Apr 28, 2026

Closing temporarily to rewrite branch history and remove internal docs/superpowers artifacts from the pushed commit history.

jstvz closed this Apr 28, 2026

jstvz commented Apr 28, 2026

Reopened after rewriting branch history from dev. The branch now contains only the public PoC/ADR artifacts and no docs/superpowers files in the final diff or branch history.
