feat(ci3): run uploadable benchmarks on a dedicated on-demand instance (v5-next port) by charlielye · Pull Request #24255 · AztecProtocol/aztec-packages

charlielye · 2026-06-23T18:45:31Z

Ports #24028 (merged to next as 19de9f1551) to the v5 release line.

Clean cherry-pick of the squashed next commit — no conflicts. Brings the dedicated-bench work to v5-next:

- Uploadable benchmark runs execute on a dedicated, fixed-type on-demand instance for stable numbers, decoupled from the now-variable (spot-diversified) build hardware.
- A single `BENCH_UPLOAD` flag drives both the dedicated-box launch and the GA publish gate. Grind runs (merge-queue-heavy) fire exactly one dedicated box (first instance) while the rest bench inline as a breakage check, which de-races the shared `bench-<treehash>` upload key.
- First-class `make bench` target (incl. `bb-acir`, so the bb browser-memory bench has its headless-test harness).
- `bench_engine` keeps HT-off + bottom-half scheduling (restored after the dedicated-box change), so timing-sensitive benches don't suffer sibling-thread interference.
- `bench/next` vs `bench/prs` destinations preserved.
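
The grind-run gating described above can be illustrated with a minimal sketch. It assumes a hypothetical `bench_mode` helper that takes the `BENCH_UPLOAD` value and a 1-based grind instance index; neither the helper nor the mode names it prints come from the repository, and the real scripts may well set the flag differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide where benchmarks run for one CI instance.
# BENCH_UPLOAD is the flag named in this PR; the instance-index argument and
# the "dedicated"/"inline" labels are illustrative assumptions.
bench_mode() {
  local bench_upload="${1:-0}" instance_index="${2:-1}"
  if [[ "$bench_upload" == "1" && "$instance_index" == "1" ]]; then
    echo dedicated   # launch the fixed on-demand box and publish its results
  else
    echo inline      # bench on the build box as a breakage check, no upload
  fi
}

# A three-instance grind run: only the first instance gets the dedicated box.
for i in 1 2 3; do
  echo "grind instance $i: $(bench_mode 1 "$i")"
done
```

Because only one instance ever resolves to `dedicated`, only one writer targets the shared `bench-<treehash>` key in this model.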

See #24028 for full review history.

> [!IMPORTANT]
> Depends on the IAM change aztec-labs-eng/iac#6 (grants `ci3-build-instance-role` the launch/SSM/PassRole surface). **That must apply first**, else the build instance's `create-fleet` hits `UnauthorizedOperation`.

## Problem

Spot diversification (create-fleet) means build instances now land on variable EC2 types: m6a/m7a/m6i/r6a/r7a at 16/32/48xlarge, AMD vs Intel. The in-build benchmark phase runs on that box, so wall-time numbers vary by hardware family far more than the 105% regression alert threshold → false regressions. (The instance type isn't even recorded in the bench JSON.)

## Approach

Only the canonical **merge-queue→next** series (the one used for real regression tracking) runs benches on a **dedicated, fixed, on-demand m6a.16xlarge**. PR `ci-full` runs keep running benches inline on the contended build box purely as a **breakage check**: no dedicated box, no upload.

Benches are scheduled by the existing test engine: when the build completes in `build_and_test` (full builds only),

- **upload runs** (`SHOULD_UPLOAD_BENCHMARKS=1`): launch the dedicated box via `./ci.sh bench` as a backgrounded, colored, denoised job (logged like the test engine) and `wait` on it (non-fatal) before returning;
- **otherwise**: `bench_cmds >> $test_cmds_file`, so benches become ordinary test commands.

`ci.sh bench` → `bootstrap_ec2` blocks until the remote `ci-bench` finishes (ending in `cache_upload bench-<treehash>`), so the `wait` is the whole rendezvous. Results reach the GA `Upload benchmarks` step unchanged via that cache key (`ci3_success.sh` `gh-bench`).

## Changes

- **`bootstrap.sh`**: drop inline `bench` from `ci-full`/`ci-full-no-test-cache`; add the `build_and_test` launch/append hook + non-fatal `wait`; new `ci-bench` mode = cache-hit `make full` + `bench` (no test engine).
- **`ci.sh`**: new `bench` launcher: `AWS_INSTANCE=m6a.16xlarge NO_SPOT=1` (pins a fixed on-demand type; `CPUS` not needed since `AWS_INSTANCE` bypasses pool sizing).
- **`ci3/bench_engine`**: drop the 8-core OS isolation / HT-disable / pinning. Dedicated box → benches use the full machine, honouring per-bench `CPUS` via the strict scheduler (defaults to `nproc/2` without `BENCH_CPU_COUNT`). This is what lets the 64-vCPU 16xlarge satisfy the `CPUS=32` bb rollup bench.
- **`.github/ci3_labels_to_env.sh`**: scope `SHOULD_UPLOAD_BENCHMARKS` to merge-queue→next (it now also gates the dedicated box).
- **`ci3/bootstrap_ec2`**: pass it through to the instance.

## Notes

- **One-time baseline shift** in `bench/next`: different machine + no isolation changes absolute numbers once; stable thereafter. May want to annotate the series.
- **Soft failure**: a bench-box failure is logged and the run proceeds (no fresh numbers) rather than blocking the merge.
- **PR benches-as-tests**: `:PARALLEL=0` serial benches lose one-at-a-time isolation and run contended, which is fine for breakage-only; real numbers come from the dedicated box's `bench_engine` path.
- Validated: all touched scripts pass `bash -n`; the `AWS_INSTANCE`+`NO_SPOT` fixed-on-demand launch mechanism was verified live during the create-fleet work. Full e2e is exercised by a merge-queue→next run once the iac PR lands.
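
The launch/append hook and its non-fatal `wait` can be sketched roughly as follows. This is a simplified illustration, not the code in `bootstrap.sh`: `schedule_benches`, `launch_dedicated_bench`, the `FAKE_BENCH_EXIT` knob and the stubbed `bench_cmds` output are all hypothetical stand-ins, and the real hook runs `./ci.sh bench` with its own logging.

```shell
#!/usr/bin/env bash
# Stand-in stubs (assumptions for illustration only).
# launch_dedicated_bench represents the blocking `./ci.sh bench` call;
# FAKE_BENCH_EXIT lets the sketch simulate a failing bench box.
launch_dedicated_bench() {
  echo "launching dedicated bench box"
  return "${FAKE_BENCH_EXIT:-0}"
}
bench_cmds() { printf '%s\n' "bench-cmd-a" "bench-cmd-b"; }

# Hypothetical hook, called once the full build has completed.
schedule_benches() {
  local test_cmds_file="$1"
  if [[ "${SHOULD_UPLOAD_BENCHMARKS:-0}" == "1" ]]; then
    # Upload run: start the dedicated box in the background...
    launch_dedicated_bench &
    local bench_pid=$!
    # ...and rendezvous on it. A failure is logged, never propagated.
    if ! wait "$bench_pid"; then
      echo "WARN: dedicated bench run failed; continuing without fresh numbers" >&2
    fi
  else
    # Breakage-check run: benches become ordinary test commands.
    bench_cmds >> "$test_cmds_file"
  fi
}
```

Wrapping `wait` in `if !` keeps a failed bench box from tripping `set -e` or changing the hook's exit status, which matches the soft-failure behaviour described in the notes.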

charlielye wants to merge 1 commit into `v5-next` from `ci3-dedicated-bench-v5`
