Skip to content

fix(dora): use two-phase deployment lookup to prevent first-deploymen…#8942

Open
bujjibabukatta wants to merge 2 commits into
apache:mainfrom
bujjibabukatta:fix/#8790
Open

fix(dora): use two-phase deployment lookup to prevent first-deploymen…#8942
bujjibabukatta wants to merge 2 commits into
apache:mainfrom
bujjibabukatta:fix/#8790

Conversation

@bujjibabukatta

Copy link
Copy Markdown
Contributor

Summary

Fixes the "wrong deployment reference" bug where PRs were incorrectly
associated with old deployments, causing massively skewed median lead
time for change metrics.

Root cause

batchFetchDeployments used only diff-based mapping, which
deliberately excludes the very first deployment in a collected window
(via prev_success_deployment_commit_id <> ''). This guard is correct
for the diff path — without it, the first deployment would absorb every
historical PR via commits_diffs. However, it also blocked a legitimate
case: when a deployment's own commit_sha directly equals a PR's
merge_commit_sha, that association is precise and safe, even for the
first deployment.

This caused the scenario reported in #8790: projects using the GitLab
connector to onboard historical deployments for the first time would see
the first deployment incorrectly pull in all old MRs.

Fix

Two-phase strategy in batchFetchDeployments:

Phase 1 — direct match (new):
Query all successful PRODUCTION deployments in the project, keyed by
commit_sha. If a deployment's commit_sha equals the PR's
merge_commit_sha, associate them. No commits_diffs join needed —
the match is exact, so there is no over-mapping risk.

Phase 2 — diff-based fallback (unchanged):
Retain the existing commits_diffs join with the
prev_success_deployment_commit_id <> '' guard. Only fills in PRs not
already resolved by Phase 1.

Test coverage

All existing e2e test cases in calculate_change_lead_time_test.go
pass without change. In the existing fixtures, PR merge_commit_sha
values are distinct from deployment commit_sha values, so Phase 1
produces no matches and Phase 2 runs identically to before — verifying
no regression.

The bug scenario (Phase 1 actually matching) is the real-world case
described in #8790 and #8188 and is not yet covered by a fixture test.
A follow-up test case with deployment.commit_sha == pr.merge_commit_sha
would be a good addition.

Related issues

Closes #8790
Related: #8188, #8249, #7193

@klesh klesh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks legit to me. Good work.
However, I think we should update the e2e test to include the exact matching case as well.

@bujjibabukatta

bujjibabukatta commented Jun 20, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the review, @klesh

Agreed i have added the e2e fixture in commit 4f14dee. It covers the Phase 1 direct-match scenario (deployment.commit_sha == pr.merge_commit_sha) that was the root cause in #8790, which wasn't exercised by the existing test data.

All existing cases still pass, and the new fixture confirms Phase 1 correctly associates the deployment without over-mapping. Let me know if you'd like any changes to the fixture structure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug][DORA-change_lead_time_calculator] Wrong deployment reference

2 participants