cache: elastic bank pipeline with configurable latency by tinebp · Pull Request #371 · vortexgpgpu/vortex

tinebp · 2026-06-18T22:09:44Z

Restructures the cache bank into an elastic pipeline with a configurable hit latency, and folds in the LLC AMO timing fix on top of the AMO write-back work.

What

VX_cache_bank reworked into an elastic (valid/ready) pipeline so per-stage latency is configurable rather than fixed.
AMO commit path timing tightened to close at the target frequency.
simx cache model and unittest updated to match; config/docs updated.
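For readers unfamiliar with the term, the valid/ready behaviour described in the list above can be sketched with a small behavioural model. This is an illustration only, not the VX_cache_bank RTL: `ElasticPipe`, `step`, and `run` are hypothetical names, and the model assumes one item per register stage with no skid buffering.

```python
from collections import deque


class ElasticPipe:
    """Toy valid/ready pipeline: `depth` register stages, one item per stage."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be >= 1")
        self.stages = [None] * depth  # None means the stage holds no valid item

    def step(self, in_item=None, out_ready=True):
        """Advance one cycle; return (accepted, out_item)."""
        out_item = None
        # Drain the last stage only when the consumer is ready.
        if self.stages[-1] is not None and out_ready:
            out_item = self.stages[-1]
            self.stages[-1] = None
        # Walk back to front so each item advances at most one stage per cycle,
        # and only into an empty stage (this is the back-pressure).
        for i in range(len(self.stages) - 2, -1, -1):
            if self.stages[i] is not None and self.stages[i + 1] is None:
                self.stages[i + 1] = self.stages[i]
                self.stages[i] = None
        # The input is accepted only when the first stage is free.
        accepted = False
        if in_item is not None and self.stages[0] is None:
            self.stages[0] = in_item
            accepted = True
        return accepted, out_item


def run(depth, items, stall_cycles=()):
    """Push `items` through a pipe of `depth` stages; return [(cycle, item), ...]."""
    pipe = ElasticPipe(depth)
    pending = deque(items)
    out = []
    cycle = 0
    while len(out) < len(items):
        head = pending[0] if pending else None
        accepted, item = pipe.step(head, out_ready=cycle not in stall_cycles)
        if accepted:
            pending.popleft()
        if item is not None:
            out.append((cycle, item))
        cycle += 1
    return out


if __name__ == "__main__":
    print(run(3, ["a", "b", "c"]))
    print(run(3, ["a", "b", "c"], stall_cycles={3, 4}))
```

In this model an item leaves the pipe `depth` cycles after it is accepted when nothing stalls, and a stalled consumer delays items without reordering them. That in-order property is what lets later accesses observe earlier writes without extra hazard logic.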

Validation

rtlsim + simx cache/AMO regressions pass.
Bank pipeline latency sweep validated functionally.

🤖 Generated with Claude Code

Add a configurable per-bank pipeline depth (LATENCY) to the cache bank so the whole data-array access can be deferred PIPE_EX=LATENCY-2 register stages past the tag/replacement/MSHR lookup. This breaks the tag-compare -> way -> data critical path that limited Fmax on deep (>64KB) caches, without store->load hazard logic: pipeline order is preserved so each access still sees prior writes. Tags/replacement/MSHR stay at S0/S1 (MSHR allocate->finalize must remain exactly 1 cycle apart).

VX_cache_bank is refactored to nested packed structs (req_t/lookup_t/data_t/commit_t) for compactness. LATENCY is threaded bank<-cache<-wrap<-cluster and wired at the L2/L3/D$ instantiation sites from new VX_CFG_{DCACHE,L2,L3}_LATENCY knobs; MREQ_SIZE becomes 2*LATENCY+... to stay a power of 2. SimX L2/L3 pipeline latency now tracks the same knobs.

AMO under the deep pipe: the commit ports are re-anchored to the deferred commit stage (stC) with a PIPE_EX-deep commit_busy bridge, and the LLC AMO response is computed in place via a byteen-derived byte mask instead of a >>bit_off / <<bit_off round-trip. This removes two full-word barrel shifts from the read->response path (the AMO=1 critical path).

DUT synth (L2 1MB 8-way, LATENCY=4) on U55C @300MHz: AMO=1 WNS -1.715 -> -0.740ns, AMO no longer the critical path. amo regression 12/12 PASS in rtlsim (matches SimX oracle).

Vortex.sv/VX_socket.sv also carry the kmu_arb bus_out_if part-select fix (needed for Vivado elaboration at socket/cluster>1).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
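The byteen-derived byte mask mentioned in the commit message can be illustrated with a small sketch. The RTL is not shown on this page, so everything below is an assumption made for illustration: the function names are hypothetical, the operation is a bitwise OR (chosen because bitwise operations do not depend on bit position), and the operand is assumed to arrive already aligned to its byte lanes. The sketch only shows why a per-byte mask can give the same result as shifting the field down, operating on it, and shifting it back.

```python
def byteen_to_bitmask(byteen, num_bytes):
    """Expand per-byte enables into a per-bit mask (enable i covers bits 8*i..8*i+7)."""
    mask = 0
    for i in range(num_bytes):
        if (byteen >> i) & 1:
            mask |= 0xFF << (8 * i)
    return mask


def amo_or_in_place(word, operand_in_place, byteen, num_bytes):
    """Bitwise-OR the enabled bytes only, without moving any data."""
    mask = byteen_to_bitmask(byteen, num_bytes)
    new_word = (word & ~mask) | ((word | operand_in_place) & mask)
    old_in_place = word & mask  # pre-update value, still sitting in its byte lanes
    return new_word, old_in_place


def amo_or_shift_round_trip(word, operand, bit_off, width_bits):
    """Reference version: shift the field down, operate, shift it back up."""
    field = (1 << width_bits) - 1
    old = (word >> bit_off) & field
    updated = (old | operand) & field
    new_word = (word & ~(field << bit_off)) | (updated << bit_off)
    return new_word, old


if __name__ == "__main__":
    word = 0x1122334455667788      # 64-bit word read from the data array
    operand = 0x0000FFFF           # 32-bit AMO operand
    bit_off = 32                   # operand targets the upper 32 bits
    byteen = 0xF0                  # byte enables for bytes 4..7

    a_new, a_old = amo_or_in_place(word, operand << bit_off, byteen, 8)
    b_new, b_old = amo_or_shift_round_trip(word, operand, bit_off, 32)
    print(hex(a_new), hex(b_new))
    print(hex(a_old >> bit_off), hex(b_old))
```

Both versions produce the same updated word and the same pre-update field; the in-place version needs only AND/OR with a mask. Arithmetic AMOs such as add would need more care than this sketch covers.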

tinebp and others added 2 commits June 18, 2026 15:09

Merge branch 'master' into cache-elastic-bank-pipeline

d3c3c29

tinebp merged commit 0bff367 into master Jun 18, 2026

tinebp merged 2 commits into master from cache-elastic-bank-pipeline
