TFR optimizations + MX support by NikhilRout · Pull Request #374 · vortexgpgpu/vortex

NikhilRout · 2026-06-19T07:46:20Z

added fpnew FP16/BF16 FEDP backend
fixed FPGA (fpnew related) makefiles

optimized TFR FEDP uarch by moving part of max_exp identification to stage 2 and avoiding two's complement with a sign-popcount trick in stage 3 accumulation
added dual sparse mask (DSM) generation module before TFR FEDP for dynamic power reduction via clock-gating
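One way to read the sign-popcount trick mentioned above, sketched as a bit-accurate Python model rather than the actual RTL: since -m equals ~m + 1 in two's complement, negative terms can be bitwise-inverted only, and the missing +1 corrections can be added once as the popcount of the sign bits. The function names and the 8-bit width are assumptions made for illustration; the PR's stage 3 datapath may differ.

```python
def sum_twos_complement(signs, mags, width):
    """Reference: negate each negative term individually (one +1 per term)."""
    mask = (1 << width) - 1
    total = 0
    for s, m in zip(signs, mags):
        term = ((~m + 1) & mask) if s else m
        total = (total + term) & mask
    return total


def sum_sign_popcount(signs, mags, width):
    """Invert negative terms only, then add popcount(signs) once at the end."""
    mask = (1 << width) - 1
    total = 0
    for s, m in zip(signs, mags):
        term = (~m & mask) if s else m  # inversion only, no per-term increment
        total = (total + term) & mask
    return (total + sum(signs)) & mask  # single correction: number of negatives


def to_signed(x, width):
    """Interpret a width-bit pattern as a two's complement integer."""
    return x - (1 << width) if x >> (width - 1) else x


# Hypothetical operands: +5 - 3 - 7 + 2
signs = [0, 1, 1, 0]
mags = [5, 3, 7, 2]
result = to_signed(sum_sign_popcount(signs, mags, 8), 8)
```

Both functions return the same bit pattern; the second form replaces one incrementer per negative term with a single small constant fed into the adder tree.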

added TCU format support for MXFP8, MXBF8, MXFP4, NVFP4, and MXINT8 in simx + rtl + regression test
added MX metadata handling through TCU_LD
split SP/MX TCU SRAM, scoreboard reg, and host runtime/include helpers
WMMA only for now (WGMMA wip)
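For readers unfamiliar with the MX formats listed above, here is a minimal Python sketch of how one MXFP4 block could be decoded, assuming the encoding follows the OCP Microscaling convention: FP4 E2M1 elements sharing one E8M0 scale byte with bias 127. This is not taken from the PR's simx or RTL code, the function names are hypothetical, and NVFP4 (which uses a different scale format) is not covered.

```python
# FP4 E2M1 magnitudes indexed by the low 3 bits (2 exponent bits, 1 mantissa bit).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_e8m0(scale_byte):
    """Shared scale: 2**(byte - 127); the all-ones encoding is NaN."""
    if scale_byte == 0xFF:
        return float("nan")
    return 2.0 ** (scale_byte - 127)


def decode_fp4_e2m1(nibble):
    """Bit 3 is the sign; the low 3 bits select the magnitude."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * FP4_E2M1_MAGNITUDES[nibble & 0x7]


def decode_mxfp4_block(scale_byte, nibbles):
    """Apply the shared block scale to every element of the block."""
    scale = decode_e8m0(scale_byte)
    return [scale * decode_fp4_e2m1(n) for n in nibbles]


# Scale byte 128 means 2**1; four example elements instead of a full block.
values = decode_mxfp4_block(128, [0x1, 0x7, 0xF, 0x0])
```

The shared scale is the per-block metadata; in this PR that metadata is described as being handled through TCU_LD.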

added ci/regression coverage for MX formats under tensor_mx()
fixed submodule-related actions failures by adding recursive submodule checkout/update and invalidating third_party cache when submodule pins change

tinebp

Can you please add synthesis and performance results for the MX benchmarks:
Synthesis: TCU, TCU+MX, TCU+SP+MX+WMMA with NT=NW=16
Test results with perf=1 for sgemm_tcu, sgemm_tcu_sp, sgemm_tcu_mx, sgemm_tcu_sp_mx with n=64, NT=8, NW=8 on simx and rtlsim; add the reports to /perf/tcu/
Include the full configuration string and the perf=1 output with each test result so that we can reproduce it.

tinebp · 2026-06-19T10:27:09Z

                            op_args.tcu.is_first_uop = 1'b0;
                            op_args.tcu.is_last_uop  = 1'b0;
-                            wr_xregs[XREG_0] = 1'b1;
+                            wr_xregs[rd[4] ? XREG_TCU_MX : XREG_TCU_SP] = 1'b1;


You need to generalize the special registers; we cannot afford to reserve them for the TCU only.
XREG_2, XREG_3

makes sense. went back to XREG_0 (for SP scoreboard bits) and used XREG_1 (for MX) now

tinebp · 2026-06-19T10:27:52Z

why changing this file?

added tensor_mx() tests

CI was keying the third_party cache on .gitmodules only, not on the pinned submodule SHAs. After a submodule version bump this restored stale third_party contents and caused build failures. The new hash makes the cache key depend on the submodule commit pins, and the forced recursive update ensures restored caches are corrected before the build.
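The cache-key idea can be illustrated with a small Python sketch. It is not the workflow change itself: the function name, the key prefix, and the submodule paths and SHAs below are hypothetical, and the pins are assumed to have been collected beforehand (for example from `git submodule status`).

```python
import hashlib


def third_party_cache_key(gitmodules_text, submodule_pins, prefix="third_party"):
    """Derive a cache key from .gitmodules plus every pinned submodule SHA."""
    h = hashlib.sha256()
    h.update(gitmodules_text.encode("utf-8"))
    # Sort by path so the key does not depend on dictionary ordering.
    for path in sorted(submodule_pins):
        h.update(f"{path}={submodule_pins[path]}\n".encode("utf-8"))
    return f"{prefix}-{h.hexdigest()[:16]}"


# Hypothetical .gitmodules content and pins, before and after a version bump.
gitmodules = '[submodule "third_party/example_a"]\n\tpath = third_party/example_a\n'
key_before = third_party_cache_key(gitmodules, {"third_party/example_a": "1111111"})
key_after = third_party_cache_key(gitmodules, {"third_party/example_a": "2222222"})
```

Because .gitmodules is unchanged by a pin bump, a key built from that file alone would be identical before and after; hashing the pins as well makes the two keys differ, so the stale cache is not restored.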

tinebp · 2026-06-19T10:28:18Z

why changing this file?

added tensor_mx() tests

CI was keying the third_party cache on .gitmodules only, not on the pinned submodule SHAs. After a submodule version bump this restored stale third_party contents and caused build failures. The new hash makes the cache key depend on the submodule commit pins, and the forced recursive update ensures restored caches are corrected before the build.

tinebp · 2026-06-19T10:28:47Z

        .tcu_lmem_if    (tcu_lmem_if),
    `endif
-    `ifdef VX_CFG_TCU_SPARSE_ENABLE
+    `ifdef VX_CFG_TCU_META_ENABLE


excellent idea!!!

tinebp · 2026-06-19T10:30:18Z

    parameter `STRING INSTANCE_ID = "",
    parameter LATENCY = 0,
-    parameter N = TCU_TC_K
+    parameter N = 1,


I was indicating that it could be a single-element FEDP (essentially an FMA), whereas TFR is a "genuinely fused" dot product with N=2, similar to how you handled this in the tcu_fedp unittest.

But using TCU_TC_K does improve readability when connecting FEDP backends to VX_tcu_core. Rolling back!
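The distinction between a single-element FEDP and a fused dot product with N=2 can be shown numerically. The Python sketch below is only a behavioural illustration, not the TFR or fpnew datapath: it assumes FP16 results, models "fused" as one rounding of the exact sum, and models the alternative as a chain of FMAs that round after every step. All names are hypothetical.

```python
import struct
from fractions import Fraction


def round_fp16(x):
    """Round to the nearest IEEE binary16 value via struct's 'e' format.
    The value passes through float64 first, which is exact for the inputs used here."""
    return struct.unpack("<e", struct.pack("<e", float(x)))[0]


def fused_dot(a, b, c):
    """c + sum(a[i] * b[i]) computed exactly, then rounded once."""
    exact = Fraction(c) + sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))
    return round_fp16(exact)


def chained_fma(a, b, c):
    """One FMA per element, rounding the accumulator after every step."""
    acc = c
    for x, y in zip(a, b):
        acc = round_fp16(Fraction(acc) + Fraction(x) * Fraction(y))
    return acc


# Each product is half an FP16 ulp of 1.0: lost one at a time, kept when fused.
a = [1.0, 1.0]
b = [2.0 ** -11, 2.0 ** -11]
fused = fused_dot(a, b, 1.0)      # 1 + 2**-10 is representable in FP16
chained = chained_fma(a, b, 1.0)  # each tie rounds back to 1.0
```

With a single element the two functions agree, which is the sense in which an N=1 FEDP is essentially an FMA; with N=2 they can differ, as the example values show.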

NikhilRout added 3 commits June 19, 2026 03:31

added fpnew fedp

e7e37a8

added mx datapath

f3477c5

tfr optimizations

e0dabfc

tinebp reviewed Jun 19, 2026


terminology fixes

84c1e56

NikhilRout marked this pull request as draft June 19, 2026 15:17

NikhilRout added 2 commits June 20, 2026 14:34

added sgemm_tcu_sp_mx + SF calc refactoring

1200ccc

added perf/tcu csvs

a55c9d2

NikhilRout wants to merge 6 commits into master from feature_mx


2 participants
