Releases: Mirmix/nanoGPT-Attention-Routing

Weights, training curves, comparison result

05 Aug 07:38

This release includes the weights of the baseline and routing models discussed in the README_ROUTING documentation.