Commit dc81cb3
fix: ensure non-zero learning rate during warmup at iteration 0
The warmup learning rate is now computed as (it + 1)/(warmup_iters + 1)
instead of it/warmup_iters. This gives a non-zero learning rate at iteration 0.
The warmup stays linear, and the formula still evaluates to the full learning
rate at it = warmup_iters; only the per-step increment is slightly smaller.
Fixes karpathy#4431
Parent: 9755682
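To make the effect of the change concrete, here is a minimal sketch comparing the two warmup formulas named in the commit message. The function names `warmup_lr_old` and `warmup_lr_new` and the `learning_rate` parameter are hypothetical and chosen for illustration; they are not taken from the changed file, whose surrounding code is not shown here.

```python
def warmup_lr_old(it: int, warmup_iters: int, learning_rate: float) -> float:
    """Previous formula: linear ramp that starts at exactly 0 when it == 0."""
    return learning_rate * it / warmup_iters


def warmup_lr_new(it: int, warmup_iters: int, learning_rate: float) -> float:
    """Updated formula: linear ramp shifted by one step, so it == 0 is non-zero."""
    return learning_rate * (it + 1) / (warmup_iters + 1)


if __name__ == "__main__":
    # Illustrative values only; these are assumptions, not the project's settings.
    warmup_iters = 100
    learning_rate = 6e-4
    for it in (0, 1, 50, 99, 100):
        old = warmup_lr_old(it, warmup_iters, learning_rate)
        new = warmup_lr_new(it, warmup_iters, learning_rate)
        print(f"it={it:3d}  old={old:.3e}  new={new:.3e}")
```

With the old formula the very first optimizer step uses a learning rate of zero, so it does not update the weights. The new formula starts at learning_rate/(warmup_iters + 1) and both formulas agree at it = warmup_iters.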
1 file changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 231 | 231 | |
| 232 | 232 | |
| 233 | 233 | |
| 234 | | - |
| | 234 | + |
| 235 | 235 | |
| 236 | 236 | |
| 237 | 237 | |