Neural Scaling Laws
I try to learn and explain neural scaling laws.
Simon Li
The work identifies four neural scaling regimes, formed by crossing two axes: the variable being scaled ($D$, the dataset size, or $P$, the number of model parameters) and the type of limit (variance-limited or resolution-limited).
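Putting the two axes together gives a 2×2 table (my own summary of the scalings detailed in the lists below; I write $\alpha_D$, $\alpha_P$ only to distinguish the two exponents). In the variance-limited column the other variable is held fixed; in the resolution-limited column it is effectively infinite.

| scaled variable | variance-limited | resolution-limited |
| --- | --- | --- |
| $D$ | loss $\propto 1/D$ | loss $\propto 1/D^{\alpha_D}$, $0 < \alpha_D < 1$ |
| $P$ | loss $\propto 1/\sqrt{P}$ (deep nets), $1/P$ (linear) | loss $\propto 1/P^{\alpha_P}$, $0 < \alpha_P < 1$ |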
The variance-limited regime is when
- one of $D$ or $P$ is held fixed
- we study the loss as the other parameter grows infinitely large
- the loss scales as $1 / x$, where
  - for deep networks, $x = D$ or $x = \sqrt{P} \propto$ width
  - for linear models, $x = D$ or $x = P$
The resolution-limited regime, in contrast, is when
- one of $D$ or $P$ is effectively infinite
- we study the loss as the other parameter increases
- the loss scales as $1 / x^\alpha$, with $0 < \alpha < 1$ for both $x = D$ and $x = P$
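To make the distinction concrete, here is a minimal sketch (my own, not from the paper) of how such exponents are typically estimated: fit the exponent $\alpha$ of a power law $L \propto 1/x^\alpha$ via the slope of a log-log plot. A variance-limited curve should recover $\alpha \approx 1$ and a resolution-limited one $\alpha < 1$. The constants and exponents below are made up purely for illustration.

```python
import numpy as np

def fit_exponent(x, loss):
    """Fit loss = c / x**alpha and return alpha from the log-log slope."""
    slope, _intercept = np.polyfit(np.log(x), np.log(loss), 1)
    return -slope  # loss ~ x**(-alpha), so the log-log slope is -alpha

x = np.logspace(2, 6, 20)  # e.g. dataset sizes D from 1e2 to 1e6

loss_variance_limited = 3.0 / x         # exponent 1: variance-limited
loss_resolution_limited = 3.0 / x**0.4  # exponent 0.4: resolution-limited

print(fit_exponent(x, loss_variance_limited))    # ~1.0
print(fit_exponent(x, loss_resolution_limited))  # ~0.4
```

In practice one would measure the loss at several values of $D$ (or $P$) and fit the same way; the log-log slope is the standard estimate of the scaling exponent.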