How Width and Data Shape Generalization Scaling Laws in Quadratic Neural Networks
New research analyzes how model width and dataset size jointly affect generalization in quadratic neural networks, challenging existing scaling law assumptions in feature-learning regimes.
Understanding how performance scales jointly with model size and data is a central problem in modern machine learning. Existing theoretical works on scaling laws typically describe generalization as a function of data or compute, often in fixed-feature or infinite-width regimes and for online SGD. Here, we instead study how generalization scales with the number of trainable parameters and the number of samples in a feature-learning model. We analyze $\ell_2$-regularized empirical risk minimization in a quadratic two-layer network in a finite-sample setting with structured data. This sett
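To make the setup in the abstract concrete, here is a minimal sketch of $\ell_2$-regularized empirical risk minimization for a quadratic two-layer network. It is an illustration under stated assumptions, not the paper's method: the summary does not give the exact parameterization, loss, or optimizer, so the form $f(x) = \frac{1}{m}\sum_{k=1}^{m} (w_k \cdot x)^2$, the squared loss, the use of full-batch gradient descent, and the helper names `predict`, `objective`, `gradient`, and `fit` are all hypothetical choices.

```python
import numpy as np


def predict(W, X):
    """Assumed quadratic two-layer network: f(x) = (1/m) * sum_k (w_k . x)^2.

    W has shape (m, d) for width m; X has shape (n, d) for n samples.
    """
    m = W.shape[0]
    return ((X @ W.T) ** 2).sum(axis=1) / m


def objective(W, X, y, lam):
    """Empirical risk (half mean squared error) plus an l2 penalty on W."""
    residual = predict(W, X) - y
    return 0.5 * np.mean(residual ** 2) + 0.5 * lam * np.sum(W ** 2)


def gradient(W, X, y, lam):
    """Gradient of `objective` with respect to W, shape (m, d)."""
    m, n = W.shape[0], X.shape[0]
    Z = X @ W.T                      # pre-activations, shape (n, m)
    residual = predict(W, X) - y     # shape (n,)
    # d f(x_i) / d w_k = (2/m) * (w_k . x_i) * x_i, averaged over samples
    data_term = (2.0 / m) * ((Z * residual[:, None]).T @ X) / n
    return data_term + lam * W


def fit(X, y, m, lam=1e-3, lr=1e-2, steps=500, seed=0):
    """Full-batch gradient descent on the regularized empirical risk.

    This optimizer is an assumption made for illustration only.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(m, d)) / np.sqrt(d)
    for _ in range(steps):
        W = W - lr * gradient(W, X, y, lam)
    return W
```

In this sketch the width `m` sets the number of trainable parameters (`m * d`) and the number of rows of `X` sets the sample size, the two quantities whose joint effect on generalization the abstract describes. Calling `fit` over a grid of `m` and `n` values and evaluating `objective` with `lam=0` on held-out data would be one way to explore that dependence empirically.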
Source: How Width and Data Shape Generalization Scaling Laws in Quadratic Neural Networks. Read the full piece at the source.
Summary and analysis generated by AI (mistral). Always verify against the original sources.