4 links tagged with all of: deep-learning + neural-networks
Links
The article explores the ongoing experiment of scaling deep neural networks, examining how increased parameters, data, and compute affect their learning and performance. It discusses the lack of a mature theoretical framework for understanding these dynamics and introduces the concept of "quanta" as a way to analyze neural scaling. The author reflects on a recent model they developed, considering its implications and limitations.
This article introduces Delta-Delta Learning (DDL), which enhances standard residual networks by applying a rank-1 transformation to the hidden state matrix. The Delta-Res block update combines the removal of old information with the addition of new data, controlled by a gate. Key components include a reflection direction, a value vector, and a gate parameter.
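The erase-then-write update described above can be sketched as a delta-rule-style rank-1 modification of the state matrix. This is a minimal illustration under assumptions, not the paper's actual code: the function name `delta_res_update` and the exact form `S(I - beta*k*k^T) + beta*v*k^T` are inferred from the summary's description of a reflection direction, value vector, and gate.

```python
import numpy as np

def delta_res_update(S, k, v, beta):
    """One hypothetical Delta-Res style update (illustrative sketch).

    S    : (d_v, d_k) hidden state matrix
    k    : (d_k,) reflection/key direction (normalized below)
    v    : (d_v,) new value vector to write
    beta : scalar gate controlling how much is erased and written
    """
    k = k / np.linalg.norm(k)                 # unit-norm reflection direction
    erase = S - beta * np.outer(S @ k, k)     # remove old information stored along k
    write = beta * np.outer(v, k)             # add the new value along k
    return erase + write
```

With `beta = 1`, reading the state back along `k` (i.e. `S_new @ k`) returns exactly the new value `v`, which is the "remove old, add new" behavior the gate interpolates.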
IDInit is a novel initialization method for neural networks that maintains identity transitions within layers, enhancing convergence, stability, and performance during training. By employing a padded identity-like matrix and addressing issues like dead neurons, IDInit offers a straightforward yet effective approach applicable to various deep learning models and large-scale datasets.
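A padded identity-like matrix can be sketched roughly as follows. This is an assumption-laden illustration of the general idea (tiling the identity pattern so non-square layers still pass inputs through), not the method's actual initialization scheme; the name `identity_like_init` is invented here.

```python
import numpy as np

def identity_like_init(fan_out, fan_in):
    """Illustrative identity-preserving weight init (hypothetical sketch).

    For square layers this is the identity matrix; for non-square layers
    the identity pattern is repeated so every row passes through one input
    dimension, keeping the layer close to an identity transition.
    """
    W = np.zeros((fan_out, fan_in))
    for i in range(fan_out):
        W[i, i % fan_in] = 1.0    # tile the identity pattern down the rows
    return W
```

For a square layer this reduces to `np.eye(n)`, so the layer initially maps each input straight through, which is the property the method aims to preserve at the start of training.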
Efficient backpropagation (BP) is a fundamental technique in deep learning, first introduced by Seppo Linnainmaa in 1970, building on earlier concepts by Henry J. Kelley in 1960 and others. Despite these origins, BP faced skepticism for decades before gaining acceptance as a viable method for efficiently training deep neural networks and optimizing complex models. The article traces the historical development of BP and addresses misconceptions about its invention and application to neural networks.