Neural networks, deep learning papers
Feedforward Neural Networks (FNN)
Convolutional Neural Networks (CNN)
- One of the earliest papers on convolutional nets - [Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position](https://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf) (1980) K. Fukushima
- A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects (2020) Zewen Li, Wenjie Yang, Shouheng Peng, Fan Liu
- Flexible, High Performance Convolutional Neural Networks for Image Classification (2011) Dan C. Ciresan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber
Recurrent Neural Networks (RNN)
Unsupervised
- Competitive learning
- Autoencoders
- Modular learning in neural networks (1987) D.H. Ballard
- Extracting and composing robust features with denoising autoencoders (2008) P. Vincent, H. Larochelle, Y. Bengio, P.A. Manzagol
- From Deep Learning book - Autoencoders (ch. 14) (2016) Ian Goodfellow, Yoshua Bengio, Aaron Courville
- An Introduction to Variational Autoencoders (2019) Diederik P. Kingma, Max Welling
- Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (2011) S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio
- Deep AutoRegressive Networks (2014) Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra
- Denoising Autoencoders
- VAE Variational autoencoders
- SOM Self-organizing maps
- Cresceptron (Max-Pooling layers)
Generative Adversarial Networks (GAN)
Bayesian Neural Networks (BNN)
- A Practical Bayesian Framework for Backpropagation Networks (1992) David J. C. MacKay
- Bayesian Learning for Neural Networks (1995) R.M. Neal
- Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks (1995) David J. C. MacKay
- Practical Variational Inference for Neural Networks (2011) Alex Graves
- Weight Uncertainty in Neural Networks (2015) Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2016) Y. Gal, Z. Ghahramani
- Stochastic Gradient Descent as Approximate Bayesian Inference (2017) S. Mandt, M.D. Hoffman, D.M. Blei
- Deep Neural Networks as Gaussian Processes (2018) Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein
- Noisy Natural Gradient as Variational Inference (2018) Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse
- Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam (2018) Mohammad Emtiyaz Khan, Didrik Nielsen, Voot Tangkaratt, Wu Lin, Yarin Gal, Akash Srivastava
- Understanding Priors in Bayesian Neural Networks at the Unit Level (2019) Mariia Vladimirova, Jakob Verbeek, Pablo Mesejo, Julyan Arbel
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization (2020) Andrew Gordon Wilson, Pavel Izmailov
Weightless Neural Networks (WNN)
- Based on Random Access Memory (RAM) nodes
- Advances in Weightless Neural Systems (2014) F.M.G. França, M. De Gregorio, P.M.V. Lima, W.R. de Oliveira
- WiSARD
- PLN Probabilistic Logic Nodes
- GSN Goal Seeking Neurons
- GRAM
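The RAM-node idea behind WiSARD can be sketched in a few lines: the binary input is split into small tuples, each tuple addresses one RAM node, training writes the observed addresses, and classification counts how many RAM nodes recognize the input. This is a toy illustration only (the `Discriminator` class and its set-based RAMs are my own simplification; real weightless systems use bit-array memories and one discriminator per class):

```python
import random

class Discriminator:
    """Minimal WiSARD-style discriminator (illustrative sketch).

    The binary input is split into n-bit tuples via a fixed pseudo-random
    mapping; each tuple addresses one RAM node (here a Python set of seen
    addresses). Training stores addresses; scoring counts matching RAMs.
    """
    def __init__(self, input_bits, tuple_size, seed=0):
        rng = random.Random(seed)
        positions = list(range(input_bits))
        rng.shuffle(positions)  # fixed random input-to-RAM wiring
        self.tuples = [positions[i:i + tuple_size]
                       for i in range(0, input_bits, tuple_size)]
        self.rams = [set() for _ in self.tuples]

    def _addresses(self, bits):
        for tup, ram in zip(self.tuples, self.rams):
            yield ram, tuple(bits[p] for p in tup)

    def train(self, bits):
        for ram, addr in self._addresses(bits):
            ram.add(addr)

    def score(self, bits):
        # number of RAM nodes that have already seen this address
        return sum(1 for ram, addr in self._addresses(bits) if addr in ram)
```

In a full classifier one discriminator is trained per class and the class with the highest score wins.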
Activation functions
- Sigmoid
- HardSigmoid
- SiLU, dSiLU
- Tanh, HardTanh
- Softmax
- Softplus
- Softsign
- ReLU Rectified Linear Unit
- LReLU Leaky ReLU
- PReLU Parametric ReLU
- RReLU Randomized ReLU
- SReLU
- ELU
- PELU
- SELU
- Maxout
- Mish
- Swish
- ELiSH
- HardELiSH
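A few of the activations listed above can be sketched in plain Python (scalar versions for clarity; framework implementations are vectorized):

```python
import math

def sigmoid(x):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # ReLU: max(0, x)
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # LReLU: small slope alpha for negative inputs keeps gradients alive
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation toward -alpha for negative inputs
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def silu(x):
    # SiLU (Swish with beta = 1): x * sigmoid(x)
    return x * sigmoid(x)

def softmax(logits):
    # subtract the max for numerical stability; output sums to 1
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]
```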
Inference
- Weight guessing
- Vanishing gradient problem (Wiki)
- Double descent
- BP Back-propagation
- Pruning - reduces computational cost, improves generalization
- Optimal Brain Damage (1990) Yann Le Cun, John S. Denker, Sara A. Solla
- Learning both Weights and Connections for Efficient Neural Networks (2015) Song Han, Jeff Pool, John Tran, William J. Dally
- Pruning Convolutional Neural Networks for Resource Efficient Inference (2017) Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz
- Learning Sparse Neural Networks through L0 Regularization (2018) Christos Louizos, Max Welling, Diederik P. Kingma
- Pretraining
- Dropout
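The magnitude criterion used in the pruning papers above (e.g. Han et al., 2015) can be sketched as follows; this is a minimal illustration on a plain weight matrix, not a drop-in for any framework's pruning API:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with smallest |w|.

    `weights` is a list of rows (a dense weight matrix). Ties at the
    threshold may prune slightly more than the requested fraction.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    threshold = flat[k - 1] if k > 0 else float("-inf")
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]
```

In practice pruning is interleaved with retraining so the surviving weights can compensate for the removed ones.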
Compression
- Knowledge Distillation
- A large network (the teacher) transfers its knowledge to a smaller network (the student)
- Neural Network Pruning
- Removing unimportant weights
- Quantization
- Reducing the number of bits used to store the weights
- Software
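The soft-target part of knowledge distillation can be sketched as a cross-entropy between temperature-softened teacher and student distributions; this is a minimal illustration (function names are mine, and in practice this term is mixed with the ordinary hard-label loss):

```python
import math

def softmax_t(logits, T):
    # temperature-scaled softmax; higher T gives softer distributions
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's softened output distribution
    and the student's, at shared temperature T (soft-target term only)."""
    p = softmax_t(teacher_logits, T)
    q = softmax_t(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the student's softened distribution matches the teacher's, so the student learns the teacher's relative confidences across wrong classes, not just its top prediction.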
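Reducing the bits per weight can be illustrated with uniform affine quantization: floats in [w_min, w_max] are mapped to small integers, and only the integers plus a (scale, offset) pair are stored. A minimal sketch (function names are mine; real schemes add per-channel scales, clipping, and calibration):

```python
def quantize(weights, num_bits=8):
    """Map floats onto integers in [0, 2**num_bits - 1]."""
    w_min, w_max = min(weights), max(weights)
    levels = (1 << num_bits) - 1
    scale = (w_max - w_min) / levels or 1.0  # avoid 0 for constant inputs
    q = [round((w - w_min) / scale) for w in weights]
    return q, scale, w_min

def dequantize(q, scale, w_min):
    # reconstruction error is at most scale / 2 per weight
    return [qi * scale + w_min for qi in q]
```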