Neural networks, deep learning papers
Feedforward Neural Networks (FNN)
Convolutional Neural Networks (CNN)
  - One of the earliest precursors of convolutional nets - [Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position](https://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf) (1980) K. Fukushima
 
  - A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects (2020) Zewen Li, Wenjie Yang, Shouheng Peng, Fan Liu
 
  - Flexible, High Performance Convolutional Neural Networks for Image Classification (2011) Dan C. Ciresan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber
 
Recurrent Neural Networks (RNN)
Unsupervised
  - Competitive learning

  - Autoencoders
    
      - Modular learning in neural networks (1987) D.H. Ballard
 
      - Extracting and composing robust features with denoising autoencoders (2008) P. Vincent, H. Larochelle, Y. Bengio, P.A. Manzagol
 
      - From Deep Learning book - Autoencoders (ch. 14) (2016) Ian Goodfellow, Yoshua Bengio, Aaron Courville
 
      - An Introduction to Variational Autoencoders (2019) Diederik P. Kingma, Max Welling
 
      - Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (2011) S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio
 
      - Deep AutoRegressive Networks (2014) Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra

  - Denoising Autoencoders
 
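The denoising-autoencoder idea from the entries above fits in a few lines: corrupt the input, but train the network to reconstruct the clean original. A minimal NumPy sketch with a single linear layer and tied weights; the data sizes, noise level, and learning rate are illustrative assumptions, not taken from any of the papers.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))            # toy data: 256 samples, 8 features
W = rng.normal(scale=0.1, size=(8, 4))   # encoder weights; decoder is the tied W.T

def reconstruct(X, W):
    # Encode down to 4 dimensions, decode back with the tied transpose
    return (X @ W) @ W.T

loss_before = np.mean((reconstruct(X, W) - X) ** 2)

lr = 0.01
for _ in range(200):
    Xn = X + 0.1 * rng.normal(size=X.shape)  # corrupt the input...
    E = reconstruct(Xn, W) - X               # ...but reconstruct the clean X
    # Gradient of the squared error w.r.t. the tied weights
    # (the constant factor 2 is folded into the learning rate)
    W -= lr * (Xn.T @ (E @ W) + E.T @ (Xn @ W)) / len(X)

loss_after = np.mean((reconstruct(X, W) - X) ** 2)
```

With a linear decoder this recovers a PCA-like subspace; the papers above use nonlinear encoders and richer corruption processes.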
  - VAE Variational autoencoders

  - SOM Self-organizing maps
 
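A self-organizing map update can be sketched in one step: find the best matching unit (BMU) and pull it and its grid neighbours toward the input. The grid size, learning rate, and neighbourhood radius below are illustrative assumptions.

```python
import numpy as np

def som_step(grid, x, lr=0.5, radius=1.0):
    # Best matching unit (BMU): the map node whose weights are closest to x
    d = np.linalg.norm(grid - x, axis=2)
    bi, bj = np.unravel_index(np.argmin(d), d.shape)
    # Gaussian neighbourhood on the 2-d grid, centred on the BMU
    ii, jj = np.indices(d.shape)
    g = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * radius ** 2))
    # Pull every node toward x, weighted by its neighbourhood strength
    grid += lr * g[..., None] * (x - grid)
    return grid

rng = np.random.default_rng(0)
grid = rng.random((5, 5, 3))       # 5x5 map of 3-d weight vectors
x = np.array([0.9, 0.1, 0.1])
before = np.linalg.norm(grid - x, axis=2).min()
grid = som_step(grid, x)
after = np.linalg.norm(grid - x, axis=2).min()
```

Full training repeats this over many inputs while shrinking `lr` and `radius`.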
  - Cresceptron (Max-Pooling layers)

Generative Adversarial Networks (GAN)
Bayesian Neural Networks (BNN)
  - A Practical Bayesian Framework for Backpropagation Networks (1992) David J. C. MacKay
 
  - Bayesian Learning for Neural Networks (1995) R.M. Neal
 
  - Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks (1995) David J. C. MacKay
 
  - Practical Variational Inference for Neural Networks (2011) Alex Graves
 
  - Weight Uncertainty in Neural Networks (2015) Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra
 
  - Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2016) Y. Gal, Z. Ghahramani
 
  - Stochastic Gradient Descent as Approximate Bayesian Inference (2017) S. Mandt, M.D. Hoffman, D.M. Blei
 
  - Deep neural networks as Gaussian Processes (2018) Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein
 
  - Noisy Natural Gradient as Variational Inference (2018) Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse
 
  - Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam (2018) Mohammad Emtiyaz Khan, Didrik Nielsen, Voot Tangkaratt, Wu Lin, Yarin Gal, Akash Srivastava
 
  - Understanding Priors in Bayesian Neural Networks at the Unit Level (2019) Mariia Vladimirova, Jakob Verbeek, Pablo Mesejo, Julyan Arbel
 
  - Bayesian Deep Learning and a Probabilistic Perspective of Generalization (2020) Andrew Gordon Wilson, Pavel Izmailov
 
Weightless Neural Networks (WNN)
  - Based on Random Access Memory (RAM) nodes
 
  - Advances in Weightless Neural Systems (2014) F.M.G. França, M. De Gregorio, P.M.V. Lima, W.R. de Oliveira
 
  - WiSARD (Wilkie, Stonham and Aleksander's Recognition Device)
 
  - PLN Probabilistic Logic Nodes
 
  - GSN Goal Seeking Neurons
 
  - GRAM Generalizing RAM
 
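The RAM-node idea behind these models can be sketched with a WiSARD-style discriminator: each RAM node watches a fixed random tuple of input bits, memorises the addresses it sees during training, and the score of an input is how many RAM nodes recognise their address. The input length and tuple size below are illustrative assumptions.

```python
import numpy as np

class Discriminator:
    def __init__(self, n_bits, tuple_size, seed=0):
        rng = np.random.default_rng(seed)
        order = rng.permutation(n_bits)
        # Partition the shuffled bit positions into fixed tuples
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, n_bits, tuple_size)]
        self.rams = [set() for _ in self.tuples]  # RAM node = set of seen addresses

    def _addresses(self, bits):
        # The bits each RAM node sees form its memory address
        return [tuple(bits[t]) for t in self.tuples]

    def train(self, bits):
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def score(self, bits):
        # Number of RAM nodes that recognise their address
        return sum(addr in ram
                   for ram, addr in zip(self.rams, self._addresses(bits)))
```

A classifier uses one discriminator per class and predicts the class whose discriminator scores highest.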
Activation functions
  - Sigmoid
 
  - HardSigmoid
 
  - SiLU, dSiLU
 
  - Tanh, HardTanh
 
  - Softmax
 
  - Softplus
 
  - Softsign
 
  - ReLU Rectified Linear Unit

  - LReLU Leaky ReLU

  - PReLU Parametric ReLU

  - RReLU Randomized ReLU

  - SReLU

  - ELU

  - PELU

  - SELU
 
  - Maxout
 
  - Mish

  - Swish
 
  - ELiSH
 
  - HardELiSH
 
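Several of the activations listed above take only a line each. A NumPy sketch; the leaky-ReLU slope and ELU α are common default choices, an assumption rather than anything fixed by the list.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Rectified Linear Unit: identity for positive inputs, zero otherwise
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: small fixed slope alpha on the negative side
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Exponential Linear Unit: smooth saturation at -alpha for negative inputs
    # (np.minimum guards the unused branch of np.where against overflow)
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def silu(x):
    # SiLU (Swish with beta = 1): x * sigmoid(x)
    return x * sigmoid(x)

def mish(x):
    # Mish: x * tanh(softplus(x)); log1p(exp(x)) is a naive softplus
    return x * np.tanh(np.log1p(np.exp(x)))
```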
Inference
  - Weight guessing
 
  - Vanishing gradient problem (Wiki)
 
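The vanishing gradient problem can be shown in a few lines: the gradient through a chain of sigmoids is a product of local derivatives, each at most 0.25, so it shrinks exponentially with depth. The unit-weight chain below is an illustrative simplification.

```python
import numpy as np

def chain_gradient(depth, x=0.5):
    # Gradient through `depth` stacked sigmoids with unit weights:
    # a product of local derivatives, each at most 0.25
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    grad = 1.0
    for _ in range(depth):
        s = sigmoid(x)
        grad *= s * (1.0 - s)  # derivative of the sigmoid at this layer
        x = s                  # output feeds the next layer
    return grad
```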
  - Double descent

  - BP Back-propagation

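Back-propagation through a single dense layer with a sigmoid is just the chain rule, and can be verified with a finite-difference check. The shapes and target below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # input
W = rng.normal(size=(2, 3))       # weights of one dense layer
y = np.array([1.0, 0.0])          # target

def forward(W):
    h = 1.0 / (1.0 + np.exp(-(W @ x)))    # sigmoid activation
    return 0.5 * np.sum((h - y) ** 2), h  # squared-error loss

loss, h = forward(W)
# Backward pass: chain rule through loss -> sigmoid -> matrix product
delta = (h - y) * h * (1.0 - h)   # dLoss/d(pre-activation)
grad_W = np.outer(delta, x)       # dLoss/dW

# Finite-difference check of one gradient entry
eps = 1e-6
Wp = W.copy()
Wp[0, 0] += eps
numeric = (forward(Wp)[0] - loss) / eps
```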
  - Pruning - reduces computational cost and can improve generalization
    
      - Optimal Brain Damage (1990) Yann Le Cun, John S. Denker, Sara A. Solla
 
      - Learning both Weights and Connections for Efficient Neural Networks (2015) Song Han, Jeff Pool, John Tran, William J. Dally
 
      - Pruning Convolutional Neural Networks for Resource Efficient Inference (2017) Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz
 
      - Learning Sparse Neural Networks through L0 Regularization (2018) Christos Louizos, Max Welling, Diederik P. Kingma

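The simplest baseline behind the papers above is magnitude pruning: drop the weights with the smallest absolute value. A NumPy sketch; the tie-breaking rule (prune ties at the threshold) is an assumption.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out the `sparsity` fraction of weights with smallest magnitude;
    # ties at the threshold are pruned as well.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask
```

In practice the mask is kept and training continues on the surviving weights (prune-retrain cycles).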
  - Pretraining

  - Dropout

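Dropout in its common "inverted" form fits in a few lines: keep each unit with probability `keep` and scale the survivors by `1/keep`, so the expected activation is unchanged and inference needs no rescaling. The keep probability below is an illustrative choice.

```python
import numpy as np

def dropout(x, keep=0.8, rng=None, train=True):
    # Inverted dropout: identity at inference time
    if not train:
        return x
    if rng is None:
        rng = np.random.default_rng()
    mask = (rng.random(x.shape) < keep) / keep
    return x * mask
```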
Compression
  - Knowledge Distillation

      - A large "teacher" network transfers its knowledge to a smaller "student" network

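The usual distillation objective trains the student to match the teacher's softened class probabilities (softmax at temperature T > 1). A sketch of that loss term; the temperature and the T² scaling convention follow common practice and are assumptions here.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T gives a softer distribution
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between the teacher's and student's softened outputs,
    # scaled by T**2 to keep gradient magnitudes comparable across T
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -T ** 2 * np.sum(p * np.log(q + 1e-12))
```

In full training this term is combined with the ordinary cross-entropy on the true labels.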
  - Neural Network Pruning

      - Removing unimportant weights

  - Quantization

      - Reducing the number of bits used to store the weights

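A common starting point is symmetric per-tensor quantization: map weights to integers in [-127, 127] with a single scale, then de-quantize for use. The bit-width, rounding, and per-tensor (rather than per-channel) scale are illustrative choices.

```python
import numpy as np

def quantize(w, n_bits=8):
    # Symmetric uniform quantization with one per-tensor scale
    qmax = 2 ** (n_bits - 1) - 1                # 127 for 8 bits
    scale = max(np.abs(w).max() / qmax, 1e-12)  # guard against all-zero w
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; error is at most scale / 2 per entry
    return q.astype(np.float32) * scale
```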
  - Software