Feature selection (variable selection)
Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction (Wikipedia)
Why feature selection?
- Data exploration
- Curse of dimensionality
- Fewer features, faster models
- Better metrics
- Overview
- An Introduction to Variable and Feature Selection (2003) Isabelle Guyon, Andre Elisseeff
- A Survey on Feature Selection (2016) Jianyu Miao, Lingfeng Niu
- Feature Selection: A Data Perspective (2016) Jundong Li, Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P. Trevino, Jiliang Tang, Huan Liu
- Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review (2019) Benyamin Ghojogh, Maria N. Samad, Sayema Asif Mashhadi, Tania Kapoor, Wahab Ali, Fakhri Karray, Mark Crowley
- All-relevant vs minimal-optimal feature selection
Filter methods
Filter methods use model-free ranking to filter out less relevant features
- Missing Values Ratio
- Removing features with a ratio of missing values greater than some threshold
- Low Variance Filter (sklearn)
- Removing features with a variance lower than some threshold
- Correlation (Wiki)
- χ² Chi-squared statistic for categorical features (Wiki, sklearn)
- ANOVA F-value for quantitative features (Wiki, sklearn)
- Mutual information (Wiki)
- mRMR Minimum Redundancy Maximum Relevance (Link, Wiki)
- Relief (Wiki)
- Markov Blanket (Wiki)
- Fast Correlation-based Filter
- CBF Consistency-Based Filters
- Interact
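Several of the filters above are available in scikit-learn. A minimal sketch (the threshold and `k` values are illustrative choices, not recommendations):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import (
    VarianceThreshold, SelectKBest, chi2, mutual_info_classif,
)

X, y = load_iris(return_X_y=True)

# Low variance filter: drop features whose variance is below a threshold
X_vt = VarianceThreshold(threshold=0.2).fit_transform(X)

# Chi-squared ranking (requires non-negative feature values)
X_chi2 = SelectKBest(chi2, k=2).fit_transform(X, y)

# Mutual information ranking works for arbitrary numeric features
X_mi = SelectKBest(mutual_info_classif, k=2).fit_transform(X, y)

print(X.shape, X_vt.shape, X_chi2.shape, X_mi.shape)
```

Note that these filters score each feature independently of any downstream model, which is what makes them cheap but blind to feature interactions.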
Wrapper methods
Wrapper methods use a model and its performance to find the best feature subset
- SFS Sequential Feature Selection
- SFFS Sequential Floating Forward Selection
- Genetic algorithm (Wiki)
- PSO Particle Swarm Optimization (Wiki)
- Boruta All-relevant feature selection (CRAN, PyPI)
- MUVR (GitLab)
- Wrapper methods and overfitting
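Sequential Feature Selection is the simplest wrapper to try; scikit-learn ships an implementation. A sketch with an arbitrary estimator (the classifier and `n_features_to_select` are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Forward SFS: greedily add the feature that most improves CV accuracy
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3),
    n_features_to_select=2,
    direction="forward",
    cv=5,
)
sfs.fit(X, y)
print(sfs.get_support())  # boolean mask of the selected features
```

Because the model is retrained for every candidate subset at every step, wrappers are far more expensive than filters, and the cross-validation inside the loop is what guards (imperfectly) against the overfitting mentioned above.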
Embedded methods
- LASSO
- Elastic net
- Spike and Slab regression (Wiki)
- Decision Tree (Wiki)
- Random Forest (Wiki)
- Random Forests (2001) Leo Breiman
- Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics (2012) Anne-Laure Boulesteix, Silke Janitza, Jochen Kruppa, Inke R. König
- Variable selection using random forests (2010) Robin Genuer, Jean-Michel Poggi, Christine Tuleau-Malot
- Bias in random forest variable importance measures: Illustrations, sources and a solution (2007) Carolin Strobl, Anne-Laure Boulesteix, Achim Zeileis, Torsten Hothorn
- Conditional Variable Importance for Random Forests (2008) Carolin Strobl, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, Achim Zeileis
- Correlation and variable importance in random forests (2016) Baptiste Gregorutti, Bertrand Michel, Philippe Saint-Pierre
- Gradient Boosting (Wiki)
Unsupervised and semi-supervised feature selection
- FSSEM Feature Subset Selection using Expectation-Maximization
- Laplacian Score
- Principal Feature Analysis
- Spectral Feature Selection
- MCFS Multi-cluster Feature Selection
- Autoencoders (Wiki)
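Without labels, the methods above score features by how well they preserve the data's local geometry. The Laplacian Score, for instance, fits in a few lines of NumPy; `n_neighbors` and the heat-kernel bandwidth `t` below are illustrative choices:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import kneighbors_graph

def laplacian_score(X, n_neighbors=5, t=1.0):
    """Lower score = feature varies less across neighboring samples."""
    # Symmetrized kNN affinity matrix with a heat-kernel weighting
    A = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.exp(-(A ** 2) / t) * (A > 0)
    W = np.maximum(W, W.T)
    d = W.sum(axis=1)          # degree vector (diagonal of D)
    L = np.diag(d) - W         # graph Laplacian L = D - W
    scores = []
    for f in X.T:
        # Center the feature by its degree-weighted mean
        f_tilde = f - (f @ d) / d.sum()
        scores.append((f_tilde @ L @ f_tilde) / (f_tilde @ (d * f_tilde)))
    return np.array(scores)

X, _ = load_iris(return_X_y=True)
print(laplacian_score(X))
```

This is a sketch for intuition, not a reference implementation; dedicated packages (e.g. for MCFS or spectral feature selection) handle graph construction and normalization more carefully.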
Stable feature selection
Domain-specific
Packages
- R
- Package: fscaret (CRAN) Jakub Szlek
- Package: praznik (Code) Miron Kursa
- Package: FSinR (CRAN, Paper) Francisco Aragón-Royón, Alfonso Jiménez-Vílchez, Antonio Arauzo-Azofra, José Manuel Benítez
- Package: VSURF (CRAN, Paper)
- Package: spikeSlabGAM (Code, CRAN, Paper)
- Package: copent (CRAN, Code, Paper)
- Python
- Julia
- The main packages for ML in Julia are MLJ and Flux