Feature Selection by NCA for an SVM classifier

Hi
Apparently, 'fscnca' uses a model built on a nearest neighbour (NN) classifier, which means the feature weights are calculated from the performance of an NN classifier. My question is: will 'fscnca' be a suitable feature selection tool if one is using another type of classifier, such as an SVM? The MATLAB tutorials do not place such a restriction on using 'fscnca'.
Cheers, Roohollah

Accepted Answer

Carl on 24 Jul 2017
Edited: Carl on 24 Jul 2017
Hi Roohollah. Technically, there is no guarantee that the feature selection in fscnca will be applicable to an SVM. As you touched on, classifiers like SVMs and k-nearest neighbours are fundamentally different. In practice, however, the "importance" of features often generalizes, especially if your data can be classified well by both algorithms. Feature selection with fscnca will most likely be better than no feature selection at all. In fact, the documentation has an example that uses fitcsvm on features obtained from fscnca.
The Statistics and Machine Learning Toolbox also has a variety of functions for both feature extraction and dimensionality reduction.
I would encourage you to try out various approaches and see what works best for you and your specific data.
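As a rough illustration of that workflow (not the documentation example itself), the sketch below fits fscnca on synthetic data, keeps the features with the largest weights, and compares the cross-validated loss of an SVM trained on all features against one trained on the selected subset. The data, the 0.2 relative threshold, and the kernel settings are assumptions chosen only for illustration.

```matlab
% Sketch only: synthetic data in which features 3 and 7 alone determine the class.
rng(0);                                   % reproducible data and partitions
n = 300; p = 10;
X = randn(n, p);
Y = double(X(:,3) + X(:,7) > 0);          % numeric labels 0/1

% NCA feature weights with default regularization (tune Lambda on real data).
nca = fscnca(X, Y, 'Solver', 'lbfgs', 'Verbose', 0);
w = nca.FeatureWeights;

% Keep features above an arbitrary relative threshold (hypothetical cut-off).
selected = find(w > 0.2 * max(w));

% Compare a Gaussian-kernel SVM on all features vs. the selected subset,
% using the same 5-fold partition for both models.
cvp = cvpartition(Y, 'KFold', 5);
svmAll = fitcsvm(X, Y, 'KernelFunction', 'rbf', 'KernelScale', 'auto', ...
    'Standardize', true, 'CVPartition', cvp);
svmSel = fitcsvm(X(:, selected), Y, 'KernelFunction', 'rbf', ...
    'KernelScale', 'auto', 'Standardize', true, 'CVPartition', cvp);
lossAll = kfoldLoss(svmAll);
lossSel = kfoldLoss(svmSel);

fprintf('CV loss, all %d features: %.3f\n', p, lossAll);
fprintf('CV loss, %d fscnca-selected features: %.3f\n', numel(selected), lossSel);
```

If the selected subset gives a loss that is similar to or lower than the full feature set, that is some evidence the NCA weights transfer to the SVM for this data; it is not a general guarantee.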
  2 Comments
Roohollah Milimonfared on 25 Jul 2017
Edited: Roohollah Milimonfared on 25 Jul 2017
Hi Carl, thanks for your time and clarification. Apparently, feature selection algorithms can be divided into supervised and unsupervised ones (similar to pattern recognition algorithms). For instance, PCA and MDS rank features by how much of the data variance they explain, so they may not be suitable for a problem with pre-labelled observations. NCA (and perhaps the Fisher index), on the other hand, selects features according to their power to discriminate between pre-labelled observations. I would be thankful to have your thoughts on that. Cheers, Roohollah
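A small sketch of that distinction, using made-up data: feature 1 has a large variance but no relation to the labels, while feature 2 has a small variance and separates the classes. pca never sees the labels, so its first component follows the variance; fscnca is given the labels and would be expected to favour the discriminative feature. The data layout and scales here are assumptions for illustration only.

```matlab
% Sketch only: variance-driven (unsupervised) vs. label-driven (supervised) ranking.
rng(0);
n = 200;
y = [zeros(n/2, 1); ones(n/2, 1)];        % two pre-labelled classes

X = [10 * randn(n, 1), ...                % feature 1: high variance, pure noise
     0.5 * randn(n, 1) + (2*y - 1), ...   % feature 2: low variance, class-dependent
     randn(n, 1)];                        % feature 3: unit-variance noise

% Unsupervised: PCA uses X only; find the feature that dominates component 1.
coeff = pca(X);
[~, pcaTop] = max(abs(coeff(:, 1)));

% Supervised: NCA uses X and y; find the feature with the largest weight.
nca = fscnca(X, y, 'Standardize', true, 'Verbose', 0);
[~, ncaTop] = max(nca.FeatureWeights);

fprintf('Feature dominating the first principal component: %d\n', pcaTop);
fprintf('Feature with the largest NCA weight: %d\n', ncaTop);
```

Note that pca is applied to the unscaled data here on purpose; standardizing the columns first would remove the variance difference that drives the PCA result.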
Carl on 25 Jul 2017
When choosing a feature selection algorithm, supervised vs unsupervised is only one of the things to consider. See the following documentation on the NCA algorithm:
https://www.mathworks.com/help/stats/neighborhood-component-analysis.html
As you mentioned, it is supervised. However, I would say that the optimized weights are probably more suitable for something like KNN than for an SVM, even though both are supervised algorithms.
Feature selection of course is also highly dependent on your data, so it may not be the best idea to speculate/generalize on this. I think the best course of action would be to either try various approaches, or see whether each approach is prioritizing features appropriately based on your actual data.
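One way to try various approaches is to rank the features with two different methods and score each ranking with the classifier you actually plan to use. The sketch below compares fscnca against relieff on synthetic data and evaluates the top-k features from each with a cross-validated SVM. The data, the choice of relieff as the second method, k = 2, and the 10 nearest neighbours passed to relieff are all assumptions for illustration.

```matlab
% Sketch only: compare two feature rankings by the CV loss of the target SVM.
rng(0);
n = 300; p = 10;
X = randn(n, p);
Y = categorical(X(:,3) + X(:,7) > 0);     % categorical labels, so relieff runs in classification mode

% Ranking 1: NCA feature weights, largest first.
nca = fscnca(X, Y, 'Verbose', 0);
[~, ncaRank] = sort(nca.FeatureWeights, 'descend');

% Ranking 2: ReliefF with 10 nearest neighbours (first output is the ranking).
reliefRank = relieff(X, Y, 10);

% Score the top-k features of each ranking with the same SVM and partition.
k = 2;
cvp = cvpartition(Y, 'KFold', 5);
evalSubset = @(idx) kfoldLoss(fitcsvm(X(:, idx), Y, ...
    'KernelFunction', 'rbf', 'KernelScale', 'auto', ...
    'Standardize', true, 'CVPartition', cvp));

lossNca    = evalSubset(ncaRank(1:k));
lossRelief = evalSubset(reliefRank(1:k));

fprintf('Top-%d fscnca features:  %s, SVM CV loss %.3f\n', k, mat2str(ncaRank(1:k)'), lossNca);
fprintf('Top-%d relieff features: %s, SVM CV loss %.3f\n', k, mat2str(reliefRank(1:k)), lossRelief);
```

On real data it is worth repeating this over several values of k and several partitions before concluding that one ranking suits the SVM better than the other.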
