/Data-Science-Application-Feature-Selection

Feature selection in Data Science Application

Feature Selection for Data-Error-Robustness

A data scientist is driven by business needs. For instance, an ML application may require a certain classification accuracy, low inference time, fairness, and robustness against data errors. Feature selection is an important part of the ML pipeline: by selecting the right features, the data scientist can help ensure that all of these application constraints are met. For instance, if she knows up front that the data might suffer from quality issues, she can remove the features that would impair model quality the most when those issues occur.

To apply feature selection that yields a model with high data-error robustness, we:

  • implement various metrics to measure data-error robustness.
  • implement a well-known feature selection strategy to select features that are data-error robust.
  • implement our own feature selection strategy to improve on the current state of the art in feature selection for data-error robustness.
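
As a rough illustration of the first two points, here is a minimal sketch in plain Python. It is not the code from this repository's notebooks. It assumes a synthetic dataset, a nearest-centroid classifier, a robustness metric defined as the worst accuracy when one selected feature is partly replaced by noise, and greedy forward selection as the well-known strategy. All function names (`make_data`, `robust_accuracy`, `forward_select`, and so on) and parameters (`rate`, `scale`) are hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic binary task (assumption): feature 0 is strong, 1 is weak, 2 is noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([4.0 * label + rng.gauss(0, 1),
                  1.0 * label + rng.gauss(0, 1),
                  rng.gauss(0, 1)])
        y.append(label)
    return X, y

def fit(X, y, feats):
    """Nearest-centroid model: one mean vector per class over the chosen features."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return model

def predict(model, feats, x):
    # Pick the class whose centroid is closest in squared Euclidean distance.
    return min(model, key=lambda c: sum((x[f] - m) ** 2
                                        for f, m in zip(feats, model[c])))

def accuracy(model, feats, X, y):
    return sum(predict(model, feats, x) == t for x, t in zip(X, y)) / len(y)

def corrupt(X, feature, rate, rng, scale=10.0):
    """Simulate a data error: overwrite one feature with noise in a fraction of rows."""
    out = [list(x) for x in X]
    for row in out:
        if rng.random() < rate:
            row[feature] = rng.gauss(0, scale)
    return out

def robust_accuracy(model, feats, X, y, rate=0.3, seed=1):
    """Hypothetical metric: worst accuracy when any single selected feature is corrupted."""
    return min(accuracy(model, feats, corrupt(X, f, rate, random.Random(seed)), y)
               for f in feats)

def forward_select(X_tr, y_tr, X_val, y_val, n_features):
    """Greedy forward selection that keeps adding features while robustness improves."""
    selected, best = [], -1.0
    remaining = list(range(n_features))
    while remaining:
        scored = []
        for f in remaining:
            feats = selected + [f]
            model = fit(X_tr, y_tr, feats)
            scored.append((robust_accuracy(model, feats, X_val, y_val), f))
        score, f = max(scored)
        if score <= best:  # no candidate improves robustness, so stop
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_val, y_val = make_data(400, rng)

selected, score = forward_select(X_train, y_train, X_val, y_val, n_features=3)
model0 = fit(X_train, y_train, [0])
clean0 = accuracy(model0, [0], X_val, y_val)
robust0 = robust_accuracy(model0, [0], X_val, y_val)
print("selected features:", selected, "robust accuracy:", round(score, 3))
print("feature 0 alone: clean", round(clean0, 3), "robust", round(robust0, 3))
```

Under a worst-case metric like this one, every added feature is another place where an error can occur, so the greedy search tends to stop at a small feature set. A metric that averages over corruptions, or that adds a minimum clean-accuracy constraint, would trade off differently.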
