Dirty-dataImpacts

Codes & Datasets

Codes: Contains 6 classification algorithms, 6 clustering algorithms, and 5 regression algorithms. All code is written in C++. P.S. LogisticRegression is used for both classification and regression.

Datasets: Contains 5 original classification datasets, 5 original clustering datasets, and 5 original regression datasets.

Dirty data are injected into the original datasets in three forms: missing data, inconsistent data, and conflicting data. The missing rate, the inconsistency rate, and the conflict rate each vary from 10% to 50%.

If you have any questions, please email zhixin.qi@foxmail.com. Enjoy!
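
To make the notion of a "missing rate" concrete, here is a minimal sketch of how missing values could be injected into a table at a given rate. It is not taken from this repository: the function name `injectMissing`, the `"?"` marker, and the choice of uniformly random cells are all assumptions made for illustration, and the datasets here may have been produced differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of this repository): replaces a fraction
// `rate` of the cells in `table`, chosen uniformly at random, with `marker`.
// Returns the number of cells that were replaced.
std::size_t injectMissing(std::vector<std::vector<std::string>>& table,
                          double rate,
                          unsigned seed,
                          const std::string& marker = "?") {
    // Keep the rate inside [0, 1] so we never select more cells than exist.
    rate = std::max(0.0, std::min(1.0, rate));

    // Collect the (row, column) position of every cell.
    std::vector<std::pair<std::size_t, std::size_t>> cells;
    for (std::size_t r = 0; r < table.size(); ++r)
        for (std::size_t c = 0; c < table[r].size(); ++c)
            cells.emplace_back(r, c);

    // Shuffle the positions, then overwrite the first rate * N of them.
    std::mt19937 gen(seed);
    std::shuffle(cells.begin(), cells.end(), gen);
    const std::size_t target =
        static_cast<std::size_t>(rate * static_cast<double>(cells.size()) + 0.5);
    for (std::size_t i = 0; i < target; ++i)
        table[cells[i].first][cells[i].second] = marker;
    return target;
}
```

Under these assumptions, calling `injectMissing(table, 0.1, seed)` through `injectMissing(table, 0.5, seed)` on fresh copies of a dataset would cover the 10% to 50% range described above. Fixing the seed keeps each dirty copy reproducible.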