Primary Language: Jupyter Notebook

DataAnalysis

This project provides interfaces for common data analysis tasks. Most of the models use cross-validation and grid search to find the optimal parameters, so you only need to adjust a small amount of code to fit your own model. The project can also serve as practice material for anyone learning machine learning.
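As an illustration of the cross-validation plus grid-search pattern described above, here is a minimal sketch using scikit-learn's `GridSearchCV`. The dataset (Iris), the logistic regression model, and the parameter grid are assumptions chosen for the example, not the exact setup used in the notebooks.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data only; replace with your own feature matrix and labels.
X, y = load_iris(return_X_y=True)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical grid: candidate values for the regularization strength C.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validation over every candidate in the grid.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

To adapt the sketch to another model, swap the estimator in the pipeline and change the keys in `param_grid` to match that estimator's parameters.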

Model

Classification

Logistic Regression | Random Forest | SVM | Distribution | KNN | XGBoost | classificationMetrics |

Regression

Linear Regression |

StatisticModel (from 'statsmodels')

Simple Exponential Smoothing | Holt's Linear Exponential Smoothing | Holt-Winters Smoothing | ARIMA | SARIMAX |

text2vec

word2vec | word2tfIDF | doc2vec |

abNoramalDetect

Apriori

Cluster

K-means | DBSCAN |

dimensionReduction

PCA | t-SNE |

Evaluation

drawLine | learningLine | SegWordEvaluation |

datePretreatment

dataEncoder(simple)

Binning | Label Encoder | One Hot | WoE(include IV filter) |
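For readers unfamiliar with WoE (weight of evidence) and IV (information value), the following is a minimal sketch of how both can be computed for one binned feature. The function name `woe_iv`, the sign convention (non-event share over event share), and the smoothing constant `eps` are assumptions for illustration; the notebook in this project may use a different convention.

```python
import math
from collections import Counter


def woe_iv(bins, labels, eps=0.5):
    """Return ({bin: WoE}, IV) for one feature.

    bins   : bin or category assigned to each sample
    labels : 1 for an event, 0 for a non-event
    eps    : additive smoothing so empty cells do not produce log(0)
    """
    events = Counter(b for b, y in zip(bins, labels) if y == 1)
    non_events = Counter(b for b, y in zip(bins, labels) if y == 0)
    unique_bins = sorted(set(bins))
    n_bins = len(unique_bins)
    total_events = sum(events.values()) + eps * n_bins
    total_non_events = sum(non_events.values()) + eps * n_bins

    woe, iv = {}, 0.0
    for b in unique_bins:
        dist_event = (events[b] + eps) / total_events
        dist_non_event = (non_events[b] + eps) / total_non_events
        # Assumed convention: WoE = ln(share of non-events / share of events).
        woe[b] = math.log(dist_non_event / dist_event)
        # Each term is non-negative, so IV accumulates separation across bins.
        iv += (dist_non_event - dist_event) * woe[b]
    return woe, iv


# Toy data: bin "a" is mostly non-events, bin "b" is mostly events.
example_bins = ["a"] * 4 + ["b"] * 4
example_labels = [0, 0, 0, 1, 0, 1, 1, 1]
example_woe, example_iv = woe_iv(example_bins, example_labels)
print(example_woe, example_iv)
```

An IV filter then keeps only the features whose IV exceeds a chosen threshold. A cutoff around 0.02 is a common rule of thumb, but the right value depends on the data.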

discretization

outlierDection

sampling

SimpleSampling | SystematicSampling | StratifiedSampling | ClusterSampling |

SmoothOnehot

Imbalanced Dataset Handling

SMOTE

Standardization

ZScore | MaxMin | MaxAbsScaler | RobustScaler |
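The first three scalers listed above reduce to short formulas. The sketch below writes them out in plain Python for a single numeric column; the function names are hypothetical, and the code assumes the column is not constant (otherwise a divisor is zero). RobustScaler, which centers on the median and scales by the interquartile range, is left out for brevity.

```python
from statistics import mean, pstdev


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_abs(values):
    """Divide by the largest absolute value; signs are preserved."""
    scale = max(abs(v) for v in values)
    return [v / scale for v in values]


column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scaled = z_score(column)
mm_scaled = min_max(column)
ma_scaled = max_abs(column)
print(z_scaled)
print(mm_scaled)
print(ma_scaled)
```

In practice, fit the scaling statistics on the training split only and reuse them on the test split, so that no information leaks from the test data.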

Exploratory Data Analysis

Figure Exploratory

Value Exploratory

dataSimpleReading | dataStringCount | label_samples_summary | relatedAnalysisReading |

dataWordCloud