-
Dealing with raw data- Shows how to clean and prepare raw data with pandas, using functions such as dropna, replace, read_html, astype, get_dummies, to_numeric, fillna, reset_index, .loc, describe, and info.
-
Basic and higher dimensionality visualization- Basic but powerful plots such as histograms, scatter plots, 3-D plots, parallel_coordinates, and andrews_curves are shown here.
-
Unsupervised learning- Uses PCA (principal component analysis) and Isomap to find patterns in the provided dataset.
-
Supervised learning and clustering- Clustering, another type of unsupervised learning, is shown here. K-nearest neighbours and regression, both supervised learning algorithms, are implemented here.
-
More classifiers- SVM (support vector machine), decision tree, and random forest are implemented here as supervised learning algorithms.
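
The steps listed above can be sketched end to end on a small example. This is only a minimal illustration, not code taken from the notebooks: the toy DataFrame, its column names, and the `clean` helper are all hypothetical, and it assumes pandas and scikit-learn are installed. It cleans the data with a few of the pandas functions named in the first item, projects it with PCA, and fits a k-nearest-neighbours classifier.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data (not from the repository): one bad string value
# in "height" and one missing value in "weight".
raw = pd.DataFrame({
    "height": ["1.70", "1.82", "?", "1.65", "1.90", "1.75", "1.60", "1.88"],
    "weight": [65.0, 80.0, 72.0, np.nan, 95.0, 70.0, 55.0, 90.0],
    "city": ["A", "B", "A", "B", "B", "A", "A", "B"],
    "label": [0, 1, 0, 0, 1, 0, 0, 1],
})


def clean(df):
    """Hypothetical helper: coerce, impute, drop, and one-hot encode."""
    out = df.copy()
    # Unparseable strings such as "?" become NaN.
    out["height"] = pd.to_numeric(out["height"], errors="coerce")
    # Fill missing weights with the column mean.
    out["weight"] = out["weight"].fillna(out["weight"].mean())
    # Drop rows that still contain NaN, then renumber the index.
    out = out.dropna().reset_index(drop=True)
    # Turn the categorical column into indicator columns.
    return pd.get_dummies(out, columns=["city"])


cleaned = clean(raw)

# Separate features from the target and standardise before PCA.
features = cleaned.drop(columns=["label"]).astype(float)
labels = cleaned["label"]
scaled = StandardScaler().fit_transform(features)

# Project the features onto the first two principal components.
reduced = PCA(n_components=2).fit_transform(scaled)

# Hold out part of the data and fit a k-nearest-neighbours classifier.
X_train, X_test, y_train, y_test = train_test_split(
    reduced, labels, test_size=0.3, random_state=0
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)

print(cleaned.head())
print("test accuracy:", accuracy)
```

With so few rows the accuracy figure is not meaningful; the point is only the order of the steps. Swapping `KNeighborsClassifier` for `sklearn.svm.SVC` or `sklearn.ensemble.RandomForestClassifier` would follow the same fit and score pattern.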
akarshsomani/Working-with-datasets-edx
Basic machine-learning concepts with data set cleaning and processing.
Jupyter Notebook