
Basic machine-learning concepts with data set cleaning and processing.

Primary language: Jupyter Notebook

Working-with-datasets-edx

  1. Dealing with raw data: shows how to clean and reshape raw data with pandas, using functions and accessors such as dropna, replace, read_html, astype, get_dummies, to_numeric, fillna, reset_index, .loc, describe, and info.

  2. Basic and higher-dimensionality visualization: basic but powerful plots such as histograms, scatter plots, and 3-D plots, plus parallel_coordinates and andrews_curves for higher-dimensional data.

  3. Unsupervised learning: uses PCA (principal component analysis) and Isomap to find patterns in the provided dataset.

  4. Supervised learning and clustering: clustering, another unsupervised technique, is shown here alongside two supervised learning algorithms, k-nearest neighbours and regression.

  5. More classifiers: SVM (support vector machine), decision tree, and random forest are implemented here as further supervised learning algorithms.
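
As a rough illustration of the cleaning step in item 1, the sketch below chains several of the listed pandas functions. The `raw` table, its column names, and its values are made up for this example and do not come from the notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical raw table: a placeholder string, missing values,
# and numbers stored as text.
raw = pd.DataFrame({
    "age": ["25", "31", "unknown", "47", "38"],
    "city": ["Austin", "Boston", "Austin", None, "Denver"],
    "income": [52000.0, np.nan, 61000.0, 58000.0, 45000.0],
})

df = raw.copy()

# to_numeric with errors="coerce" turns unparseable entries ("unknown") into NaN.
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# fillna: replace the missing income with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# dropna: discard rows that still lack an age or a city, then renumber the index.
df = df.dropna(subset=["age", "city"]).reset_index(drop=True)

# astype: ages are whole numbers once the NaN rows are gone.
df["age"] = df["age"].astype(int)

# get_dummies: one-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["city"])

print(encoded)
```

Coercing before dropping keeps the decision about bad rows in one place: every unusable value becomes NaN first, and a single dropna call then removes them.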
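
The dimensionality reduction described in item 3 might look like the following minimal sketch. It assumes scikit-learn and uses its bundled iris dataset as a stand-in, since the notebooks' own dataset is not named here. The `n_neighbors=10` setting is an arbitrary choice for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): 150 samples, 4 numeric features.
X, y = load_iris(return_X_y=True)

# Standardize so that no single feature dominates the variance.
X_scaled = StandardScaler().fit_transform(X)

# PCA: linear projection onto the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Isomap: nonlinear embedding that preserves distances along a
# nearest-neighbour graph.
iso = Isomap(n_neighbors=10, n_components=2)
X_iso = iso.fit_transform(X_scaled)

print(X_pca.shape, X_iso.shape)
print("Variance explained by 2 components:", pca.explained_variance_ratio_.sum())
```

PCA reports how much variance its components retain, which gives a quick check on whether two dimensions are enough. Isomap has no direct equivalent, so its output is usually judged by plotting it.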
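
The supervised classifiers named in items 4 and 5 can be compared side by side, as in the sketch below. It again assumes scikit-learn and the iris dataset as a placeholder. The hyperparameters shown are illustrative defaults, not the settings used in the notebooks.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption), split into training and held-out test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Illustrative hyperparameters, chosen only for this example.
models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(kernel="rbf"),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training set and record its test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

All four estimators share the same fit and score interface, so swapping or adding a classifier only changes the `models` dictionary.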