/Python_Data_Science_edX_DAT210x

code for lab assignments and projects

Primary language: Jupyter Notebook. License: MIT

Python Data Science, offered by edX (DAT210x)

course link

my implementations of the lecture and lab assignment materials

covered topics:

  • dataframes from data files, websites, images
  • data cleanup, feature conversion, data slicing, data normalization
  • exploration: histograms, scatter plots, parallel coordinates plots, Andrews curves, correlation matrices
  • dimensionality reduction through PCA & Isomap
  • linear regression
  • clustering through KMeans
  • classification through K-Nearest Neighbors, SVC, Decision Tree, Random Forest
  • model evaluation (scores & reports)
  • cross-validation
  • parameter optimization: grid search, randomized search
  • pipeline of estimators
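
As a rough illustration of how several of the topics above fit together (normalization, PCA, K-Nearest Neighbors, a pipeline of estimators, grid search with cross-validation, and evaluation reports), here is a minimal sketch using scikit-learn. It is not taken from the notebooks in this repository: the built-in wine dataset, the step names (`scale`, `pca`, `knn`), and the parameter grid are all assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a dataset bundled with scikit-learn stands in for the course data.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pipeline of estimators: normalize, reduce dimensionality, then classify.
# The step names are illustrative, not names used in the course labs.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical search grid; keys follow the "<step>__<parameter>" convention.
param_grid = {
    "pca__n_components": [2, 5],
    "knn__n_neighbors": [3, 5, 7],
}

# Grid search with 5-fold cross-validation on the training split only.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# Model evaluation on the held-out test split: score and report.
test_accuracy = search.score(X_test, y_test)
report = classification_report(y_test, search.predict(X_test))

print(search.best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
print(report)
```

Putting the scaler and PCA inside the pipeline means they are refit on each cross-validation fold, so the held-out fold does not leak into the preprocessing. Swapping `GridSearchCV` for `RandomizedSearchCV` (with an `n_iter` budget) gives the randomized-search variant of the same idea.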

presented in Jupyter notebooks