
A high-level overview of Python for R users: data cleaning, preprocessing, modeling, and model evaluation.


python-for-R-users

Date: September 9, 2019
Author: Sylvia Tran

The work in this repository was designed for the LA R Users Group. It is intended as a high-level overview of Python for R users, covering data cleaning, preprocessing, and modeling.

Scope

  • The code provided uses Python 3.7.0
  • Environment setup is outside the scope of this repository
  • The work was done on macOS, so nuances specific to Windows are not addressed
  • The use of RandomForest is demonstrative; it is not intended to optimize hyperparameters or minimize loss
  • Forthcoming: an R <-> Python cheatsheet will be added to the ./slides-etc/ directory in the coming weeks

This Repository:

  • assets (pictures and .mov files for screen capture)
  • notebooks (Jupyter notebook that can be converted to a slide deck)
  • slides (holds slide deck as .html)
  • src (.py file as an example)

Ways to Learn Python:

A. Interactive Python (IPython) can be accessed from the Terminal in RStudio:

  1. change to the working directory of choice
  2. run $ ipython
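
Once the ipython prompt is up, a few lines of base Python are enough to see where habits from R need adjusting. The snippet below is a minimal sketch and is not taken from the repository's notebook; the variable names and values are made up for illustration.

```python
# Base-Python habits that differ from R (illustrative values only).
scores = [10, 20, 30, 40]      # a list, roughly R's c(10, 20, 30, 40)

first = scores[0]              # indexing starts at 0, not 1
first_two = scores[0:2]        # the slice end is exclusive
last = scores[-1]              # a negative index counts from the end;
                               # it does not drop elements as in R

# A list comprehension plays the role of sapply(scores, function(x) x * 2)
doubled = [x * 2 for x in scores]

# A dict is a rough counterpart of a named list
person = {"name": "Ada", "age": 36}

print(first, first_two, last, doubled, person["name"])
```

Typing these one at a time at the prompt, rather than pasting the whole block, is closer to how the R console is normally used.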

B. Jupyter ipynb (interactive Python notebook)

  1. after downloading the repo, make a copy of the .ipynb file in the /notebooks folder
  2. step through the code line by line, or experiment freely with the toy dataset

Content

  1. Importing Packages
  2. Loading Toy Datasets (sklearn) & using pandas
  3. Cursory Inspection (pandas & numpy)
  4. Light Cleaning (base python, pandas)
  5. Train-test-split (sklearn)
  6. Feature Scaling (sklearn)
  7. Model (sklearn)
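
The seven steps above can be sketched end to end in a few dozen lines. This is a hedged illustration, not the notebook's actual code: the choice of the iris dataset, the column-renaming step, the 25% test split, and the random seed are all assumptions made for this example. In line with the Scope section, the RandomForest uses near-default settings and is not tuned.

```python
# 1. Importing packages
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 2. Load a toy dataset and wrap it in a pandas DataFrame
#    (iris is an assumption; the notebook may use a different dataset)
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# 3. Cursory inspection (R analogues in comments)
print(df.shape)                                      # dim(df)
print(df.head())                                     # head(df)
print(df.describe())                                 # summary(df)
print(df.isna().sum())                               # colSums(is.na(df))
print(np.unique(df["target"], return_counts=True))   # table(df$target)

# 4. Light cleaning: strip units and spaces from column names
df.columns = [c.replace(" (cm)", "").replace(" ", "_") for c in df.columns]

# 5. Train-test split (stratified so class proportions are preserved)
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# 6. Feature scaling: fit on the training set only, then apply to both
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 7. Model: demonstrative RandomForest, no hyperparameter tuning
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
print(f"Test accuracy: {accuracy:.3f}")
```

Tree-based models such as RandomForest do not strictly need scaled features; the scaling step is kept here only because it appears in the content list. Fitting the scaler on the training data alone avoids leaking information from the test set.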