
Python notebooks and slides for CE9010: Introduction to Data Science, Semester 2 2017/18

Primary language: Jupyter Notebook. License: MIT.

CE9010: Introduction to Data Science
Semester 2 2017/18
Xavier Bresson



Slides of the course




Python notebooks of the course




Student Projects


  • [Notebook, Slides] How Close Are NBS Students To Their Classmates? Cheng Jin Yee (Jinny), Jeremy Jerome Chia
  • [Notebook, Slides] A Look into rental fee on PropertyGuru, Chong Ke Xin
  • [Notebook, Slides] Gender Prediction Based on Profile Photo, Chen Zitong, Jin Ye, Xiao Fengtong
  • [Notebook, Slides] Predicting Success in the NBA, Cai Xin Qing, Yeo Ngee Chong
  • [Notebook, Slides] Predicting HDB Resale Prices in Singapore, Thomas ten Hacken, Maxime Kayser, Mei-Jun Yeh


Running Python notebooks without local Python installation


    Run the notebooks from the cloud using Binder: Simply click here.



Local Python installation


Follow these instructions to install Miniconda and create a Python environment for the course:

  1. Download the Python 3.6 installer for Windows, macOS, or Linux from https://conda.io/miniconda.html and install it with the default settings. Note for Windows: if you don't know whether your operating system is 32-bit or 64-bit, open Settings > System > About and check the "System type" entry.

    • Windows: Double-click on the Miniconda3-latest-Windows-x86_64.exe file.
    • macOS: Run bash Miniconda3-latest-MacOSX-x86_64.sh in your terminal.
    • Linux: Run bash Miniconda3-latest-Linux-x86_64.sh in your terminal.
  2. Windows: Open the Anaconda Prompt terminal from the Start menu. macOS, Linux: Open a terminal.

  3. Install git: conda install git.

  4. Download the GitHub repository of the course: git clone https://github.com/xbresson/CE9010_2018.

  5. Go to folder CE9010_2018 with cd CE9010_2018, and create a Python virtual environment with the packages required for the course: conda env create -f environment.yml. Note that the environment installation may take some time.

    Notes:
    The installed conda packages can be listed with conda list.
    Some useful shell commands are pwd, cd, ls -al, and rm -r -f folder/.
    Add a Python library to the course environment: conda install -n CE9010_2018 numpy (for example).
    Read Conda command lines for packages and environments
    Read managing Conda environments
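
As a rough sketch, steps 3 to 5 can be collected into one shell function for macOS, Linux, or a Unix-style shell on Windows. It assumes Miniconda is already installed and conda is on your PATH. The function name setup_ce9010 is made up for this example, and the -y flag (which skips conda's confirmation prompt) is an addition that the steps above do not use. Nothing is downloaded until you call the function.

```shell
# Sketch only: assumes conda is installed and on your PATH.
# setup_ce9010 is a hypothetical name chosen for this example.
setup_ce9010() {
    # Step 3: install git (-y answers "yes" to the confirmation prompt).
    conda install -y git || return 1

    # Step 4: download the course repository into the current folder.
    git clone https://github.com/xbresson/CE9010_2018 || return 1

    # Step 5: enter the folder and create the course environment.
    cd CE9010_2018 || return 1
    conda env create -f environment.yml || return 1

    # List the packages installed in the new environment.
    conda list -n CE9010_2018
}
```

Run setup_ce9010 from the folder where you want the course files to live. The environment creation may take some time.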



Running local Python notebooks


First time:

  1. Windows: Open the Anaconda Prompt terminal from the Start menu. macOS, Linux: Open a terminal.

  2. Activate the environment. Windows: activate CE9010_2018. macOS, Linux: source activate CE9010_2018. Recent Conda versions use conda activate CE9010_2018 on all platforms instead.

  3. Start Jupyter with jupyter notebook. The command opens a new tab in your web browser.

  4. Go to the folder tutorials and duplicate the original notebook tutorial01.ipynb under a new name, my_tutorial01.ipynb (for example), to avoid future conflicts (see understanding git conflicts).

  5. Open, edit and run the notebook my_tutorial01.ipynb from your browser.

  6. When you have finished the tutorial, return to the terminal prompt by shutting down the Jupyter server with Control-C.

  7. Save your notebook with git: git add . followed by git commit -m tutorial01.

    Notes:
    If you cloned the repository from your home folder, CE9010_2018 is located at C:\Users\user_name\CE9010_2018 on Windows, /Users/user_name/CE9010_2018 on macOS, and /home/user_name/CE9010_2018 on Linux.
    Check the status of your git folder: git status
    List of git commands
    Windows systems may print the warning "LF will be replaced by CRLF" when you commit. You can silence it with git config core.autocrlf false.
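
If you are new to git, it may help to rehearse the duplicate-and-commit steps in a throwaway folder first. The sketch below assumes a Unix-style shell with git installed. The scratch folder, the placeholder notebook content, and the Student identity are all made up for this example, and the real course folder is not touched.

```shell
# Practice run in a temporary folder (hypothetical setup, not the real repository).
scratch=$(mktemp -d)
cd "$scratch"
git init -q
mkdir tutorials
echo '{}' > tutorials/tutorial01.ipynb   # placeholder standing in for the real notebook

# Duplicate the original notebook and work only on the copy.
cp tutorials/tutorial01.ipynb tutorials/my_tutorial01.ipynb

# Stage and commit. The -c options set a made-up identity for this one command only.
git add .
git -c user.name="Student" -c user.email="student@example.com" commit -q -m tutorial01

git status --short   # prints nothing when everything is committed
git log --oneline    # shows one commit with the message tutorial01
```

In the real repository the same two commands apply (git add . and git commit -m tutorial01), run from inside the CE9010_2018 folder.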



The next times:

  1. Windows: Open the Anaconda Prompt terminal from the Start menu. macOS, Linux: Open a terminal.
  2. Activate the environment. Windows: activate CE9010_2018. macOS, Linux: source activate CE9010_2018.
  3. Download the new Python notebooks: go to the folder CE9010_2018 with cd CE9010_2018 and run git pull.
  4. Start Jupyter with jupyter notebook. The command opens a new tab in your web browser.
  5. Go to the folder tutorials and duplicate the original notebook tutorial02.ipynb under a new name, my_tutorial02.ipynb (for example), to avoid future conflicts (see understanding git conflicts).
  6. Open, edit and run the notebook my_tutorial02.ipynb from your browser.
  7. When you have finished the tutorial, return to the terminal prompt by shutting down the Jupyter server with Control-C.
  8. Save your notebook with git: git add . followed by git commit -m tutorial02.
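
The reason for working on my_ copies can be seen in a small local simulation. In this sketch a temporary folder named upstream plays the role of the course repository and a folder named student plays the role of your clone. Both names, the placeholder notebooks, and the Demo identity are assumptions made for the example. The --no-rebase and --no-edit options are also additions: they ask git pull for a plain merge without opening an editor, since some newer Git versions want an explicit strategy when local and remote histories have diverged.

```shell
# Local simulation only: nothing here contacts GitHub.
work=$(mktemp -d)

# Made-up identity for the commits below (applies to this shell session only).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" stands in for the course repository.
git init -q "$work/upstream"
mkdir "$work/upstream/tutorials"
echo '{}' > "$work/upstream/tutorials/tutorial01.ipynb"
git -C "$work/upstream" add .
git -C "$work/upstream" commit -q -m "add tutorial01"

# The student clones it and commits a personal copy of the first notebook.
git clone -q "$work/upstream" "$work/student"
cp "$work/student/tutorials/tutorial01.ipynb" "$work/student/tutorials/my_tutorial01.ipynb"
git -C "$work/student" add .
git -C "$work/student" commit -q -m tutorial01

# Later, a new notebook is published upstream.
echo '{}' > "$work/upstream/tutorials/tutorial02.ipynb"
git -C "$work/upstream" add .
git -C "$work/upstream" commit -q -m "add tutorial02"

# git pull brings in tutorial02 and merges it with the student's own commit.
git -C "$work/student" pull -q --no-rebase --no-edit

# The folder now holds the two originals and the personal copy.
ls "$work/student/tutorials"
```

Because the student only edited my_tutorial01.ipynb, which does not exist upstream, the merge has nothing to reconcile and completes without conflicts.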


Clean re-installation of the GitHub repository:

  For GitHub beginners who wish to restart from a clean copy of the course repository:

  1. Back up the current folder by renaming CE9010_2018 to CE9010_2018_backup (for example).
  2. Re-download the GitHub repository of the course: git clone https://github.com/xbresson/CE9010_2018.
  3. Copy your own notebooks from CE9010_2018_backup/tutorials to the new folder CE9010_2018/tutorials.
  4. Follow the instructions under "The next times".
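
The backup, re-download, and copy steps could be scripted roughly as follows for a Unix-style shell. This is a sketch under assumptions: the function name reinstall_ce9010 and its optional argument are made up for this example, and the copy step assumes your own notebooks follow the my_ naming suggested earlier. Run it from the folder that contains CE9010_2018.

```shell
# Sketch only. reinstall_ce9010 is a hypothetical helper, not part of the course.
reinstall_ce9010() {
    # Optional first argument: repository location (defaults to the course URL).
    repo_url=${1:-https://github.com/xbresson/CE9010_2018}

    # Refuse to continue if an earlier backup is already there.
    if [ -e CE9010_2018_backup ]; then
        echo "CE9010_2018_backup already exists; rename or remove it first." >&2
        return 1
    fi

    # Back up the current folder by renaming it.
    mv CE9010_2018 CE9010_2018_backup || return 1

    # Re-download a clean copy of the repository.
    git clone "$repo_url" CE9010_2018 || return 1

    # Copy your own notebooks back (assumes the my_ prefix).
    cp CE9010_2018_backup/tutorials/my_*.ipynb CE9010_2018/tutorials/
}
```

Calling reinstall_ce9010 with no argument uses the course URL. The backup folder is left in place so you can check that nothing was lost before deleting it.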