Cheatsheets

Collection of Bash commands and code snippets for Data Science and more.

This repository is not intended as a learning resource; it is just a collection of commands and snippets that can be used to refresh your memory.

Apache-Hadoop

Basic commands and templates for:

  • Hadoop FS
  • Hive
  • Pig
  • Spark
  • Sqoop

bash

Collection of sample Bash commands

  • Simple and general commands
  • Data manipulation commands

Python

  • Creating an isolated environment using a Makefile
    • Common Python requirements for data science, installed with pip
  • Using python libraries
    • Pandas
    • NumPy
    • scikit-learn

Python Jupyter Notebook

A collection of Jupyter notebooks

  • Pandas
  • Preconditions
  • Read-or-persist (template for reading from a remote source and persisting the data to a local folder)
  • Template (a template notebook to jump-start work with common requirements)
  • Uncertainties (use of the uncertainties library to account for errors in calculations)
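
The read-or-persist idea listed above can be sketched roughly as follows. This is only an illustration, not the code from the notebook: the function name `read_or_persist`, its parameters, and the injected `fetch` callable are assumptions made for this example, and the remote read is left abstract so that any download routine can be plugged in.

```python
from pathlib import Path
from typing import Callable


def read_or_persist(local_path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the locally persisted bytes if present; otherwise fetch and persist them.

    `fetch` is a hypothetical zero-argument callable standing in for the
    remote read (for example an HTTP download or a database export).
    """
    if local_path.exists():
        # A local copy already exists, so the remote source is not contacted.
        return local_path.read_bytes()

    data = fetch()
    # Create the target folder on first use, then persist the data locally.
    local_path.parent.mkdir(parents=True, exist_ok=True)
    local_path.write_bytes(data)
    return data
```

The first call for a given path invokes `fetch` and writes the result to disk; later calls read the local file instead. This sketch never refreshes the local copy, so deleting the file is the only way to force a new download.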

Server Ubuntu 16.04

  • Setup of environment
  • Install components
    • Docker Machine
    • Docker Swarm
    • Docker Compose