Readme

This repository contains trials and tribulations in data wrangling with Python, R, MySQL, MongoDB and PowerBI

Introduction

The purpose of this project is to provide a comprehensive and yet simple course in Machine Learning using Python, PowerBI and R. It is a Work in progress.

Navigation structure

The root folders are;

python-3,
powerbi,
r,
mysql
mongodb

These primary folders follow a similar structure, wherein, each is sub-divided into `3-sub-folders, namely;

experiments: holds general coding scripts;
helpful-functions: holds custom or user-defined functions that resulted as a requirement from experiments;
solutions: hold complete case studies based on experiments.

Required Software information

This project uses the following IDE's and programming languages:

Python

IDE is Spyder 4
- How to install Spyder: See here.
  - Open a command prompt window and browse to your local python installation directory. In my case its, c:\users\myusername\miniconda3 and then type pip3 install spyder
  - To launch spyder IDE, open a command prompt window, type the command, spyder3 and hit the enter key. Spyder IDE will launch
Python 3 distribution is Miniconda 3

IDE is RStudio version - 1.1.463
R version - 3.6.1

Python folder/file naming conventions

This repository follows the PEP 8 standard for Python file and folder naming conventions.

A Python module is simply a Python source file, which can expose classes, functions and global variables.
- Modules: should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Example: my_module.py
- Function: Function names should be lowercase, with words separated by underscores as necessary to improve readability. Example: my_function
  - Function arguments: Always use self for the first argument to instance methods.
  - Always use cls for the first argument to class methods.
  - If a function argument's name clashes with a reserved keyword, it is generally better to append a single trailing underscore rather than use an abbreviation or spelling corruption. Thus class_ is better than clss. (Perhaps better is to avoid such clashes by using a synonym.)
- Variable: use a lowercase single letter, word, or words. Separate words with underscores to improve readability. Example: my_variable
A Python package is simply a directory of Python module(s).
- Python packages should also have short, all-lowercase names, although the use of underscores is discouraged. Example: mypackage
- Constant - Use an uppercase single letter, word, or words. Separate words with underscores to improve readability. Example: MY_CONSTANT
- Class - start each word with a capital letter. Do not separate words with underscores. This style is called camel case. Example: MyClass
- Every script will begin with a prefix of aml_. Followed by a distinct meaningful name, that describe the task the script is meant to perform.

R folder/file naming conventions

This repository follows the Hadley Wickham R Style Guide

Folder name: A folder name should be meaningful and multiple words are separated by a hyphen. Example: data-extraction
File name: A File names should end in .r and be meaningful and multiple words are separated by hyphen -. Example: explore-diamonds.R
Variable name: A variable name should be lowercase. Use _ to separate words within a name. Generally, variable names should be nouns. Example: butter good_butter.
Function name: A function name should be lowercase. Use _ to separate words within a name. Generally, function names should be verbs. Example: calculate_salary().
Spacing syntax: Place spaces around all binary operators (=, +, -, <-, etc.). Do not place a space before a comma, but always place one after a comma. Example: average <- mean(feet / 12 + inches, na.rm = T)
Commenting guidelines
- Comment your code. Entire commented lines should begin with # and one space. Comments should explain the why, not the what.
- Use commented lines of - and = to break up your files into scannable chunks.

Contributing

Please see the contributing guide.

Author

Ashish Dutt

duttashi/wrangler