A collection of snippets and functions that I regularly use in my data science workflows. They are utility functions that speed up my work and focus on pandas, NumPy, and visualization libraries.

Primary language: Jupyter Notebook. License: MIT.

Snippets

Pandas

| Snippet | Code | Description |
| --- | --- | --- |
| Boosted value_counts | Code | Improves on `value_counts` by outputting absolute and normalized counts side by side, for faster analysis. It also sets the default value of `dropna` to `False`, so that any NaNs are easy to spot. |
| Pandarallel Configuration | Code | pandas can be slow on large datasets, mainly because it runs on a single core by default. Pandarallel is a straightforward way to parallelize pandas code. |
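
The boosted value_counts idea can be sketched roughly as follows. This is a minimal illustration, not necessarily the code linked above: the function name `boosted_value_counts` and the column names `count` and `share` are my own assumptions.

```python
import pandas as pd


def boosted_value_counts(series: pd.Series, dropna: bool = False) -> pd.DataFrame:
    """Return absolute and normalized counts side by side.

    Hypothetical helper: the name and the output column names are
    illustrative assumptions, not taken from the repository.
    """
    # dropna=False by default, so missing values get their own row.
    counts = series.value_counts(dropna=dropna)
    out = counts.to_frame(name="count")
    # Normalize against the total of the rows that were counted.
    out["share"] = out["count"].to_numpy() / out["count"].sum()
    return out


# Usage: one missing value, which shows up as its own row.
s = pd.Series(["a", "b", "a", None])
result = boosted_value_counts(s)
print(result)
```

Passing `dropna=True` restores the standard pandas behavior of ignoring missing values, in which case the shares are computed over the non-missing rows only.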

IPython + Miscellaneous

| Snippet | Code | Description |
| --- | --- | --- |
| Color print in Jupyter notebooks | Code | Coloring specific values in your output is an easy way to highlight important information. While there are packages like termcolor or colorama, I find that simply using ANSI color escape codes works best. |
| Progress bars in Jupyter notebooks | Code | tqdm is a package that lets you create progress bars. While it has a notebook-specific version via `tqdm_notebook`, I find that using `tqdm` directly works just as well, without the hassle of setting up ipywidgets and IProgress. |
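
As a rough sketch of the ANSI approach to colored output, the snippet below wraps text in standard ANSI escape sequences, which Jupyter renders in cell output. The helper name `colorize` and the `ANSI_CODES` mapping are hypothetical names chosen for illustration, not the code linked above.

```python
# Standard ANSI foreground color codes (a small subset).
ANSI_CODES = {"red": 31, "green": 32, "yellow": 33, "blue": 34}
RESET = "\033[0m"


def colorize(text: str, color: str) -> str:
    """Wrap text in an ANSI color sequence and reset afterwards.

    Hypothetical helper for illustration only.
    """
    return f"\033[{ANSI_CODES[color]}m{text}{RESET}"


# Usage: highlight a single value inside a longer message.
print("Validation status:", colorize("FAILED", "red"))
print("Rows loaded:", colorize("10000", "green"))
```

Resetting after each colored span keeps the color from leaking into whatever is printed next.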