A collection of Python functions for statistical summary, data encoding and error quantification

Primary language: Jupyter Notebook. License: MIT.

koleksyon

Introduction

Hi and welcome to the Koleksyon library! This library is built to help new data scientists get results faster when they work on problems in healthcare, specifically 'back office' problems such as population management, resource optimization, and forecasting. The functions in koleksyon grew out of questions that come up in most data science projects. The intent is NOT to reinvent the wheel. When libraries like Pandas have a natural way of doing something, people will use that! However, common tasks sometimes require several libraries, and figuring out everything that is needed takes valuable time.

Typically, more than 90% of a project is understanding what problem we need to solve and how we need to look at the data to solve it. Nine times out of ten, project proponents come to us asking for 'unsupervised learning' approaches because they don't know what to look for, but once they are given an unsupervised approach, they quickly realize that they want to apply supervised learning to a business workflow. The reason for this is simple: machine learning is an amazing tool for finding patterns in data and automating tedious tasks that are difficult to describe.

Tools such as pandas, scikit-learn, matplotlib, numpy, tensorflow and so on are great, but they are often too low-level to rapidly get to answers about predictability in a healthcare setting. This is what Koleksyon is for.