Introduction
This document surveys the functions that various programming languages provide for the everyday tasks of a data scientist.
Imagine you are an Excel user: which tools do you generally use to carry out an analysis on a table?
Assumptions
We assume that we are working on a table, like the one in an Excel file.
- Filter a column by a specific value
-- Find by text (equals and does not equal)
-- Max
-- Min
-- Is less than
-- Is greater than
-- Is between
-- Contains
-- Sort ascending and descending
- Convert the type of a column to another type, e.g. string to int
- Import a data file into a matrix or an array
- Export a matrix or an array to a data file (e.g. CSV)
- Return the unique values in a column
- Count columns
- Count rows
- Add a column
- Remove a column
- Remove a row
- Add a row
- Excel INDEX function
- Excel VLOOKUP-like function
- Return the rows that are empty in a specific column
- Identify any empty rows
- Group data (e.g. if a column has a category)
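
As a rough illustration of the operations listed above, here is a minimal sketch in Python that uses only the standard library. The sample data, the column names (name, category, price) and the helpers is_blank, excel_index and vlookup are all hypothetical and exist only for this example. A table is modelled as a list of dicts, one dict per row; a dedicated library such as pandas offers equivalents for most of these steps.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a spreadsheet exported as CSV.
CSV_TEXT = """name,category,price
apple,fruit,3
carrot,vegetable,2
banana,fruit,
,,
cherry,fruit,8
"""

def is_blank(value):
    """Treat None and the empty string as an empty cell."""
    return value is None or value == ""

# Import a data file: each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
columns = list(rows[0].keys())

# Count rows and columns.
n_rows, n_cols = len(rows), len(columns)

# Rows that are empty in one column, and rows that are empty in every column.
missing_price = [i for i, r in enumerate(rows) if is_blank(r["price"])]
empty_rows = [i for i, r in enumerate(rows) if all(is_blank(v) for v in r.values())]

# Remove rows: here, drop the fully empty ones.
rows = [r for i, r in enumerate(rows) if i not in empty_rows]

# Convert a column from string to int, keeping blank cells as None.
for r in rows:
    r["price"] = None if is_blank(r["price"]) else int(r["price"])

# Filter: equals, does not equal, contains, is between, min, max, sort.
fruit = [r for r in rows if r["category"] == "fruit"]
not_fruit = [r for r in rows if r["category"] != "fruit"]
contains_an = [r for r in rows if "an" in r["name"]]
priced = [r for r in rows if r["price"] is not None]
between = [r for r in priced if 2 <= r["price"] <= 5]
cheapest = min(priced, key=lambda r: r["price"])
dearest = max(priced, key=lambda r: r["price"])
by_price_desc = sorted(priced, key=lambda r: r["price"], reverse=True)

# Unique values in a column.
unique_categories = sorted({r["category"] for r in rows})

# Add a column, then remove it again.
for r in rows:
    r["initial"] = r["name"][0]
for r in rows:
    del r["initial"]

# Add a row.
rows.append({"name": "daikon", "category": "vegetable", "price": 4})

def excel_index(table, row_num, col_num):
    """Cell at a 1-based row and column position, like Excel's INDEX."""
    names = list(table[0].keys())
    return table[row_num - 1][names[col_num - 1]]

def vlookup(table, key_col, key, value_col, default=None):
    """Exact-match lookup: value_col of the first row whose key_col equals key."""
    for r in table:
        if r[key_col] == key:
            return r[value_col]
    return default

# Group by category and sum the prices in each group.
totals = defaultdict(int)
for r in priced + rows[-1:]:
    totals[r["category"]] += r["price"]

# Export the table back to CSV text (None is written as an empty cell).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)
exported = buffer.getvalue()
```

To work with a real file instead of the in-memory text, the io.StringIO objects can be swapped for open(path, newline="") handles. The lookup helper only covers exact matches; approximate matching as in Excel's VLOOKUP is left out of this sketch.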