Load CSV datasets and map / transform the data for use.
- Load CSV datasets
- Guess types and cast data fields to types
- Calculate stats: global, per-field and pairwise (field-by-field correlations etc.)
- Map datasets to other datasets using transform functions
It takes a specification object (JSON), loads a dataset, and optionally maps values to requested ranges.
The JSON specification objects can be saved in your application for use in presets.
Status: ALPHA
Currently all transformation functions are to be supplied when you
Load a dataset from disk, calculate statistics and apply transformations
Returns: Object
- Dataset
Param | Type | Description |
---|---|---|
functions | Object | Named function registry |
path | String | |
statsParams | Object | |
mapParams | Array.<Object> | |
Load and parse a dataset from path. Stats are not yet calculated so types are unknown and all fields are strings.
Returns: Promise.<Object>
- Promise for a dataset
Param | Type | Description |
---|---|---|
path | String | Absolute path to file |
Load and parse a dataset, calculate stats, and coerce field values to their guessed types.
Returns: Promise.<Object>
- Promise for a dataset
Param | Type | Description |
---|---|---|
path | String | Absolute path to file |
functions | Object | Named function registry |
statsParams | Object | The stats object from params |
Create a dataset object from an array of objects
Returns: Object
- dataset - {data, fields, path}
Param | Type | Description |
---|---|---|
data | Object | [{field: value, field2: value}, ...] |
fields | Array.<String> | Field names |
path | String | |
Calculate statistics (minval, maxval, avg etc.) for a dataset using a stats specification.
Returns: Object
- stats
Param | Type | Description |
---|---|---|
functions | Object | Named function registry |
statsParams | Object | The stats object from params |
dataset | Object | As returned by loadDataset or from a previous transformation. |
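To make the idea of per-field statistics concrete, here is a minimal sketch that computes minval, maxval and avg for one numeric field. The helper name `fieldStatsSketch` and the shape of its return value are assumptions for illustration only; the library's real stats object and the role of statsParams are not documented here.

```javascript
// Hypothetical helper, not part of this library's API.
// Computes minval, maxval and avg for one field across an array of row objects.
function fieldStatsSketch(rows, field) {
  // Freshly loaded values are strings, so convert them and drop non-numeric entries.
  const values = rows
    .map((row) => Number(row[field]))
    .filter((v) => !Number.isNaN(v));
  if (values.length === 0) {
    return { minval: null, maxval: null, avg: null };
  }
  const sum = values.reduce((acc, v) => acc + v, 0);
  return {
    minval: Math.min(...values),
    maxval: Math.max(...values),
    avg: sum / values.length,
  };
}

const statsRows = [{ sepalLength: '5' }, { sepalLength: '4' }, { sepalLength: '6' }];
const sepalStats = fieldStatsSketch(statsRows, 'sepalLength');
console.log(sepalStats); // { minval: 4, maxval: 6, avg: 5 }
```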
Calculate statistics and return a new dataset object with .stats set
Returns: Object
- dataset
Param | Type | Description |
---|---|---|
functions | Object | Named function registry |
statsParams | Object | |
dataset | Object | |
Having guessed types with calculateStats, cast all fields to the guessed types.
- This converts '1.1' to 1.1
- Enums of strings to their integer indices
- Date strings to Date objects
- String fields with high cardinality remain strings
Returns: Object
- Dataset object with values cast to guessed types
Param | Type | Description |
---|---|---|
dataset | Object | Dataset object |
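The casting rules listed above can be sketched as a single function. This is only an illustration: the type tags (`'number'`, `'enum'`, `'date'`, `'string'`), the helper name `castValueSketch`, and the enum value list are assumptions, not the library's actual representation of guessed types.

```javascript
// Hypothetical helper, not part of this library's API.
// Casts one string value according to an assumed type tag.
function castValueSketch(value, type, enumValues = []) {
  switch (type) {
    case 'number':
      return Number(value); // '1.1' becomes 1.1
    case 'enum':
      return enumValues.indexOf(value); // string becomes its integer index
    case 'date':
      return new Date(value); // date string becomes a Date object
    default:
      return value; // high-cardinality strings stay strings
  }
}

// Illustrative enum values, not taken from the library.
const speciesValues = ['setosa', 'versicolor', 'virginica'];

console.log(castValueSketch('1.1', 'number')); // 1.1
console.log(castValueSketch('versicolor', 'enum', speciesValues)); // 1
console.log(castValueSketch('2020-01-02', 'date') instanceof Date); // true
console.log(castValueSketch('free text', 'string')); // 'free text'
```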
mapDataset
Map input fields to output fields using mapping functions as specified in mapParams
{
input: 'inFieldName',
output: 'outFieldName',
fn: 'linear', // named function in functions registry
args: [0, 1] // parameters for linear mapping function
}
fn may be a String key to a function in the functions registry, or a function(stats, fieldName, [...args], value)
Param | Type | Description |
---|---|---|
functions | Object | Named function registry |
mapParams | Array.<Object> | |
dataset | Object | |
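As a rough illustration of how a mapParams entry drives a mapping, the sketch below defines a `linear` function with the documented (stats, fieldName, ...args, value) signature and applies it row by row. Several things here are assumptions: the stats layout (`stats.fields[fieldName].minval` / `.maxval`), the helper name `applyMappingSketch`, and the choice to emit only the output fields. The real mapDataset may differ on all of them.

```javascript
// Assumed stats layout: stats.fields[fieldName] = { minval, maxval }.
// Rescales value from [minval, maxval] to [outMin, outMax].
function linear(stats, fieldName, outMin, outMax, value) {
  const { minval, maxval } = stats.fields[fieldName];
  if (maxval === minval) return outMin; // avoid division by zero
  const t = (value - minval) / (maxval - minval);
  return outMin + t * (outMax - outMin);
}

// Hypothetical stand-in for mapDataset, for illustration only.
function applyMappingSketch(functions, mapParams, dataset) {
  const data = dataset.data.map((row) => {
    const out = {};
    for (const { input, output, fn, args = [] } of mapParams) {
      // fn is either a registry key or a function.
      const f = typeof fn === 'string' ? functions[fn] : fn;
      out[output] = f(dataset.stats, input, ...args, row[input]);
    }
    return out;
  });
  return { ...dataset, data };
}

const inputDataset = {
  data: [{ sepalLength: 4 }, { sepalLength: 6 }, { sepalLength: 8 }],
  stats: { fields: { sepalLength: { minval: 4, maxval: 8 } } },
};
const mapped = applyMappingSketch(
  { linear },
  [{ input: 'sepalLength', output: 'x', fn: 'linear', args: [0, 1] }],
  inputDataset
);
console.log(mapped.data); // [ { x: 0 }, { x: 0.5 }, { x: 1 } ]
```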
makeMapFunction from mapParam
mapParam: .fn .args
fn is a Function, or a String key used to look up a Function in functions.
The function should accept: (stats, fieldName, ...args, value)
args is an optional array of parameters that configure your mapping function, e.g. [minval, maxval].
The function is curried and called with (stats, fieldName, ...args); the result is a mapping function that accepts just value and returns the mapped value.
Returns: function
- any => any
Param | Type | Description |
---|---|---|
functions | Object | Named function registry |
stats | Object | |
mapParam | Object | |
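The partial application described above might look like the following sketch. It assumes that fieldName is taken from `mapParam.input`, which the text does not state explicitly; the names `makeMapFunctionSketch` and `clamp` are hypothetical and chosen only for this example.

```javascript
// Hypothetical re-implementation for illustration; not the library's code.
function makeMapFunctionSketch(functions, stats, mapParam) {
  const { input, fn, args = [] } = mapParam;
  // Resolve fn: either a key into the registry or a function.
  const f = typeof fn === 'string' ? functions[fn] : fn;
  if (typeof f !== 'function') {
    throw new Error(`Unknown mapping function: ${String(fn)}`);
  }
  // Bake in (stats, fieldName, ...args); the result takes only value.
  return (value) => f(stats, input, ...args, value);
}

// Example mapping function with the documented signature.
// It ignores stats and fieldName and clamps value to [lo, hi].
function clamp(stats, fieldName, lo, hi, value) {
  return Math.min(hi, Math.max(lo, value));
}

const byName = makeMapFunctionSketch({ clamp }, {}, { input: 'a', fn: 'clamp', args: [0, 10] });
const byFunction = makeMapFunctionSketch({}, {}, { input: 'a', fn: clamp, args: [0, 10] });

console.log(byName(12)); // 10
console.log(byFunction(-3)); // 0
```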
Get a single row as an Object.
As this function is curried you can bake in dataset and fields:
getter = getRow(dataset, null); // returns a function with first two args satisfied
getter(12); // get row 12
Returns: Object
- The object for this row.
Param | Type | Description |
---|---|---|
dataset | Object | |
fields | Array.<string> \| null | Optionally select just the fields you need. null selects all fields. |
Get a single data value (row, column)
As this function is curried you can bake in dataset and field:
getter = getCell(dataset, 'sepalLength');
getter(12); // get value at row 12, field 'sepalLength'
Returns: mixed
- The value for this cell.
Param | Type | Description |
---|---|---|
dataset | Object | |
field | String | key of the field to select |
index | Number | integer index of row |
Get all values for a column
As this function is curried you can bake in dataset:
getter = getColumn(dataset);
getter('sepalLength'); // get the array of values for the sepalLength field
Returns: Array.<mixed>
- Array of values for this field
Param | Type | Description |
---|---|---|
dataset | Object | |
field | String | key of the field to select |
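The curried getters above can be approximated with a small curry helper, as in the sketch below. It assumes that `dataset.data` is an array of row objects (the `[{field: value, ...}]` shape described earlier); the `curry` helper and the `*Sketch` names are hypothetical and are not the library's implementation.

```javascript
// Minimal curry helper: collects arguments until fn's declared arity is met.
function curry(fn) {
  return function curried(...args) {
    return args.length >= fn.length
      ? fn(...args)
      : (...rest) => curried(...args, ...rest);
  };
}

// Hypothetical equivalents of getRow, getCell and getColumn.
const getRowSketch = curry((dataset, fields, index) => {
  const row = dataset.data[index];
  if (fields === null) return { ...row }; // null selects all fields
  return Object.fromEntries(fields.map((f) => [f, row[f]]));
});
const getCellSketch = curry((dataset, field, index) => dataset.data[index][field]);
const getColumnSketch = curry((dataset, field) => dataset.data.map((row) => row[field]));

const sample = {
  data: [
    { sepalLength: 5.1, sepalWidth: 3.5 },
    { sepalLength: 4.9, sepalWidth: 3.0 },
  ],
  fields: ['sepalLength', 'sepalWidth'],
};

const cellGetter = getCellSketch(sample, 'sepalLength'); // dataset and field baked in
console.log(cellGetter(1)); // 4.9
console.log(getColumnSketch(sample)('sepalLength')); // [ 5.1, 4.9 ]
console.log(getRowSketch(sample, ['sepalWidth'])(0)); // { sepalWidth: 3.5 }
```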