kruda.js
Fast data pipeline in the browser.
This is a derivative work of BigDataParser by Dario Segura.
WARNING: This is pre-release software under active development.
Installation
yarn add @uncharted.software/kruda
or
npm install @uncharted.software/kruda
Usage
- www/index.html has a working example you can look at.
- Look at the documentation.
Running the example.
- Run yarn install
- Download airport data (Airports2.csv)
- Convert Airports2.csv to the .ds.bin format by running the command below (change /path/to/Airports2.csv to the location where you stored the Airports2.csv file):
    node ./src/DSBIN/node/generateDSBIN.js /path/to/Airports2.csv ./www/data/flight_routes.ds.bin
- Run yarn start
- In Chrome (must be Chrome for now) navigate to localhost:8090
- The example will allocate a 2GB memory heap
- Load the generated flight routes ds.bin file (~3.6 million rows)
- Apply a filter to it where:
  - The origin airport code equals SEA and
  - The number of passengers equals 110 and
  - The destination airport code is not equal to LAX
  - OR:
  - The origin airport code equals MCO and
  - The number of passengers is more than 180 and
  - The number of passengers is less than 200 and
  - The flight date contains 2001
- On a laptop running a 4th gen 2.5GHz Intel i7 quad-core processor:
  - Allocating 2GB of memory takes ~1242ms
  - Loading the ds.bin file (~38MB, ~500MB uncompressed) takes ~861ms
  - Filtering all ~3.6 million rows with the rules above takes ~115ms
  - Filtering all ~3.6 million rows with a filter that returns every single row takes ~462ms
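The filter rules in the steps above amount to an OR of two AND groups. The sketch below expresses that logic in plain JavaScript to make the semantics explicit. It does not use kruda's API; the field names (Origin_airport, Destination_airport, Passengers, Fly_date), the matches helper, and the sample rows are all assumptions made for illustration.

```javascript
// Plain-JavaScript sketch of the example filter, NOT kruda's API.
// Field names below are assumed for illustration only.
const rules = [
    // Group 1: every test must pass.
    [
        row => row.Origin_airport === 'SEA',
        row => row.Passengers === 110,
        row => row.Destination_airport !== 'LAX',
    ],
    // Group 2: every test must pass.
    [
        row => row.Origin_airport === 'MCO',
        row => row.Passengers > 180,
        row => row.Passengers < 200,
        row => row.Fly_date.includes('2001'),
    ],
];

// A row matches when at least one group passes all of its tests
// (OR across groups, AND within a group).
function matches(row, groups) {
    return groups.some(group => group.every(test => test(row)));
}

// Hypothetical sample rows, not taken from Airports2.csv.
const sample = [
    { Origin_airport: 'SEA', Destination_airport: 'PDX', Passengers: 110, Fly_date: '1999-05-01' },
    { Origin_airport: 'SEA', Destination_airport: 'LAX', Passengers: 110, Fly_date: '1999-05-01' },
    { Origin_airport: 'MCO', Destination_airport: 'ATL', Passengers: 190, Fly_date: '2001-07-01' },
    { Origin_airport: 'MCO', Destination_airport: 'ATL', Passengers: 190, Fly_date: '2003-07-01' },
];

const result = sample.filter(row => matches(row, rules));
console.log(result);
```

A naive per-row predicate like this is only meant to show what the rules mean; it says nothing about how kruda evaluates them internally or why it reaches the timings listed above.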
Debugging
Unfortunately, many error checks must be disabled for the sake of performance; if statements are very expensive!
You can re-enable them by changing the _DEBUG flag in the rollup.config.js file:
    jscc({
        prefixes: '/// ',
        sourcemap: false,
        values: {
            _DEBUG: false, // <<<<<< CHANGE THIS LINE TO `true`
        },
    }),
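As a rough illustration of what the flag controls, the sketch below shows how a check might be guarded with jscc's conditional-compilation directives. It assumes the sources use #if / #endif directives behind the '/// ' prefix configured above; the getRow function is hypothetical and is not part of kruda.

```javascript
// Hypothetical helper, not part of kruda. The '/// #if' and '/// #endif'
// lines are assumed to be jscc directives using the '/// ' prefix.
function getRow(rows, index) {
    /// #if _DEBUG
    // With _DEBUG set to false, jscc is expected to strip this block from
    // the bundle. Run without jscc, the directives are ordinary comments,
    // so the check stays active.
    if (index < 0 || index >= rows.length) {
        throw new RangeError('Row index out of bounds: ' + index);
    }
    /// #endif
    return rows[index];
}

console.log(getRow([10, 20, 30], 1));
```

Because the directives are plain comments to the JavaScript engine, the file runs unmodified during development, and the bounds check only disappears once the build removes it.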