JSONSki_Nodejs is the Node.Js (Javascript) binding port for JSONSki
JSONSki is a streaming JSONPath processor with fast-forward functionality. During the streaming, it can automatically fast-forward over certain JSON substructures that are irrelavent to the query evaluation, without parsing them in detail. To make the fast-forward efficient, JSONSki features a highly bit-parallel solution that intensively utilizes bitwise and SIMD operations that are prevelent on modern CPUs to implement the fast-forward APIs.
You can download the npm package from here - https://www.npmjs.com/package/jsonski
npm i jsonski
const JSki = require('jsonski')
console.log(JSki.JSONSkiParser("$.features[150].actor.login", "datasets/test.json"));
- We interface the following method:
JSki.JSONSkiParser(args1, args2) //args1 - String(query) and args2 - String(file_location)
- CPUs: 64-bit ALU instructions, 256-bit SIMD instruction set, and the carry-less multiplication instruction (pclmulqdq)
- Operating System: Linux, MacOs (Intel Chips only)
- C++ Compiler: g++ (7.4.0 or higher)
Before starting to use Node-API you need to assure you have the following prerequisites:
-
Node.JS (v14 and above) see: Installing Node.js
-
Python (v3.9) see: Installing Python
-
C++ : g++ (v7.4.0 and above) see: Installing C++
JSONPath is the basic query language of JSON data. It refers to substructures of JSON data in a similar way as XPath queries are used for XML data. For the details of JSONPath syntax, please refer to Stefan Goessner's article.
Operator | Description |
---|---|
$ |
root object |
. |
child object |
[] |
child array |
* |
wildcard, all objects or array members |
[index] |
array index |
[start:end] |
array slice operator |
Consider a piece of geo-referenced tweet in JSON
{
"coordinates": [
40.74118764, -73.9998279
],
"user": {
"id": 6253282
},
"place": {
"name": "Manhattan",
"bounding_box": {
"type": "Ploygon",
"pos": [
[-74.026675, 40.683935],
......
]
}
}
}
JsonPath | Result |
---|---|
$.coordinates[*] |
all coordinates |
$.place.name |
place name |
$.place.bounding_box.pos[0] |
first position of the bounding box in place |
$.place.bounding_box.pos[0:2] |
first two positions of the bounding box in place |
Below is an example usage of Jsonski npm package.
const JSki = require('jsonski')
const fs = require('fs');
console.time();
console.log('JsonSki Runtime', JSki.JSONSkiParser("$[*].entities.urls[*].url", "dataset/twitter_sample_large_record.json"));
console.timeEnd();
file_contents = fs.readFileSync('dataset/twitter_sample_large_record.json')
str = file_contents.toString()
console.log("Javascript Runtime")
console.time();
var json = JSON.parse(str);
console.timeEnd();
- Note: The code snippet above benchmarks performance for Javascript parsing VS JSONSki_nodejs parsing.
[1] Lin Jiang and Zhijia Zhao. JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.
@inproceedings{jsonski,
title={JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding},
author={Lin Jiang and Zhijia Zhao},
booktitle={Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
year={2022}
}
Performance of JSONSki_nodejs is compared with simdjson_nodejs and Javascript Parsing - https://github.com/AutomataLab/NPM-JSON-Parser-Benchmarking