JSONH - JSON Homogeneous Collections Compressor

What is JSONH

JSONH is one of the most performant, yet safe, cross-language ways to pack and unpack generic homogeneous collections. Built on top of a native or shimmed JSON implementation, JSONH is nothing more than a procedure performed right before JSON.stringify(data) or right after JSON.parse(data).

It has been demonstrated that JSONH can be up to 3 times faster at compression and up to 2 times faster at parsing, thanks to the smaller and simpler nature of the collection/string.

It has also been demonstrated that the resulting payload is smaller than the output of the equivalent JSON operation, reaching in certain cases as little as 30% of the original size, even without gzip/deflate compression in place.

JSONH is the latest version of the json.hpack project and is based on the JSONDB concept.

New in version 0.0.2 ( JS only )

  • added an experimental, optional schema argument at the end of all methods in order to automatically parse one or more nested homogeneous collections
  • covered pack/unpack via unit tests, with or without the use of a schema

What is a Homogeneous Collection

Usually a database result set, stored as a list of objects that all contain the same number of keys with identical names. This is a basic homogeneous collection example: [{"a":"A","b":"B"},{"a":"C","b":"D"},{"a":"E","b":"F"}] We have all exchanged one or more homogeneous collections over the network at least once. JSONH is able to pack the example into [2,"a","b","A","B","C","D","E","F"] and unpack it into the original collection at light speed.
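The packed layout is: the number of keys, then the key names, then every row's values in key order. A minimal sketch of that idea follows; the `pack` and `unpack` functions here are hypothetical illustrations written for this example, not the library's actual source, and they assume every object has the same keys as the first one.

```javascript
// Hypothetical helpers illustrating the packed layout; not the library source.
function pack(list) {
  if (list.length === 0) return [0];
  const keys = Object.keys(list[0]);
  // Header: key count followed by the key names.
  const packed = [keys.length, ...keys];
  // Body: the values of each row, always in the same key order.
  for (const row of list) {
    for (const key of keys) packed.push(row[key]);
  }
  return packed;
}

function unpack(packed) {
  const width = packed[0];
  const list = [];
  if (width === 0) return list; // nothing to rebuild
  const keys = packed.slice(1, width + 1);
  // Walk the body in chunks of `width` values, one chunk per object.
  for (let i = width + 1; i < packed.length; i += width) {
    const row = {};
    keys.forEach((key, k) => { row[key] = packed[i + k]; });
    list.push(row);
  }
  return list;
}

const collection = [{ a: "A", b: "B" }, { a: "C", b: "D" }, { a: "E", b: "F" }];
const packed = pack(collection);
console.log(JSON.stringify(packed));
// [2,"a","b","A","B","C","D","E","F"]
console.log(JSON.stringify(unpack(packed)));
// [{"a":"A","b":"B"},{"a":"C","b":"D"},{"a":"E","b":"F"}]
```

The key names are written once instead of once per object, which is where the saving comes from.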

JSONH is suitable for

  • runtime data compression with or without gzip/deflate on both client and server side
  • creation of static JavaScript files to serve, in order to save space on the hard drive and possibly make runtime gzip/deflate compression easier (smaller input)
  • sending huge collections of data from the client to the server, improving both performance over JSON.stringify(data) and required network bandwidth
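To get a feel for the bandwidth saving on your own data, you can compare serialized lengths. This is a rough sketch: `packRows` is a hypothetical stand-in for the library's pack step, and the sample rows are made up for illustration, so the ratio it prints says nothing about real workloads.

```javascript
// Hypothetical stand-in for the pack step: [keyCount, ...keys, ...values].
function packRows(rows) {
  const keys = rows.length ? Object.keys(rows[0]) : [];
  const packed = [keys.length, ...keys];
  for (const row of rows) {
    for (const key of keys) packed.push(row[key]);
  }
  return packed;
}

// Made-up sample: 1000 rows sharing the same three keys.
const rows = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  name: "user" + i,
  active: i % 2 === 0,
}));

// Compare the length of the two JSON strings (no gzip/deflate involved).
const plainSize = JSON.stringify(rows).length;
const packedSize = JSON.stringify(packRows(rows)).length;

console.log("plain JSON chars: ", plainSize);
console.log("packed JSON chars:", packedSize);
console.log("packed/plain ratio:", (packedSize / plainSize).toFixed(2));
```

The longer the key names are relative to the values, the larger the saving, because each key name is emitted only once.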

If the generic object/data contains one or more homogeneous collections, JSONH is suitable for these cases too via pack and unpack operations. Please read the related post to learn more.
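As a sketch of that case, the nested collection can be packed right before JSON.stringify and unpacked right after JSON.parse. The `pack` and `unpack` functions below are hypothetical minimal stand-ins for JSONH.pack and JSONH.unpack, and the `response` shape is made up for the example.

```javascript
// Hypothetical minimal stand-ins for JSONH.pack / JSONH.unpack.
const pack = (list) => {
  const keys = list.length ? Object.keys(list[0]) : [];
  return [keys.length, ...keys, ...list.flatMap((row) => keys.map((k) => row[k]))];
};
const unpack = (packed) => {
  const [width, ...rest] = packed;
  const keys = rest.slice(0, width);
  const list = [];
  // Values start right after the key names; one object per `width` values.
  for (let i = width; width > 0 && i < rest.length; i += width) {
    list.push(Object.fromEntries(keys.map((key, k) => [key, rest[i + k]])));
  }
  return list;
};

// A generic object (made-up shape) holding one nested homogeneous collection.
const response = {
  page: 1,
  total: 2,
  users: [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }],
};

// Sender: pack only the nested collection, right before JSON.stringify.
const wire = JSON.stringify({ ...response, users: pack(response.users) });
console.log(wire);
// {"page":1,"total":2,"users":[2,"id","name",1,"Ann",2,"Bob"]}

// Receiver: unpack the same property, right after JSON.parse.
const received = JSON.parse(wire);
received.users = unpack(received.users);
console.log(JSON.stringify(received) === JSON.stringify(response)); // true
```

Both sides have to agree on which properties are packed; the non-collection fields travel as ordinary JSON.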

JSONH API

Every implementation follows the code style of its programming language, and every method supports the original JSON signature. For example, the JavaScript version is a global JSONH object with stringify, parse, pack, and unpack methods.

The Python version is a module similar to the json one, currently with these methods: dump, dumps, load, loads, pack, and unpack.

import jsonh

print(jsonh.dumps(
    [{"a": "A", "b": "B"}, {"a": "C", "b": "D"}, {"a": "E", "b": "F"}],
    separators=(',', ':')
))

The PHP 5 version is a static class plus some functions, so that developers can pick their favorite style. Extra arguments accepted by json_encode and json_decode are supported as well.

require_once('JSONH.class.php');

// classic style
jsonh_encode($object); // jsonh_decode($str)

// static public style
JSONH::stringify($object); // JSONH::parse($str);

// singleton style
JSONH()->stringify($object); // JSONH()->parse($str)

TODO

  • clean up the local tests and use a standard suite able to cover all aspects of each implementation
  • C# version and, hopefully with other developers' help, other languages too
  • simplified yet cross-platform way to map hybrid objects, specifying via whitelist one or more nested properties to pack on stringify and unpack on parse (automated and targeted compression for complex objects)

JavaScript And Native JSON Escape Problems

As @garethheyes pointed out in this post, native JSON.stringify(data) may produce invalid JavaScript. Since JSONH neither aims to change native JSON behavior nor is a replacement for JSON, all I can suggest is to perform this replacement when and if data could be corrupted:

JSONH.stringify(data).replace(
    /\u2028|\u2029/g,
    function (m) {
        return "\\u202" + (m === "\u2028" ? "8" : "9");
    })

This ensures those characters are properly escaped, and performance will still be better thanks to the reduced size of the output string (compared with the equivalent operation performed on the result of JSON.stringify(data)).
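To check that the replacement behaves as intended, here is a small sketch that applies it to plain JSON.stringify output, so it runs without the library. The helper name `escapeLineSeparators` is an assumption made for this example.

```javascript
// Wraps the replacement shown above in a helper (hypothetical name).
function escapeLineSeparators(json) {
  return json.replace(/\u2028|\u2029/g, function (m) {
    return "\\u202" + (m === "\u2028" ? "8" : "9");
  });
}

// JSON.stringify leaves U+2028 and U+2029 unescaped in its output.
const raw = JSON.stringify(["line\u2028separator", "paragraph\u2029separator"]);
const safe = escapeLineSeparators(raw);

console.log(/[\u2028\u2029]/.test(raw));  // true: raw characters are present
console.log(/[\u2028\u2029]/.test(safe)); // false: replaced by escape sequences

// The escaped string still parses back to the original values.
console.log(JSON.parse(safe)[0] === "line\u2028separator"); // true
```

Since the escape sequences are valid JSON, the receiving side needs no matching step before JSON.parse.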