/dict-tempering

Shuffles dictionary entries for better GZIPpability

Primary language: JavaScript

Dict Tempering

Changes the order of properties for better GZIPpability. Works with JSON/JSON5 objects (dicts) and arrays, and with arbitrary newline-separated strings.

Usage

dict-tempering is offered as a Node.js CLI program that you can run with npx or install locally.

It reads its input from stdin and writes the result to stdout.

The --type parameter specifies the input/output format. It can handle JSON and JSON5 objects (dictionaries) and arrays, as well as any newline-separated text:

<some-unordered-array.json npx dict-tempering --type=json >reshuffled-array.json
<some-object.json5 npx dict-tempering >reshuffled-object.json5
<newline-separated.txt npx dict-tempering --type=newline >reshuffled.txt

It is also available as a somewhat easy-to-use library.

Why?

Quite often there are dictionaries or sets where the order of properties is not important. For example, color names:

{
    aqua: '#0ff',
    black: '#000',
    blue: '#00f',
    fuchsia: '#f0f',
    gray: '#808080',
    green: '#008000',
    lime: '#0f0',
    maroon: '#800000',
    navy: '#000080',
    olive: '#808000',
    purple: '#800080',
    red: '#f00',
    silver: '#c0c0c0',
    teal: '#008080',
    white: '#fff',
    yellow: '#ff0'
}

Ultimately, this code will be minified and served GZIPped: 154 bytes as .min.gz.

gzthermal evaluation result

You know the order is not important, but the GZIP/Brotli compressor doesn't know that. Let's help it and change the order of properties. Now it's 145 bytes as .min.gz.

gzthermal evaluation result

That's a 9-byte (about 6%) size reduction out of thin air. It's not much, but it's free!

dict-tempering does just that: It shuffles the movable chunks around so the entire string is compressed better.
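To see the effect of ordering for yourself, here is a minimal sketch that measures the GZIPped size of the same dictionary serialized in two different key orders. It uses only Node's built-in zlib module; the helper names gzipSize and serializeInOrder are made up for this illustration and are not part of dict-tempering. The exact byte counts depend on the zlib version and settings, so none are quoted here.

```javascript
const zlib = require('zlib');

// Size in bytes of a string after GZIP compression at the maximum level.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, 'utf8'), { level: 9 }).length;
}

// Serialize a dictionary as minified JSON, emitting keys in the given order.
// (Hypothetical helper for this example only.)
function serializeInOrder(dict, keys) {
  const body = keys
    .map((key) => `${JSON.stringify(key)}:${JSON.stringify(dict[key])}`)
    .join(',');
  return `{${body}}`;
}

const colors = {
  aqua: '#0ff',
  black: '#000',
  blue: '#00f',
  fuchsia: '#f0f',
  gray: '#808080',
  green: '#008000',
};

// Two orderings of the same data: alphabetical and reversed.
const alphabetical = Object.keys(colors).sort();
const reversed = [...alphabetical].reverse();

const sizeAlphabetical = gzipSize(serializeInOrder(colors, alphabetical));
const sizeReversed = gzipSize(serializeInOrder(colors, reversed));

// Same content, possibly different compressed sizes.
console.log({ sizeAlphabetical, sizeReversed });
```

Both strings parse back to the same object, so any size difference comes purely from the order of the entries.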

Is it worth it?

dict-tempering is most certainly not a good fit for bundling (e.g. as a Webpack plugin), unless you want to make your compilation or bundling process a few minutes longer and save a dozen bytes in exchange.

But it would help you in JS code golf where every byte counts.

Or you can reshuffle properties in the source code once and get a size reduction forever. In fact, this entire project started as a makeshift script for this commit.

Is it fast?

No. It's very slow.

In fact, its time complexity is somewhere near O(n**3), where n is the number of chunks. But brute force would take factorial O(n!) time, which is much, much longer!

Does this tool produce the best shuffle possible?

No. Maybe. I'm not sure.

But it's good enough.

I need the same but for CBOR, BSON, [insert your own], ...

The code is written with extensibility in mind. All you need to do is create a subclass and implement the _split() and _assemble() methods to fit your needs.

Please refer to the source code and TypeScript typings.
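As a rough illustration of the subclassing idea, here is a sketch for a made-up format of semicolon-separated records. Everything except the method names _split() and _assemble() is an assumption: BaseTemperer is a hypothetical stand-in defined right here (the library's real base class, its name, and the exact method signatures may differ), and the stand-in simply sorts the chunks instead of searching for the best-compressing order.

```javascript
// HYPOTHETICAL stand-in for the library's base class. The real class name,
// constructor, and method signatures may differ; check the source and typings.
class BaseTemperer {
  process(input) {
    const chunks = this._split(input);
    // The real tool would search for an order that compresses better.
    // This stand-in just sorts, so the example stays deterministic.
    const reordered = [...chunks].sort();
    return this._assemble(reordered);
  }
}

// Example format (invented for this sketch): "key=value" records joined by ";".
class SemicolonTemperer extends BaseTemperer {
  // Break the input into independently movable chunks.
  _split(input) {
    return input.split(';');
  }

  // Put the reordered chunks back together into a valid document.
  _assemble(chunks) {
    return chunks.join(';');
  }
}

const result = new SemicolonTemperer().process('b=2;a=1;c=3');
console.log(result); // a=1;b=2;c=3
```

The point of the split/assemble pair is that the reordering logic never needs to know the format: it only sees an array of movable chunks.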

Why the name?

🧑‍💻

Dictionary (data structure) Redirect page → Associative array

From Wikipedia, the free encyclopedia

🧑‍🏭

Tempering is a heat treatment technique applied to ferrous alloys <...> for an increase in elasticity and plasticity. Tempering may also be used on welded steel, to relieve some of the stresses and excess hardness

From Wikipedia, the free encyclopedia