squeezer

Seamless support for compressed files.

Usage

Creating a compressed file is dead simple:

(require '[squeezer.core :as sc])
(sc/spit-compr "test.txt.gz" "test 1\ntest 2\ntest 3")

You can examine the file in your favourite shell to see that it works.

> file test.txt.gz
test.txt.gz: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT)

You can just as easily slurp the file back:

(require '[squeezer.core :as sc])

(sc/slurp-compr "test.txt.gz")

;"test 1\ntest 2\ntest 3"

The compression algorithm is chosen based on the file extension. Supported extensions are .gz for gzip, .bz2 for bzip2, and .xz for xz. You can override this behaviour and force a specific algorithm with the keyword :compr.

; Do not do that!!

(sc/spit-compr "test.txt.gz" "test 1\ntest 2\ntest 3" :compr "bzip2")

If you do not believe that this works, ask your favourite shell:

> file test.txt.gz
test.txt.gz: bzip2 compressed data, block size = 900k

Now reading is a pain:

(sc/slurp-compr "test.txt.gz")
; ZipException Not in GZIP format

unless you know the trick:

(sc/slurp-compr "test.txt.gz" :compr "bzip2")
; "test 1\ntest 2\ntest 3"

To make sure the correct compression algorithm is selected regardless of the extension, you can use the MIME type of the file (detected, e.g., by the pantomime library).
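The MIME-type idea can be sketched as a small lookup helper. This is only an illustration, not part of squeezer: `mime->compr` and `compr-for-mime` are hypothetical names, and of the returned values only "bzip2" appears in this README; "gzip" and "xz" are assumed and should be checked against what :compr actually accepts.

```clojure
(require '[clojure.string :as str])

;; Hypothetical lookup table: MIME type -> value for squeezer's :compr option.
;; Only "bzip2" is confirmed by this README; "gzip" and "xz" are ASSUMED names.
(def mime->compr
  {"application/gzip"    "gzip"
   "application/x-gzip"  "gzip"
   "application/x-bzip2" "bzip2"
   "application/x-xz"    "xz"})

(defn compr-for-mime
  "Returns the (assumed) :compr value for a MIME type string,
  or nil when the type is not a known compressed format."
  [mime-type]
  (get mime->compr (str/lower-case (str mime-type))))
```

With a detector such as pantomime supplying the MIME type string for a file, the result of `compr-for-mime` could then be passed along as :compr to sc/slurp-compr, falling back to extension-based detection when it returns nil.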

FAQ

How do I lazily read a compressed CSV file, record by record?

It is easy.

Add clojure-csv and squeezer to your project.clj.

To read the first five records of your big_data.csv.gz, type in your REPL:

(require '[squeezer.core :as sc] '[clojure-csv.core :as csv])

(->> "big_data.csv.gz"
     sc/reader-compr
     csv/parse-csv
     (take 5))

License

Distributed under the Eclipse Public License, the same as Clojure.
