Seamless support for compressed files.
Creating a compressed file is dead simple:
(require '[squeezer.core :as sc])
(sc/spit-compr "test.txt.gz" "test 1\ntest 2\ntest 3")
You can examine the file in your favourite shell to see that it works.
> file test.txt.gz
test.txt.gz: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT)
You can just as easily slurp the file back:
(require '[squeezer.core :as sc])
(sc/slurp-compr "test.txt.gz")
;"test 1\ntest 2\ntest 3"
The compression algorithm is chosen from the file extension. The supported extensions are .gz for gzip, .bz2 for bzip2, and .xz for xz. You can override this behaviour by forcing an algorithm with the :compr keyword.
; Do not do that!!
(sc/spit-compr "test.txt.gz" "test 1\ntest 2\ntest 3" :compr "bzip2")
If you do not believe that this works, ask your favourite shell:
> file test.txt.gz
test.txt.gz: bzip2 compressed data, block size = 900k
Now reading is a pain:
(sc/slurp-compr "test.txt.gz")
; ZipException Not in GZIP format
unless you know the trick:
(sc/slurp-compr "test.txt.gz" :compr "bzip2")
; "test 1\ntest 2\ntest 3"
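The extension rule described earlier can be pictured as a small lookup. This is only a sketch of the idea, not squeezer's actual implementation: the names ext->compr and guess-compr are hypothetical, and the algorithm names "gzip" and "xz" are assumed by analogy with the "bzip2" value used above.

```clojure
(require '[clojure.string :as str])

;; Hypothetical lookup table; squeezer's real internals may differ.
;; Only "bzip2" is confirmed by the examples above; "gzip" and "xz" are assumed.
(def ext->compr
  {".gz"  "gzip"
   ".bz2" "bzip2"
   ".xz"  "xz"})

(defn guess-compr
  "Returns the algorithm name implied by the file extension, or nil
  when the extension is not one of the supported ones."
  [filename]
  (let [lower (str/lower-case filename)]
    (some (fn [[ext compr]]
            (when (str/ends-with? lower ext) compr))
          ext->compr)))

(guess-compr "test.txt.gz")  ; => "gzip"
(guess-compr "test.txt")     ; => nil
```

A file written with a forced :compr no longer matches this lookup, which is why the plain slurp-compr call above fails.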
To make sure the right compression algorithm is chosen, you can use the MIME type of the file (detected, e.g., by the pantomime library).
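A minimal sketch of that idea follows, assuming pantomime is on the classpath. The helper slurp-by-mime and the table mime->compr are hypothetical names, the MIME strings are the ones Apache Tika (which pantomime wraps) typically reports, and the "gzip" and "xz" values for :compr are assumptions; only "bzip2" appears in the examples above.

```clojure
(require '[squeezer.core :as sc]
         '[pantomime.mime :as mime]
         '[clojure.java.io :as io])

;; Assumed mapping from detected MIME type to the value passed as :compr.
(def mime->compr
  {"application/gzip"    "gzip"
   "application/x-gzip"  "gzip"
   "application/x-bzip2" "bzip2"
   "application/x-xz"    "xz"})

(defn slurp-by-mime
  "Hypothetical helper: reads a compressed file, picking the algorithm
  from the detected MIME type rather than from the extension."
  [path]
  (if-let [compr (mime->compr (mime/mime-type-of (io/file path)))]
    (sc/slurp-compr path :compr compr)
    (throw (ex-info "Unrecognised compression format" {:path path}))))
```

With such a helper, the misnamed test.txt.gz from the example above should be readable without remembering which algorithm was forced when it was written.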
Reading compressed CSV files is easy. Add clojure-csv and squeezer to your project.clj. To read the first five lines of your big_data.csv.gz, type in your REPL:
; clojure-csv exposes its functions in the clojure-csv.core namespace
(require '[squeezer.core :as sc] '[clojure-csv.core :as csv])
(->> "big_data.csv.gz"
     sc/reader-compr
     csv/parse-csv
     (take 5))
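Outside the REPL you may want the underlying file closed once the rows have been read. The following is a sketch under the assumption that sc/reader-compr returns an ordinary closeable java.io.Reader; the helper name first-rows is hypothetical.

```clojure
(require '[squeezer.core :as sc]
         '[clojure-csv.core :as csv])

(defn first-rows
  "Hypothetical helper: returns the first n parsed rows of a compressed CSV file."
  [path n]
  ;; Assumes the value returned by sc/reader-compr can be closed by with-open.
  (with-open [rdr (sc/reader-compr path)]
    ;; parse-csv is lazy, so force the rows before with-open closes the reader.
    (doall (take n (csv/parse-csv rdr)))))

(first-rows "big_data.csv.gz" 5)
```

The doall matters: without it the lazy sequence would be realised only after the reader is already closed.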
Distributed under the Eclipse Public License, the same as Clojure.