Splits a given file into several files, for use as input to batch data extraction (ETL).

split-batch-input-files

This library splits a file into several smaller files so that an ETL mechanism (such as Spring Batch) can read the data in parallel, improving overall performance.

It is intended for use in JVM-based applications.

It is written in Clojure, a functional language.

Todo list:

  • add function to remove generated files
  • improve unit tests using some cool lib
  • add unit tests
  • add file to Leiningen project, in order to generate a lib jar

Clojure style: cheat sheet being adopted:

CheatSheet URL

Installation

$ lein uberjar

Usage

Just call the lib function:

split-file [file pieces]

Your file will be split into pieces.

Examples

(split-file "test-file.csv" 3)
  • Assume test-file.csv has five lines
  • The call generates:
    • test-file.csv.0: containing 2 lines
    • test-file.csv.1: containing 2 lines
    • test-file.csv.2: containing the remaining fifth line
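
The 2/2/1 distribution above is consistent with a chunk size of the line count divided by the number of pieces, rounded up. The following is a minimal sketch of that idea, not the library's actual implementation: the names chunk-size, chunk-lines and split-file-sketch are hypothetical, and reading the whole file into memory is an assumption made for brevity.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Hypothetical helper: lines per output file, rounded up (at least 1).
(defn chunk-size
  [total-lines pieces]
  (max 1 (quot (+ total-lines (dec pieces)) pieces)))

;; Hypothetical helper: group lines into consecutive chunks;
;; the last chunk holds whatever remains.
(defn chunk-lines
  [lines pieces]
  (partition-all (chunk-size (count lines) pieces) lines))

;; Hypothetical stand-in for split-file: writes each chunk to
;; <file>.<index> and returns the generated file names.
(defn split-file-sketch
  [file pieces]
  (let [lines (with-open [rdr (io/reader file)]
                (vec (line-seq rdr)))] ; realize lines before the reader closes
    (vec
      (map-indexed
        (fn [idx chunk]
          (let [out (str file "." idx)]
            (spit out (str (str/join "\n" chunk) "\n"))
            out))
        (chunk-lines lines pieces)))))
```

With a five-line input and 3 pieces, this sketch writes two lines to each of the first two files and one line to the third. Note that rounding up can produce fewer files than requested (for example, four lines and 3 pieces yield only two files), so the library itself may distribute lines differently.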

License

Copyright © 2019 Daniel Medeiros

Distributed under the MIT License.