checksum-spark

checksum-spark verifies and creates checksum files using Apache Spark.

Design Goals

  • Keep the CLI as close as possible to GNU coreutils (e.g. md5sum).
  • Make it easy to use with HDFS or any other storage supported by Apache Spark.
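
For reference, the GNU coreutils workflow that the CLI is modelled on looks roughly like the sketch below. It uses plain md5sum on local files only. The file names and scratch directory are made up for illustration, and the commands are not checksum-spark invocations, so consult the project's own help output for its actual options.

```shell
# Work in a scratch directory so nothing else is touched (hypothetical file names).
workdir="$(mktemp -d)"
cd "$workdir"
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

# Create a checksum file: one "<digest>  <path>" line per input file.
md5sum a.txt b.txt > sums.md5

# Verify the files against the checksum file.
# Prints "<path>: OK" per file and exits non-zero if any digest does not match.
md5sum -c sums.md5
```

The checksum file is plain text, which is why the same create-then-verify pattern carries over naturally to files kept on HDFS or other distributed storage.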

Download

Releases

Check the releases page for the latest version.

Via Maven repository

checksum-spark is published to JitPack. You can fetch it with Maven:

mvn org.apache.maven.plugins:maven-dependency-plugin:2.1:get -DrepoUrl=https://jitpack.io -Dartifact=io.mola:checksum-spark_2.11:v0.2.0

Or with Coursier:

coursier fetch --repository https://jitpack.io io.mola:checksum-spark_2.11:v0.2.0,classifier=assembly

License

Copyright © 2018 Santiago M. Mola

This project is released under the terms of the Apache License 2.0.