samefile

License: Apache License 2.0

Collects files that have the same content and outputs the paths of each group of identical files on one line, separated by tabs.
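
The README does not say how samefile compares files. As a rough illustration of the output format only, the sketch below groups files by SHA-256 checksum using standard tools. The function name `same_content_groups` is hypothetical (it is not part of samefile), and the sketch assumes GNU coreutils `sha256sum` and paths without tabs, newlines, or backslashes.

```shell
# Hypothetical helper, not part of samefile: print each group of files with
# identical content on one line, paths separated by tabs.
same_content_groups() {
    # Default to the current directory when no path is given.
    if [ "$#" -eq 0 ]; then set -- .; fi
    find "$@" -type f -exec sha256sum {} + |
        LC_ALL=C sort |
        awk '{
            hash = $1
            path = substr($0, 67)   # skip 64 hex digits plus 2 separator chars
            if (hash == prev) {
                line = line "\t" path
                count++
            } else {
                if (count > 1) print line   # only groups with 2+ files
                line = path; count = 1; prev = hash
            }
        }
        END { if (count > 1) print line }'
}
```

For example, `same_content_groups ~/Pictures` would print one line per set of identical files and nothing for files that have no duplicate.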

Build with make

Build

make

Run

./bin/samefile [PATH...]
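
Because each output line is one group of identical files separated by tabs, the result is easy to post-process. A minimal sketch, assuming exactly that output format; `summarize_groups` is a hypothetical helper, not part of samefile.

```shell
# Hypothetical helper: read samefile-style output (one tab-separated group
# per line) from stdin and report how many files are redundant copies.
summarize_groups() {
    awk -F '\t' '
        NF > 1 { groups++; extra += NF - 1 }   # every file after the first is redundant
        END { printf "%d groups, %d redundant files\n", groups, extra }'
}
```

It could be used as `./bin/samefile some/dir | summarize_groups`, where `some/dir` is a placeholder path.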

Build with dune

Build

dune build

Run

dune exec ./bin/samefile

Build with Docker

  1. Pull the OCaml Docker image

    docker pull ocaml/opam
  2. Run a container with the current directory mounted at /home/opam/work

    docker run -v `pwd`:/home/opam/work -it ocaml/opam bash
  3. Inside the container, run make

    cd work
    make

TODO

  • summarize directories that contain the same files
    • e.g. directory A may be a copy of B because A and B have many files in common
  • MapReduce?
    • group files by size and compare only files whose sizes are the same
    • distributed computing
      • OCaml on Hadoop?
  • output actions to remove duplicates
    • generate links
    • remove files
    • check dates
  • detect similar files
    • distance
    • included
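
The size-grouping item in the list above can be sketched with standard tools: a file whose size is unique cannot have a duplicate, so only files that share a size need to be read and hashed. This is an illustration under assumptions, not the planned implementation. `same_content_groups_by_size` is a hypothetical name, and the sketch assumes GNU `find` (`-printf`), GNU `xargs` (`-r`, `-d`), `sha256sum`, and paths without tabs, newlines, or backslashes.

```shell
# Hypothetical sketch: pre-filter by size, then hash only the candidates.
same_content_groups_by_size() {
    if [ "$#" -eq 0 ]; then set -- .; fi
    # 1. List "size<TAB>path" and sort so that equal sizes are adjacent.
    find "$@" -type f -printf '%s\t%p\n' |
        LC_ALL=C sort -n |
        # 2. Keep only paths whose size is shared by at least two files.
        awk -F '\t' '
            $1 != prev { if (n > 1) printf "%s", buf; buf = ""; n = 0; prev = $1 }
            { buf = buf $2 "\n"; n++ }
            END { if (n > 1) printf "%s", buf }' |
        # 3. Hash the remaining candidates and group equal checksums.
        xargs -r -d '\n' sha256sum |
        LC_ALL=C sort |
        awk '{
            hash = $1
            path = substr($0, 67)   # skip 64 hex digits plus 2 separator chars
            if (hash == prev) { line = line "\t" path; count++ }
            else {
                if (count > 1) print line
                line = path; count = 1; prev = hash
            }
        }
        END { if (count > 1) print line }'
}
```

The pre-filter only saves work when many files have unique sizes; files of equal size are still read in full to compute their checksums.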

Reference


Takashi Masuyama <mamewotoko@gmail.com>
http://mamewo.ddo.jp/