Collect files that have the same content and output each group's file paths on one line, separated by tabs.
make
./bin/samefile [PATH...]
dune build
dune exec ./bin/samefile
- Pull the OCaml Docker container image:
docker pull ocaml/opam
- Run the Docker container:
docker run -v `pwd`:/home/opam/work -it ocaml/opam bash
- Inside the container, run make:
cd work
make
- summarize directories that contain the same files
  - e.g. directory A may be a copy of B because A and B have many identical files
  - map reduce?
- group files by size and compare only files whose sizes are equal
- distributed computing
  - OCaml on Hadoop?
- output actions to remove duplicates
  - generate link
  - remove file
  - check date
- detect similar files
  - distance
  - inclusion
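
The "group files by size" item above could look roughly like the sketch below. This is not the code in samefile: all function names (file_size, group_by, multi, same_files, print_groups) are hypothetical, and MD5 via the standard Digest module is an assumed stand-in for the content comparison. It expects paths to regular files and uses only the OCaml standard library (4.10 or later for List.concat_map).

```ocaml
(* Sketch only: names are hypothetical, not taken from samefile. *)

(* Size of a file in bytes, using only the standard library. *)
let file_size path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> in_channel_length ic)

(* Group items by a key. Items keep their input order inside a group;
   the order of the groups themselves is unspecified. *)
let group_by key items =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun x ->
      let k = key x in
      let prev = Option.value (Hashtbl.find_opt tbl k) ~default:[] in
      Hashtbl.replace tbl k (x :: prev))
    items;
  Hashtbl.fold (fun _ group acc -> List.rev group :: acc) tbl []

(* Keep only groups with at least two members. *)
let multi groups = List.filter (fun g -> List.length g > 1) groups

(* A file with a unique size cannot have a duplicate, so only files that
   share a size are read and digested (assumption: MD5 equality is
   treated as "same content"). *)
let same_files paths =
  group_by file_size paths
  |> multi
  |> List.concat_map (fun g -> multi (group_by Digest.file g))

(* Print each group of identical files on one line, tab-separated. *)
let print_groups groups =
  List.iter (fun g -> print_endline (String.concat "\t" g)) groups

(* Treat every command-line argument as a file path. *)
let () = print_groups (same_files (List.tl (Array.to_list Sys.argv)))
```

Grouping by size first means most files are never read at all. Since two different files can in principle share a digest, a byte-by-byte comparison within each digest group would be a reasonable extra check before any remove or link action.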
Takashi Masuyama <mamewotoko@gmail.com>
http://mamewo.ddo.jp/