NAIST-SE/CodeHash

DirectComparisonMain should support a threshold for a particular similarity metric

The current -th option excludes a file pair only if all similarity metrics are less than the specified threshold.
This mechanism reports too many file pairs (in particular, pairs of a tiny file and a large file) because their overlap coefficient often becomes high.
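
To illustrate the problem, here is a minimal Python sketch. It is not CodeHash's actual code: the set-based metrics and the function names (`keep_with_global_threshold`, `keep_with_metric_threshold`) are hypothetical stand-ins, and the files are modeled as plain sets of tokens. It only shows why a tiny file contained in a large file passes the current filter, and how a threshold tied to one metric would reject it.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def metrics(a: set, b: set) -> dict:
    return {"jaccard": jaccard(a, b), "overlap": overlap_coefficient(a, b)}


def keep_with_global_threshold(m: dict, threshold: float) -> bool:
    # Behavior described in this issue: a pair is excluded only if ALL
    # metrics are below the threshold, so one high metric keeps the pair.
    return any(value >= threshold for value in m.values())


def keep_with_metric_threshold(m: dict, name: str, threshold: float) -> bool:
    # Proposed behavior (hypothetical): filter on one chosen metric.
    return m[name] >= threshold


# A tiny file whose 5 tokens all appear in a large file of 100 tokens.
tiny = set(range(5))
large = set(range(100))
m = metrics(tiny, large)

print(m)
print(keep_with_global_threshold(m, 0.7))             # kept by the current rule
print(keep_with_metric_threshold(m, "jaccard", 0.7))  # rejected by the proposed rule
```

In this example the overlap coefficient is 5/5 = 1.0 while the Jaccard index is only 5/100 = 0.05, so the pair is reported under the current rule even though the files are hardly similar overall.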