ChenghaoMou/text-dedup

Papers, Datasets that use this repo

chris-ha458 opened this issue · 3 comments

The paper for CulturaX, a dataset that uses text-dedup, was recently released.

I am sure there are many more, including The Stack.

Would it be valuable to mention these kinds of use cases in the README.md?
I'll prepare a PR if there is interest.

Good point. I should add citations as well.

@ChenghaoMou, maybe Zenodo would be an option for you, so that researchers can cite you in their work using a DOI.

Citations and https://github.com/ChenghaoMou/awesome-data-deduplication have been created. Thanks for all the comments.