huggingface/dataset-viewer

Presidio scan

lhoestq opened this issue · 1 comment

A Presidio scan can help detect PII and warn dataset authors and users.

I'm working on something similar to the opt-out URLs scan, using the default Presidio config.

Note that it produces false positives, so we might have to adapt the messages shown on the Hub (or add some sort of additional filtering).
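
To make the "additional filtering" idea concrete, here is a minimal sketch of one possible post-processing step, not something the scan is known to do. The `Detection` class is a hypothetical stand-in that mirrors the fields Presidio reports for each finding (`entity_type`, `start`, `end`, `score`). The `0.6` score threshold and the choice to ignore `DATE_TIME` are assumptions chosen for illustration, as are the helper names `filter_detections` and `summarize`.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for one PII finding (same fields Presidio reports)."""
    entity_type: str
    start: int
    end: int
    score: float


def filter_detections(detections, min_score=0.6, ignored_types=frozenset({"DATE_TIME"})):
    """Keep findings at or above min_score whose type is not ignored.

    Both the threshold and the ignored types are illustrative assumptions.
    """
    return [
        d for d in detections
        if d.score >= min_score and d.entity_type not in ignored_types
    ]


def summarize(detections):
    """Count remaining findings per entity type, e.g. to build a warning message."""
    return dict(Counter(d.entity_type for d in detections))


# Made-up findings, only to exercise the filter.
raw = [
    Detection("EMAIL_ADDRESS", 10, 30, 1.0),
    Detection("PERSON", 0, 5, 0.85),
    Detection("DATE_TIME", 40, 50, 0.85),
    Detection("PHONE_NUMBER", 60, 70, 0.4),
]
kept = filter_detections(raw)
print(summarize(kept))
```

A per-type count like this could feed a softer Hub message ("may contain email addresses") instead of a hard claim, which seems appropriate given the false positives.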