arx-deidentifier/arx

[INFO REQUEST] Distinction and Separation

umma08 opened this issue · 4 comments

umma08 commented

Can you please point me to documentation that explains what the Distinction and Separation scores represent when calculated by the ARX risk assessment tool? I have tried to find it, but it is not clear from the documentation.

umma08 commented

Please take a look at this paper: https://redirect.cs.umbc.edu/~kunliu1/p3dm08/proceedings/2.pdf
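For a quick intuition, here is a minimal sketch in Python. It assumes the commonly used definitions: distinction is the fraction of distinct value combinations among all records, and separation is the fraction of record pairs that differ on at least one of the selected attributes. The function names and the toy records are made up for illustration, and this is not ARX's actual implementation.

```python
from itertools import combinations

def distinction(rows, attrs):
    """Fraction of distinct value combinations over `attrs` among all rows."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) / len(keys)

def separation(rows, attrs):
    """Fraction of record pairs that differ on at least one attribute in `attrs`.

    Needs at least two rows, otherwise there are no pairs to compare.
    """
    keys = [tuple(row[a] for a in attrs) for row in rows]
    pairs = list(combinations(keys, 2))
    separated = sum(1 for a, b in pairs if a != b)
    return separated / len(pairs)

# Hypothetical toy data, not taken from any real dataset.
rows = [
    {"age": 34, "zip": "81667", "sex": "F"},
    {"age": 34, "zip": "81667", "sex": "M"},
    {"age": 45, "zip": "81675", "sex": "F"},
    {"age": 45, "zip": "81675", "sex": "F"},
]

print(distinction(rows, ["age", "zip"]))         # 2 distinct keys out of 4 rows
print(separation(rows, ["age", "zip"]))          # 4 of 6 pairs differ
print(distinction(rows, ["age", "zip", "sex"]))  # 3 distinct keys out of 4 rows
print(separation(rows, ["age", "zip", "sex"]))   # 5 of 6 pairs differ
```

Under these assumed definitions, both scores lie between 0 and 1, and adding attributes to the set can only keep them the same or push them towards 1.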

Thank you, this is quite helpful.

If I may ask a follow-up question:

In the Analyze Risk window, I am able to select or deselect entries as 'quasi-identifiers' (QIs), and then a window populates with relative Distinction and Separation scores. If I select a large number of QIs, I receive an error message saying that too many QIs were found, followed by a number stating how many. What does this error actually mean in the context of the risk analysis?

This is not an error message. ARX calculates the metrics for every combination of the selected attributes, and this message just indicates that the number of combinations has become too large. It may be possible to adjust this threshold in the settings.
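To illustrate how quickly the number of combinations grows, here is a small Python sketch. It assumes that every non-empty subset of the selected attributes, up to some maximum subset size, counts as one combination. The function name and the example numbers are hypothetical, and the exact way ARX enumerates combinations and the threshold it applies may differ.

```python
from math import comb

def combination_count(n_selected, max_size):
    """Number of non-empty attribute subsets with at most `max_size` members."""
    largest = min(n_selected, max_size)
    return sum(comb(n_selected, k) for k in range(1, largest + 1))

# Hypothetical selections: (number of selected attributes, max subset size).
for n, k in [(10, 10), (20, 10), (30, 30)]:
    print(n, k, combination_count(n, k))
```

With 10 selected attributes there are 1,023 subsets to evaluate, with 20 attributes and subsets of at most 10 members there are already 616,665, and with 30 attributes and no size limit there are over a billion.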

umma08 commented

> It may be possible to adjust this threshold in the settings.

Yes, I had looked at this, but the 'max number of attributes per quasi-identifier' setting seems locked at <= 10.

Entering any larger number seems to be blocked by the settings panel.