realrolfje/anonimatron

Support Dutch BSN in numeric columns

Closed this issue · 5 comments

Anonimatron version: 1.9.3

Currently, Anonimatron throws an exception when the Dutch BSN anonymizer is configured for a numeric column in a database table:

Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: java.math.BigDecimal incompatible with java.lang.String
        at com.rolfje.anonimatron.jdbc.JdbcAnonymizerService.processTableColumns(JdbcAnonymizerService.java:239)
        at com.rolfje.anonimatron.jdbc.JdbcAnonymizerService.anonymizeTableInPlace(JdbcAnonymizerService.java:155)
        at com.rolfje.anonimatron.jdbc.JdbcAnonymizerService.anonymize(JdbcAnonymizerService.java:94)
        at com.rolfje.anonimatron.Anonimatron.anonymize(Anonimatron.java:99)
        at com.rolfje.anonimatron.Anonimatron.main(Anonimatron.java:67)
Caused by: java.lang.ClassCastException: java.math.BigDecimal incompatible with java.lang.String
        at com.rolfje.anonimatron.synonyms.StringSynonym.setFrom(StringSynonym.java:43)
        at com.rolfje.anonimatron.anonymizer.DutchBSNAnononymizer.anonymize(DutchBSNAnononymizer.java:34)
        at com.rolfje.anonimatron.anonymizer.AnonymizerService.anonymize(AnonymizerService.java:97)
        at com.rolfje.anonimatron.jdbc.JdbcAnonymizerService$2.processColumn(JdbcAnonymizerService.java:140)
        at com.rolfje.anonimatron.jdbc.JdbcAnonymizerService.processTableColumns(JdbcAnonymizerService.java:210)
        ... 4 more

It would be nice to support numeric columns as well, because a Dutch BSN is a valid number.
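The stack trace shows the column value, a `java.math.BigDecimal`, being cast to `String` in `StringSynonym.setFrom`. One possible way to handle this is sketched below. It is not Anonimatron's actual fix: the class `NumericBsnAdapter` and its methods are hypothetical, and the sketch assumes the numeric value is a non-negative whole number of at most nine digits.

```java
import java.math.BigDecimal;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Anonimatron.
final class NumericBsnAdapter {

    /** Renders a column value as a 9-digit string, restoring leading zeros for numbers. */
    static String toBsnString(Object columnValue) {
        if (columnValue == null) {
            return null;
        }
        if (columnValue instanceof String) {
            return (String) columnValue;
        }
        if (columnValue instanceof Number) {
            // toBigIntegerExact() throws ArithmeticException on a nonzero fraction.
            BigDecimal n = new BigDecimal(columnValue.toString());
            return String.format("%09d", n.toBigIntegerExact());
        }
        throw new IllegalArgumentException("Unsupported type: " + columnValue.getClass());
    }

    /**
     * Applies a string-based anonymizer and returns the result as a BigDecimal
     * for numeric input, or as a String for string input.
     */
    static Object anonymize(Object from, UnaryOperator<String> stringAnonymizer) {
        if (from == null) {
            return null;
        }
        String to = stringAnonymizer.apply(toBsnString(from));
        return (from instanceof Number) ? new BigDecimal(to) : to;
    }
}
```

With this shape, the string-based anonymizer never sees a `BigDecimal`, and a numeric column gets a numeric value written back.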

I would strongly advise against storing a BSN in a numeric column. It may cause the same impracticalities as storing a bank account number in a numeric column, which caused interesting problems in some systems when IBAN was introduced.

Just because something consists of digits does not make it a scalar number. There is no "bigger" or "smaller" BSN, and BSNs cannot be added or subtracted. In that sense, a BSN is not a number, particularly because it is an identifier whose syntax rules are defined outside your system.

Nothing against making this technically possible, but it makes me wonder... why?
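To make the point about externally defined syntax rules concrete: a BSN is nine digits, may start with a zero, and must pass the so-called 11-test ("elfproef"). The sketch below is illustrative only. The class name `BsnCheck` is hypothetical, and the check covers only length and the weighted sum; it does not, for example, reject an all-zero value.

```java
// Hypothetical helper illustrating the BSN 11-test; not part of Anonimatron.
final class BsnCheck {

    // Weights for the nine digits, left to right; the last digit counts negatively.
    private static final int[] WEIGHTS = {9, 8, 7, 6, 5, 4, 3, 2, -1};

    /** Returns true if the input is exactly nine digits and its weighted sum is divisible by 11. */
    static boolean isValid(String bsn) {
        if (bsn == null || !bsn.matches("[0-9]{9}")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            sum += (bsn.charAt(i) - '0') * WEIGHTS[i];
        }
        return sum % 11 == 0;
    }
}
```

For example, `BsnCheck.isValid("012345672")` passes, but the same value read back from a numeric column as `12345672` has lost its leading zero and fails the length check until it is padded again.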

Point taken and yes, I agree with all the statements above. The point is that more than a few legacy systems have been implemented this way, i.e. by storing social security numbers in a numeric column.

True. I've seen systems like that. I usually push a bit to get that fixed. But technically speaking, yes it should be possible to anonymize number-like data based on numeric fields in a database.

That's exactly my point: I believe it is not the responsibility of the anonymisation tool to dictate which column type an application uses to store a BSN in its database.

Merged into develop, part of next release.