microsoft/Tools-for-Health-Data-Anonymization

configuration.json in CommandLineTool doesn't parse because of {"tag":"(50xx,xxxx)", "method":"remove"}

TD-peter opened this issue · 4 comments

Parsing the configuration.json file in the project throws an exception from Number.Parsing: System.FormatException: 'The input string '50xx' was not in a correct format.' The offending lines are:

    {"tag":"(50xx,xxxx)", "method":"remove"}, 
    {"tag":"(60xx,4000)", "method":"remove"}, 
    {"tag":"(60xx,3000)", "method":"remove"}, 

Values such as (50xx,xxxx) are obviously not valid hexadecimal numbers.

The following entries also do not parse:

    {"tag":"DA", "method":"dateshift"},
    {"tag":"DT", "method":"dateshift"}

Please remove or update these lines. The file is located at:
DICOM/src/Microsoft.Health.Dicom.Anonymizer.CommandLineTool/configuration.json

Thank you so much for raising this issue. Our team is looking into it and will provide you with an update as soon as possible.

Hi @TD-peter, can you please provide the steps to reproduce this issue? I have not been able to reproduce it with the command .\Microsoft.Health.Dicom.Anonymizer.CommandLineTool.exe -i myInputFile -o myOutputFile (or .\Microsoft.Health.Dicom.Anonymizer.CommandLineTool.exe -i myInputFile -o myOutputFile -c ../../../configuration.json) run from the $SOURCE\DICOM\src\Microsoft.Health.Dicom.Anonymizer.CommandLineTool\bin\Debug\net6.0 folder. What command were you using when you encountered this problem? Thank you!

Hi Alexa, sorry for my late reply.

I was thrown off by the exception being thrown, but then I discovered there is an AnonymizerMaskedTagRule that will handle those wildcards. So indeed, the code does handle the wildcards correctly.
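For anyone else who trips over this, here is a rough sketch of how a masked tag such as (50xx,xxxx) can be matched against concrete tags. It is written in Python for brevity and is not the project's AnonymizerMaskedTagRule code; the function name compile_tag_rule and the regular expression are my own assumptions for illustration.

```python
import re

# Hypothetical pattern: "(gggg,eeee)" where each position is a hex digit or 'x'.
_TAG_RE = re.compile(r"^\(([0-9a-fx]{4}),([0-9a-fx]{4})\)$", re.IGNORECASE)


def compile_tag_rule(tag_text):
    """Return a predicate matches(group, element) for an exact or masked tag."""
    match = _TAG_RE.match(tag_text.strip())
    if match is None:
        # Entries such as "DA" are not tags at all and need separate handling.
        raise ValueError(f"not a tag or masked tag: {tag_text!r}")
    mask = (match.group(1) + match.group(2)).lower()

    def matches(group, element):
        # Compare digit by digit; an 'x' in the mask accepts any hex digit.
        actual = f"{group:04x}{element:04x}"
        return all(m == "x" or m == a for m, a in zip(mask, actual))

    return matches


is_curve_data = compile_tag_rule("(50xx,xxxx)")
print(is_curve_data(0x5002, 0x0010))  # a tag in group 5002
print(is_curve_data(0x6000, 0x3000))  # a tag in group 6000
```

An exact tag like (3008,0105) goes through the same path because it simply contains no x positions. The sketch assumes 16-bit group and element numbers.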

One (minor) issue remains though:

configuration.json contains a duplicate entry:

    {"tag":"(3008,0105)", "method":"remove"}, 
    {"tag":"(3008,0105)", "method":"remove"}, 
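A check along these lines would catch such duplicates automatically. This is only a sketch: find_duplicate_rules is a hypothetical helper, and it assumes the rules have already been loaded as a list of objects with "tag" and "method" keys, as in the entries quoted above.

```python
import json
from collections import Counter


def find_duplicate_rules(rules):
    """Return the (tag, method) pairs that appear more than once."""
    counts = Counter((rule["tag"], rule["method"]) for rule in rules)
    # Counter keeps first-seen order, so duplicates come out in file order.
    return [pair for pair, count in counts.items() if count > 1]


rules = json.loads("""[
    {"tag":"(3008,0105)", "method":"remove"},
    {"tag":"(3008,0105)", "method":"remove"},
    {"tag":"(60xx,3000)", "method":"remove"}
]""")
print(find_duplicate_rules(rules))  # [('(3008,0105)', 'remove')]
```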

Thanks!

The duplicate tag has been removed in #222. The parse exception is expected behavior, so this bug can be closed.