ImagingInformatics/hackathon-dataset

Issues with Joe's studies

Closed this issue · 3 comments

The files
imaging_study.1.3.6.1.4.1.14519.5.2.1.7777.9002.168969969813049794019998800433.json
and
imaging_study.1.3.6.1.4.1.14519.5.2.1.7777.9002.198875685720513246512710453733.json
Their content initially references the SAME study (same accession number and study UID), but the files then list different attributes, such as the number of series and the number of instances.
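A mismatch like this can be confirmed by loading both files and comparing a few fields. The sketch below is only illustrative: it assumes the files are FHIR ImagingStudy-style JSON with top-level `identifier`, `numberOfSeries`, and `numberOfInstances` keys, which this thread does not confirm. The helper names (`summarize`, `diff_studies`, `load`) are hypothetical.

```python
import json
from pathlib import Path

# Assumed key names (FHIR ImagingStudy style); adjust to the real file layout.
FIELDS = ("identifier", "numberOfSeries", "numberOfInstances")


def summarize(study: dict) -> dict:
    """Pick out the fields to compare from one parsed study."""
    return {key: study.get(key) for key in FIELDS}


def diff_studies(a: dict, b: dict) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    sa, sb = summarize(a), summarize(b)
    return {key: (sa[key], sb[key]) for key in FIELDS if sa[key] != sb[key]}


def load(path: str) -> dict:
    """Parse one study file from disk."""
    return json.loads(Path(path).read_text())
```

Calling `diff_studies(load(path_a), load(path_b))` on the two files would list only the fields that disagree. An empty result means the compared fields match.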

Suggested actions:

  1. If the study UIDs in the file names are to be believed, then "imaging_study.1.3.6.1.4.1.14519.5.2.1.7777.9002.168969969813049794019998800433.json" references a study that does not exist, so I suggest deleting the file.
  2. imaging_study.1.3.6.1.4.1.14519.5.2.1.7777.9002.198875685720513246512710453733.json references 3 series, but the study in the VNA has only 1 series. Either we update the file to reflect that fact, or we find the remaining series in the original TCIA dataset.
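Both checks above can be approximated offline before touching the VNA. The following is a rough sketch, again assuming FHIR ImagingStudy-style keys (`identifier`, `numberOfSeries`, `series`) and an `imaging_study.<study UID>.json` naming pattern. The function names are hypothetical, and the code only checks a file against itself, not against the VNA or TCIA.

```python
def uid_from_filename(name: str) -> str:
    """Extract the study UID from a name like imaging_study.<uid>.json."""
    prefix, suffix = "imaging_study.", ".json"
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError(f"unexpected file name: {name}")
    return name[len(prefix):-len(suffix)]


def check_study(name: str, study: dict) -> list:
    """Return a list of human-readable problems found in one study file."""
    problems = []
    uid = uid_from_filename(name)

    # Assumption: the UID is stored either bare or with a urn:oid: prefix.
    values = [str(item.get("value", "")) for item in study.get("identifier", [])]
    if not any(v == uid or v == "urn:oid:" + uid for v in values):
        problems.append("study UID in file name not found in identifier list")

    # Compare the declared series count with the series actually listed.
    declared = study.get("numberOfSeries")
    listed = len(study.get("series", []))
    if declared != listed:
        problems.append(f"numberOfSeries is {declared} but {listed} series are listed")
    return problems
```

An empty list from `check_study` means the file is at least internally consistent. Whether the series actually exist in the VNA still has to be verified separately.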

@shadowdoc - would appreciate your thoughts here.

Sounds like deleting ...8800433.json is a good idea. Looking at TCIA there are duplicate studies in their archive with both of these StudyUIDs.

Would update ...453733.json to match the objects. Can get new DICOM objects by going to

https://public.cancerimagingarchive.net/nbia-search/

then scrolling down to the bottom and entering TCGA-17-Z058 into the Subject ID field.

Sorry, that wasn't clear - sounds like ...453733.json has the series correct. Can recover the DICOM data by downloading. LMK if you want me to download, I have an account.

Thanks, Marc. You can leave it with me, I will download the study and fix the DICOM data when I get a chance in the next little while.