Wrong config in dataset.json

Question

JiarunLiu opened this issue a year ago · 1 comment

Hi, Jun. Thanks for your awesome work.

However, dataset.json in Dataset702_AbdomenMR seems to have some problems. It causes the following error during data preprocessing:

(umamba) fgldlb@fgldlb-Precision-Tower-7910:~/Documents/mamba/U-Mamba/data$ nnUNetv2_plan_and_preprocess -d 702 --verify_dataset_integrity
Fingerprint extraction...
Dataset702_AbdomenMR
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/umamba/bin/nnUNetv2_plan_and_preprocess", line 33, in <module>
    sys.exit(load_entry_point('nnunetv2', 'console_scripts', 'nnUNetv2_plan_and_preprocess')())
  File "/home/fgldlb/Documents/mamba/U-Mamba/umamba/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py", line 182, in plan_and_preprocess_entry
    extract_fingerprints(args.d, args.fpe, args.npfp, args.verify_dataset_integrity, args.clean, args.verbose)
  File "/home/fgldlb/Documents/mamba/U-Mamba/umamba/nnunetv2/experiment_planning/plan_and_preprocess_api.py", line 47, in extract_fingerprints
    extract_fingerprint_dataset(d, fingerprint_extractor_class, num_processes, check_dataset_integrity, clean,
  File "/home/fgldlb/Documents/mamba/U-Mamba/umamba/nnunetv2/experiment_planning/plan_and_preprocess_api.py", line 30, in extract_fingerprint_dataset
    verify_dataset_integrity(join(nnUNet_raw, dataset_name), num_processes)
  File "/home/fgldlb/Documents/mamba/U-Mamba/umamba/nnunetv2/experiment_planning/verify_dataset_integrity.py", line 155, in verify_dataset_integrity
    assert len(dataset) == expected_num_training, 'Did not find the expected number of training cases ' \
AssertionError: Did not find the expected number of training cases (50). Found 60 instead.
Examples: ['amos_0507', 'amos_0508', 'amos_0510', 'amos_0514', 'amos_0517']

The following changes to data/nnUNet_raw/Dataset702_AbdomenMR/dataset.json fixed it for me:

{
    "channel_names": {
        "0": "MR"
    },
    "labels": {
        "background": 0,
        "liver": 1,
        "right kidney": 2,
        "spleen": 3,
        "pancreas": 4,
        "aorta": 5,
        "inferior vena cava": 6,
        "right adrenal gland": 7,
        "left adrenal gland": 8,  // "right adrenal gland" ==> "left adrenal gland"
        "gallbladder": 9,
        "esophagus": 10,
        "stomach": 11,
        "duodenum": 12,
        "left kidney": 13
    },
    "numTraining": 60,   // 50 -> 60
    "file_ending": ".nii.gz",
    "name": "Dataset702_AbdomenMR",
    "description": "This dataset was from MICCAI AMOS 2022 Challenge. The original dataset contained 60 annotation cases. We annotated another 50 MRI scans as the testing set. The annotations were generated by radiologists with the assistance of MedSAM and ITK-SNAP."
}

The mislabeled "left adrenal gland" entry (label 8 was named "right adrenal gland") only appears in the Google Drive copy of dataset.json. The wrong numTraining value appears in both the Git and the Google Drive copies.
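
For anyone who wants to check their own copy before running preprocessing, here is a minimal sketch of the two checks described above. It is not part of nnU-Net or U-Mamba: the function names (`find_duplicate_keys`, `count_training_cases`, `check_dataset`) are hypothetical, and it assumes the usual raw dataset layout with one label file per training case in a `labelsTr` folder next to dataset.json. The duplicate-key check matters because, if label 8 was also named "right adrenal gland" as the inline comment suggests, Python's `json` module keeps only the last of the two entries without any warning.

```python
import json
from collections import Counter
from pathlib import Path


def find_duplicate_keys(json_text):
    """Return keys that occur more than once within any single JSON object."""
    duplicates = []

    def record_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in file order.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(json_text, object_pairs_hook=record_pairs)
    return duplicates


def count_training_cases(dataset_dir, file_ending):
    """Count label files in labelsTr (assumed layout: one label file per case)."""
    labels_dir = Path(dataset_dir) / "labelsTr"
    return sum(1 for p in labels_dir.iterdir() if p.name.endswith(file_ending))


def check_dataset(dataset_dir):
    """Return a list of human-readable problems found in dataset.json."""
    json_text = (Path(dataset_dir) / "dataset.json").read_text()
    problems = [f"duplicate key: {k!r}" for k in find_duplicate_keys(json_text)]

    config = json.loads(json_text)
    found = count_training_cases(dataset_dir, config["file_ending"])
    if found != config["numTraining"]:
        problems.append(
            f"numTraining is {config['numTraining']} but found {found} label files"
        )
    return problems
```

Calling `check_dataset("data/nnUNet_raw/Dataset702_AbdomenMR")` and printing the returned list should show both problems on an unfixed copy, and an empty list once dataset.json matches the files on disk.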

Answer 1 · 2024-01-19T19:06:22.000Z

Hi @JiarunLiu ,

Thank you very much for pointing these out. They have been fixed!