Model tf_2dunet - Plan initialisation fails expecting /raid/datasets/MICCAI_BraTS_2019_Data_Training/HGG/0
Describe the bug
While following the Quick Start Guide with the tf_2dunet model, the plan initialisation step fails.
The last few lines of the error message:
File "/home/azureuser/openfl/tests/openfl_e2e/my_workspace/src/tfbrats_inmemory.py", line 29, in __init__
X_train, y_train, X_valid, y_valid = load_from_nifti(parent_dir=data_path,
File "/home/azureuser/openfl/tests/openfl_e2e/my_workspace/src/brats_utils.py", line 94, in load_from_nifti
subdirs = os.listdir(path)
FileNotFoundError: [Errno 2] No such file or directory: '/raid/datasets/MICCAI_BraTS_2019_Data_Training/HGG/0'
To Reproduce
Steps to reproduce the behavior:
- Follow the steps in the Quick Start Guide, replacing the model torch_cnn_mnist with tf_2dunet (a command sketch follows below).
- Create the workspace and certify it.
- Generate a CSR for the aggregator and have the CA sign it.
- Initialise the plan:
fx plan initialize
The error is thrown at this step.
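For reference, here is a sketch of the commands used up to the failing step, assuming the standard OpenFL quick-start workflow; the workspace prefix is arbitrary and the exact flags should be verified against fx --help for your OpenFL version:
# Hedged sketch of the Quick Start steps with the tf_2dunet template.
# "my_workspace" is an arbitrary prefix; verify flags with `fx --help`.
fx workspace create --prefix my_workspace --template tf_2dunet
cd my_workspace
fx workspace certify                  # set up the workspace CA
fx aggregator generate-cert-request   # CSR for the aggregator
fx aggregator certify                 # sign it with the workspace CA
fx plan initialize                    # fails here with FileNotFoundError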
Expected behavior
There should be no error during plan initialisation.
Machine
- Ubuntu 22.04
Additional
There is a README.md that describes the expected dataset structure for MICCAI_BraTS_2019_Data_Training.
But how exactly is the dataset downloaded? Is this documented anywhere?
For practice purposes, I found a copy of the dataset at https://www.kaggle.com/datasets/aryashah2k/brain-tumor-segmentation-brats-2019, but it contains many subfolders rather than the expected 0 and 1.
fx plan initialize currently takes the first entry from data.yaml. You either need to overwrite that entry directly so it points at your dataset, or you can pass the --input_shape flag if you know the expected data shape.
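To illustrate both options (a sketch only; the exact data.yaml entry format and the shape value depend on your workspace and the tf_2dunet template, so verify them before copying):
# Option 1 (illustrative): edit plan/data.yaml so its first entry points at your shard,
# e.g. a line along the lines of:
#   one,/raid/datasets/MICCAI_BraTS_2019_Data_Training/HGG/0
# (check plan/data.yaml in your workspace for the exact format)
# Option 2 (illustrative): pass the expected data shape explicitly; the value
# below is a placeholder, not necessarily the real tf_2dunet input shape
fx plan initialize --input_shape "[128,128,1]"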
To gain access to the data, you originally needed to send an access request to the MICCAI BraTS challenge, but that Kaggle link actually looks like the proper data. If so, the README.md includes steps to shard the data.
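If the Kaggle CLI is available, the dataset can also be fetched from the command line (a sketch, assuming the kaggle package is installed and an API token is configured; the target directory is just an example):
# Assumes `pip install kaggle` and a configured ~/.kaggle/kaggle.json token.
kaggle datasets download -d aryashah2k/brain-tumor-segmentation-brats-2019 -p /raid/datasets/ --unzip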
Hi @noopurintel,
I downloaded the dataset from the Kaggle link that you mentioned: https://www.kaggle.com/datasets/aryashah2k/brain-tumor-segmentation-brats-2019
After that I followed the README.md.
I will list the steps for you:
- Download the dataset from https://www.kaggle.com/datasets/aryashah2k/brain-tumor-segmentation-brats-2019
- Unzip the dataset
unzip archive.zip -d /raid/datasets/
- Use the tree command to check the unzipped dataset:
/raid/datasets# tree $DATA_PATH -L 2
.
`-- MICCAI_BraTS_2019_Data_Training
    |-- HGG
    |-- LGG
    |-- name_mapping.csv
    `-- survival_data.csv

3 directories, 2 files
cd MICCAI_BraTS_2019_Data_Training/HGG/
export SUBFOLDER=HGG
- Run this loop in the terminal to shard the data across 2 collaborators; change the modulus to match the number of collaborators, as mentioned in the README (a parameterised version is sketched after these steps).
i=0
for f in *; do
  d=$((i % 2));   # change 2 to the number of data slices (collaborators in the federation)
  mkdir -p "$d";
  mv "$f" "$d";
  i=$((i + 1));
done
- Check the result
/raid/datasets/MICCAI_BraTS_2019_Data_Training/HGG# tree -L 1
.
|-- 0
`-- 1
2 directories, 0 files
- Follow the Quick Start Guide; plan initialisation now completes:
INFO Creating Initial Weights File 🠆 save/tf_2dunet_brats_init.pbuf plan.py:195
INFO FL-Plan hash is 196b877a93866735ca18687a2d1f94ad6dca8a3f0de541f84ca267ccc5fd63be00dd488102c0540c0b4efb434653b2c0 plan.py:287
INFO ['plan_196b877a'] plan.py:222
✔️ OK
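For completeness, a parameterised version of the sharding loop above (a sketch only; N, the BraTS19_ folder prefix, and the dataset path are assumptions based on this thread, so adjust them to your setup):
#!/usr/bin/env bash
# Shard the HGG subject folders round-robin into N numbered directories (0..N-1).
N=2
cd /raid/datasets/MICCAI_BraTS_2019_Data_Training/HGG/
i=0
for f in BraTS19_*; do   # match subject folders only, so re-runs skip the 0..N-1 dirs
  d=$((i % N))
  mkdir -p "$d"
  mv "$f" "$d/"
  i=$((i + 1))
done
# Sanity check: each shard should hold roughly the same number of subjects.
for d in $(seq 0 $((N - 1))); do
  echo "shard $d: $(ls "$d" | wc -l) subjects"
done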
For the error mentioned below, I have a fix in #1178.
File "<__array_function__ internals>", line 200, in concatenate
ValueError: need at least one array to concatenate
@noopurintel, can you confirm this and let us know? We will close the issue accordingly.
@rahulga1 @tanwarsh @kta-intel - We tried this. Below are our observations.
- With 4 CPUs and 16 GB RAM, the initialization process is killed on its own 2-3 minutes into the run.
- With 16 CPUs and 64 GB RAM, it took 4 hours to complete 2 rounds of training.
Could you please suggest/document the minimal configuration required to test this? Also, in general, how much time should one round of training take? This would be helpful for users.
Hi @noopurintel,
I was able to complete the experiment with 10 rounds in 4-5 hours with 16 CPUs and 64 GB RAM.
Hi @noopurintel, Closing this issue as the error is resolved.