Investigation: Performance with uploading instances to MinIO
Closed this issue · 4 comments
Description
In one environment, saving a study to MinIO takes around 1 second per slice. This ticket tracks the investigation; it is not yet clear where the bottleneck lies.
Steps to reproduce
- Deploy MIG and MinIO to an environment
- Send a study to benchmark
Expected behavior
The study is uploaded to storage within an acceptable amount of time.
Actual behavior
In some cases, the study takes more than 10 minutes to save.
Please share the environment running IG & MinIO.
- CPU cores
- RAM
- Disk size/speed
- Network speed
Hi @mocsharp. Details of the environment:
DGX box
1 GPU, 8 vCPU, 32 GB RAM, up to 25 Gbps network, 225 GB NVMe SSD
Both head nodes
2 vCPU, 8 GB RAM, network seems to be about 300 Mbps
All boxes are attached to an EFS instance.
https://docs.aws.amazon.com/efs/latest/ug/performance.html
I ran 5 studies using MONAI Deploy Lite, and each study completed within 1-2 minutes (each upload took no longer than a minute). The MIG container was limited to 2 CPUs and 8 GB RAM.
Container stats after all 5 studies are completed.
CONTAINER ID   NAME           CPU %   MEM USAGE / LIMIT     MEM %   NET I/O           BLOCK I/O         PIDS
148bd3484592   mdl-orthanc    1.74%   295.1MiB / 31.05GiB   0.93%   1.07MB / 964MB    1.22GB / 156MB    66
c35bf2a24db7   mdl-minio      0.03%   244.4MiB / 31.05GiB   0.77%   1.76GB / 1.04GB   4.2GB / 5.52GB    20
d7ab6113b075   mdl-rabbitmq   2.40%   154.9MiB / 31.05GiB   0.49%   915kB / 897kB     42.5MB / 3.71MB   45
46899c9fcd8a   mdl-mongodb    0.58%   116.9MiB / 31.05GiB   0.37%   162kB / 1.37MB    173MB / 12.2MB    39
ea62b3f26676   mdl-ig         0.63%   673MiB / 8GiB         8.21%   1.01GB / 980MB    21.1MB / 234MB    21
29f205cb59d5   mdl-tm         0.00%   156.6MiB / 31.05GiB   0.49%   974MB / 776MB     964MB / 90.1kB    21
3eadb0315746   mdl-wm         0.10%   79.47MiB / 31.05GiB   0.25%   23.6MB / 6.41MB   17.7MB / 0B       23
PR #166 adds the ability to store incoming data on disk, instead of in memory, before uploading it to the storage service.
Time is measured from when the first instance is received to when the workflow request is sent.
When 10 studies (588 instances per study) are sent to IG continuously, using the disk is much faster than using memory:
# Memory
Workflow request published to md.workflow.request, message ID=2cff618e-d1a3-4115-ae80-e5e6b4f411b7. Payload took 00:02:25.1398701 to complete.
Workflow request published to md.workflow.request, message ID=a766b539-2a89-41f4-8b13-9ba27a3bac6d. Payload took 00:05:46.5375501 to complete.
Workflow request published to md.workflow.request, message ID=f407ac25-5b54-4e40-b97f-fac16e914874. Payload took 00:09:04.7963726 to complete.
Workflow request published to md.workflow.request, message ID=05c44068-b0e2-446c-b63a-fbedaf7585c4. Payload took 00:11:53.2644454 to complete.
Workflow request published to md.workflow.request, message ID=1a0cf1ec-2cdc-458e-b3ae-c2845cb1daee. Payload took 00:14:17.0859691 to complete.
Workflow request published to md.workflow.request, message ID=60dd9547-2482-4559-9ac5-50d4bd5bbef4. Payload took 00:16:23.0448137 to complete.
Workflow request published to md.workflow.request, message ID=ebedd87f-68ba-4161-9dbc-8304045c75d0. Payload took 00:18:01.6260370 to complete.
Workflow request published to md.workflow.request, message ID=0b8dd992-98de-4ea2-9c5a-9d941f3d0c78. Payload took 00:19:19.5292454 to complete.
Workflow request published to md.workflow.request, message ID=9b0038bd-dbd1-4e2b-9a1d-864f9c0c5c94. Payload took 00:20:02.9085077 to complete.
Workflow request published to md.workflow.request, message ID=d5d00669-c9d8-482a-8c71-a9b975f179d7. Payload took 00:20:11.4483895 to complete.
# Disk
Workflow request published to md.workflow.request, message ID=c09cf45a-9cb9-4ea5-8236-cbe77ff374e7. Payload took 00:00:51.2866232 to complete.
Workflow request published to md.workflow.request, message ID=2196c06f-54d6-401e-9263-869bf13f7741. Payload took 00:01:27.5621902 to complete.
Workflow request published to md.workflow.request, message ID=fda8e1ba-3f37-4a6e-a141-48e3b1fc0e84. Payload took 00:01:57.8332724 to complete.
Workflow request published to md.workflow.request, message ID=42a61e75-fe33-4be6-824b-5352f886a1cf. Payload took 00:02:37.3700811 to complete.
Workflow request published to md.workflow.request, message ID=d5511b17-3909-4034-82a8-5abc7ebc06fc. Payload took 00:03:19.2335252 to complete.
Workflow request published to md.workflow.request, message ID=d058139a-7b7d-4cf1-a832-e712a1767fa5. Payload took 00:04:23.0854715 to complete.
Workflow request published to md.workflow.request, message ID=651af33d-845f-491f-8fdc-a6f0a544d74a. Payload took 00:05:07.1626852 to complete.
Workflow request published to md.workflow.request, message ID=d1e7e27d-9b15-4580-978b-ba0067d08eef. Payload took 00:05:56.3368932 to complete.
Workflow request published to md.workflow.request, message ID=050ca6c7-5daf-4a86-a608-8af9fc748600. Payload took 00:06:26.4512770 to complete.
Workflow request published to md.workflow.request, message ID=63477934-7244-4a11-8c5b-681b125b0ebb. Payload took 00:07:03.9482112 to complete.
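To put a number on the difference, the "Payload took" durations can be pulled out of the log lines and converted to seconds. The following is a minimal sketch, not part of IG: the helper name `payload_seconds` and the regular expression are illustrative assumptions, and the two sample lines are the last entry of each run above, which is also the slowest payload in each run.

```python
import re
from datetime import timedelta

# Matches the "Payload took hh:mm:ss.fffffff to complete" fragment of a log line.
PAYLOAD_RE = re.compile(r"Payload took (\d+):(\d+):(\d+(?:\.\d+)?) to complete")


def payload_seconds(line: str) -> float:
    """Return the payload duration in seconds (hypothetical helper)."""
    match = PAYLOAD_RE.search(line)
    if match is None:
        raise ValueError(f"no payload duration found in: {line!r}")
    hours, minutes, seconds = match.groups()
    duration = timedelta(
        hours=int(hours), minutes=int(minutes), seconds=float(seconds)
    )
    return duration.total_seconds()


# Last (slowest) payload of each 10-study run, copied from the logs above.
memory_last = (
    "Workflow request published to md.workflow.request, "
    "message ID=d5d00669-c9d8-482a-8c71-a9b975f179d7. "
    "Payload took 00:20:11.4483895 to complete."
)
disk_last = (
    "Workflow request published to md.workflow.request, "
    "message ID=63477934-7244-4a11-8c5b-681b125b0ebb. "
    "Payload took 00:07:03.9482112 to complete."
)

mem_s = payload_seconds(memory_last)
disk_s = payload_seconds(disk_last)
print(f"memory: {mem_s:.1f} s, disk: {disk_s:.1f} s, ratio: {mem_s / disk_s:.2f}x")
```

For these two lines, the slowest payload took a little under three times as long with memory as with disk.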
Similarly, when sending a single study:
# Memory
Workflow request published to md.workflow.request, message ID=382d5c48-48fd-44cd-b059-d19dd907f91a. Payload took 00:01:11.5303211 to complete.
# Disk
Workflow request published to md.workflow.request, message ID=3adc8dd6-2dcb-42a9-89b7-3ec021866f87. Payload took 00:00:46.1229609 to complete.
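For comparison with the roughly 1 second per slice reported in the description, the single-study timings can be divided by the instance count. This is a rough sketch that assumes the single study also contains 588 instances, as in the 10-study run; the names are illustrative. The measured time runs from the first received instance to the workflow request, so it covers more than the upload alone.

```python
# Assumption: the single study has the same 588 instances as in the 10-study run.
INSTANCES_PER_STUDY = 588


def timespan_to_seconds(timespan: str) -> float:
    """Convert an hh:mm:ss.fffffff string to seconds."""
    hours, minutes, seconds = timespan.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Durations copied from the single-study log lines above.
single_study = {
    "memory": "00:01:11.5303211",
    "disk": "00:00:46.1229609",
}

per_instance = {
    mode: timespan_to_seconds(timespan) / INSTANCES_PER_STUDY
    for mode, timespan in single_study.items()
}

for mode, seconds in per_instance.items():
    print(f"{mode}: {seconds * 1000:.0f} ms per instance")
```

Under that assumption, both modes come out at roughly 0.12 s (memory) and 0.08 s (disk) per instance, well below 1 second per slice.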