Add remark on multithreading/multiple GPU limitations that FourierFlows.jl imposes
navidcy opened this issue · 4 comments
We should add a remark in the README and in the Docs on this.
[This was mentioned by @ranocha in their review remarks.]
Regarding multiple GPUs: calling a Problem constructor with dev=GPU() probably forces CUDA.jl to use device 0...(?) E.g., when I asked for 3 GPUs on the HPC, I got:
julia> prob = SingleLayerQG.Problem(GPU(); nx=n, ny=n+2, Lx=L, β=β, μ=μ, dt=dt, stepper=stepper)
Problem
├─────────── grid: grid (on GPU)
├───── parameters: params
├────── variables: vars
├─── state vector: sol
├─────── equation: eqn
├────────── clock: clock
└──── timestepper: FilteredRK4TimeStepper
shell> nvidia-smi
Tue Mar 9 15:20:01 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:3D:00.0 Off | 0 |
| N/A 35C P0 57W / 300W | 410MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:3E:00.0 Off | 0 |
| N/A 33C P0 41W / 300W | 3MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:B2:00.0 Off | 0 |
| N/A 35C P0 42W / 300W | 3MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 12191 C ...ta/v45/nc3020/julia/julia 407MiB |
+-----------------------------------------------------------------------------+
julia>
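From within Julia one can confirm which device CUDA.jl has picked (a minimal sketch; it requires a CUDA-capable machine and the output will vary):

```julia
using CUDA

# Query the device the current Julia process is bound to;
# by default CUDA.jl uses device 0, matching the nvidia-smi output above.
CUDA.device()
```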
I think CUDA.jl may pick this GPU by default. I think the best solution is to link to the CUDA.jl documentation for choosing a device. Users are also able to do fancier things, like running two problems side by side on different GPUs (some explanation is provided in the CUDA.jl docs for this).
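For the simpler case of picking one of several GPUs, a minimal sketch (assuming the SingleLayerQG setup from above; CUDA.device! takes 0-based indices matching nvidia-smi):

```julia
using CUDA
using GeophysicalFlows

# Inspect the devices CUDA.jl can see.
for dev in CUDA.devices()
    println(dev)
end

# Bind this process to the second GPU (index 1) *before* constructing
# the problem, so all GPU allocations land on that device.
CUDA.device!(1)

prob = SingleLayerQG.Problem(GPU(); nx=256, Lx=2π)
```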
Could you point to this explanation and I'll add a note in our docs.
Here are some references:
- Blog post on JuliaGPU about the device-selection features added in CUDA.jl 1.3. This is just a blog post, so the syntax could go out of date.
- Page in the JuliaGPU docs about multi-GPU programming. This page is oriented towards people who want to use more than one GPU (not just select one of many).
- Documentation of CUDA.device!; device is adjacent. I don't see the function devices documented, however.
Often the most straightforward approach to using multiple GPUs is to launch the same script multiple times with different CUDA_VISIBLE_DEVICES settings. This approach is handled outside Julia:
$ CUDA_VISIBLE_DEVICES=0 julia --project cool_script.jl
This launches julia with only one device visible (device "0" from the output of nvidia-smi). This environment variable is described in NVIDIA's CUDA documentation:
https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
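Building on that, a hypothetical sketch running the same script side by side on two GPUs; each Julia process sees only one device, which it then addresses as device 0:

```shell
# Launch two independent runs, pinned to different GPUs.
CUDA_VISIBLE_DEVICES=0 julia --project cool_script.jl &
CUDA_VISIBLE_DEVICES=1 julia --project cool_script.jl &
wait   # block until both runs finish
```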