Document requirements to run in cloud/remote environments
kushalkolar opened this issue · 7 comments
I think it would be useful to document what libs are required (and any other requirements) for running wgpu in cloud instances where the GPUs have no physical display output.
This is only for Linux systems, I'll focus on Debian/Ubuntu based distros for now.
EDIT: I think I found the minimal requirements to make this work; if any of these are missing, the Vulkan adapter will not show up:
- `xserver-xorg-core`
- `mesa-vulkan-drivers`
- `libvulkan1`
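On Debian/Ubuntu-based systems, these can be installed in one step (a sketch assuming `apt-get` and `sudo` are available; package names are the ones listed above):

```shell
# Minimal packages for the hardware Vulkan adapter to show up on a
# headless cloud instance (Debian/Ubuntu).
sudo apt-get update
sudo apt-get install -y xserver-xorg-core mesa-vulkan-drivers libvulkan1
```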
Do you think it would be useful to have a section in the docs for running on cloud environments, maybe in platform requirements?
Kind of related: #482
Tested that this allows pygfx etc. to run on https://codeocean.com/ and https://lambdalabs.com/service/gpu-cloud . I will test others when I have time.
adapter infos:
```
Available adapters:

{'adapter_type': 'DiscreteGPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': '525.147.05',
 'device': 'Tesla T4',
 'device_id': 7864,
 'vendor': 'NVIDIA',
 'vendor_id': 4318}
{'adapter_type': 'CPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': 'Mesa 23.2.1-1ubuntu3.1~22.04.2 (LLVM 15.0.7)',
 'device': 'llvmpipe (LLVM 15.0.7, 256 bits)',
 'device_id': 0,
 'vendor': 'llvmpipe',
 'vendor_id': 65541}
{'adapter_type': 'Unknown',
 'architecture': '',
 'backend_type': 'OpenGL',
 'description': '',
 'device': 'Tesla T4/PCIe/SSE2',
 'device_id': 0,
 'vendor': '',
 'vendor_id': 4318}
```
Output from `print_wgpu_report()`:

```
██ system:
platform: Linux-5.4.0-1103-aws-x86_64-with-glibc2.35
python_implementation: CPython
python: 3.10.12
██ versions:
wgpu: 0.15.1
cffi: 1.15.1
jupyter_rfb: 0.4.2
numpy: 1.26.2
pygfx: 0.2.0
pylinalg: 0.4.1
██ wgpu_native_info:
expected_version: 0.19.3.1
lib_version: 0.19.3.1
lib_path: libwgpu_native.so
██ object_counts:
count resource_mem
Adapter: 4
BindGroup: 4
BindGroupLayout: 4
Buffer: 12 1.94K
CanvasContext: 1
CommandBuffer: 0
CommandEncoder: 0
ComputePassEncoder: 0
ComputePipeline: 0
Device: 1
PipelineLayout: 0
QuerySet: 0
Queue: 1
RenderBundle: 0
RenderBundleEncoder: 0
RenderPassEncoder: 0
RenderPipeline: 3
Sampler: 1
ShaderModule: 3
Texture: 4 13.7M
TextureView: 5
total: 43 13.7M
██ wgpu_native_counts:
count mem backend a k r e el_size
Adapter: 4 6.25K vulkan: 3 3 0 0 1.98K
gl: 1 1 0 0 304
BindGroup: 4 1.47K vulkan: 4 4 0 0 368
gl: 0 0 0 0 304
BindGroupLayout: 4 1.28K vulkan: 6 4 2 0 320
gl: 0 0 0 0 232
Buffer: 12 3.55K vulkan: 13 12 1 0 296
gl: 0 0 0 0 240
CanvasContext: 0 0 0 0 0 0 160
CommandBuffer: 1 1.28K vulkan: 0 0 0 1 1.28K
gl: 0 0 0 0 9.42K
ComputePipeline: 0 0 vulkan: 0 0 0 0 288
gl: 0 0 0 0 280
Device: 1 11.9K vulkan: 1 1 0 0 11.9K
gl: 0 0 0 0 10.9K
PipelineLayout: 0 0 vulkan: 3 0 3 0 200
gl: 0 0 0 0 216
QuerySet: 0 0 vulkan: 0 0 0 0 80
gl: 0 0 0 0 88
Queue: 1 184 vulkan: 1 1 0 0 184
gl: 0 0 0 0 136
RenderBundle: 0 0 vulkan: 0 0 0 0 848
gl: 0 0 0 0 848
RenderPipeline: 3 1.68K vulkan: 3 3 0 0 560
gl: 0 0 0 0 712
Sampler: 1 80 vulkan: 1 1 0 0 80
gl: 0 0 0 0 64
ShaderModule: 3 2.40K vulkan: 3 3 0 0 800
gl: 0 0 0 0 824
Texture: 4 3.29K vulkan: 5 4 1 0 824
gl: 0 0 0 0 712
TextureView: 5 1.24K vulkan: 5 5 1 0 248
gl: 0 0 0 0 216
total: 43 34.6K
* The a, k, r, e are allocated, kept, released, and error, respectively.
* Reported memory does not include buffer/texture data.
██ pygfx_adapter_info:
vendor: NVIDIA
architecture:
device: Tesla T4
description: 525.147.05
vendor_id: 4.31K
device_id: 7.86K
adapter_type: DiscreteGPU
backend_type: Vulkan
██ pygfx_features:
adapter device
bgra8unorm-storage: ✓ -
depth32float-stencil8: ✓ -
depth-clip-control: ✓ -
float32-filterable: ✓ ✓
indirect-first-instance: ✓ -
rg11b10ufloat-renderable: ✓ -
shader-f16: ✓ -
texture-compression-astc: - -
texture-compression-bc: ✓ -
texture-compression-etc2: - -
timestamp-query: ✓ -
MultiDrawIndirect: ✓ -
MultiDrawIndirectCount: ✓ -
PushConstants: ✓ -
TextureAdapterSpecificFormatFeatures: ✓ -
VertexWritableStorage: ✓ -
██ pygfx_limits:
adapter device
max_bind_groups: 8 8
max_bind_groups_plus_vertex_buffers: 0 0
max_bindings_per_bind_group: 1.00K 1.00K
max_buffer_size: 18.4E18 268M
max_color_attachment_bytes_per_sample: 0 0
max_color_attachments: 0 0
max_compute_invocations_per_workgroup: 1.02K 1.02K
max_compute_workgroup_size_x: 1.02K 1.02K
max_compute_workgroup_size_y: 1.02K 1.02K
max_compute_workgroup_size_z: 64 64
max_compute_workgroup_storage_size: 49.1K 49.1K
max_compute_workgroups_per_dimension: 65.5K 65.5K
max_dynamic_storage_buffers_per_pipeline_layout: 16 16
max_dynamic_uniform_buffers_per_pipeline_layout: 15 15
max_inter_stage_shader_components: 128 128
max_inter_stage_shader_variables: 0 0
max_sampled_textures_per_shader_stage: 1.04M 1.04M
max_samplers_per_shader_stage: 1.04M 1.04M
max_storage_buffer_binding_size: 2.14G 2.14G
max_storage_buffers_per_shader_stage: 1.04M 1.04M
max_storage_textures_per_shader_stage: 1.04M 1.04M
max_texture_array_layers: 2.04K 2.04K
max_texture_dimension1d: 32.7K 32.7K
max_texture_dimension2d: 32.7K 32.7K
max_texture_dimension3d: 16.3K 16.3K
max_uniform_buffer_binding_size: 65.5K 65.5K
max_uniform_buffers_per_shader_stage: 1.04M 1.04M
max_vertex_attributes: 32 32
max_vertex_buffer_array_stride: 2.04K 2.04K
max_vertex_buffers: 16 16
min_storage_buffer_offset_alignment: 32 32
min_uniform_buffer_offset_alignment: 64 64
██ pygfx_caches:
count hits misses
full_quad_objects: 1 0 2
mipmap_pipelines: 0 0 0
layouts: 2 0 4
bindings: 2 0 2
shader_modules: 2 0 2
pipelines: 2 0 2
shadow_pipelines: 0 0 0
██ pygfx_resources:
Texture: 6
Buffer: 19
```
Available adapters on a lambdalabs instance:
```
{'adapter_type': 'DiscreteGPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': '535.129.03',
 'device': 'Quadro RTX 6000',
 'device_id': 7728,
 'vendor': 'NVIDIA',
 'vendor_id': 4318}
{'adapter_type': 'CPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': 'Mesa 23.2.1-1ubuntu3.1~22.04.2 (LLVM 15.0.7)',
 'device': 'llvmpipe (LLVM 15.0.7, 256 bits)',
 'device_id': 0,
 'vendor': 'llvmpipe',
 'vendor_id': 65541}
{'adapter_type': 'Unknown',
 'architecture': '',
 'backend_type': 'OpenGL',
 'description': '',
 'device': 'Quadro RTX 6000/PCIe/SSE2',
 'device_id': 0,
 'vendor': '',
 'vendor_id': 4318}
```
Performance is really good!
(video attachment: code_ocean.mp4)
This is enough for software rendering: `wgpu-py/.github/workflows/ci.yml`, line 120 at commit 1254487.
I don't know about using GPUs on hosted environments... Usually they're locked down or containerized and it requires setup steps specific to the hosting environment.
> I don't know about using GPUs on hosted environments... Usually they're locked down or containerized and it requires setup steps specific to the hosting environment.
They usually come pre-loaded with NVIDIA drivers and CUDA libs, and we have gotten it to work in containers. I'll see if the same 3 dependencies are enough on a few other major providers; that should be enough guidance for many users.
> Do you think it would be useful to have a section in the docs for running on cloud environments, maybe in platform requirements?
I think a section on that page makes sense. The "platform-requirements" section should focus on desktop/local usage. So: a new h2 after it for "Cloud compute", with two subheadings, one for "With GPU" and one for "Software rendering" (the latter being what is now "Installing LavaPipe on Linux").
I can confirm that with `xserver-xorg-core`, `mesa-vulkan-drivers`, and `libvulkan1` installed, fastplotlib is now working in the Allen Institute's Code Ocean environment. Thanks for figuring this out!
> I can confirm that with `xserver-xorg-core`, `mesa-vulkan-drivers`, and `libvulkan1` installed, fastplotlib is now working in the Allen Institute's Code Ocean environment. Thanks for figuring this out!
I recommend checking that the hardware vulkan adapter is at the top to make sure you're not using lavapipe (software rendering):
```python
import pprint

import wgpu

for a in wgpu.gpu.enumerate_adapters():
    pprint.pprint(a.request_adapter_info())
```
Should get something like this:
```
{'adapter_type': 'DiscreteGPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': '525.147.05',
 'device': 'Tesla T4',
 'device_id': 7864,
 'vendor': 'NVIDIA',
 'vendor_id': 4318}
{'adapter_type': 'CPU',
 'architecture': '',
 'backend_type': 'Vulkan',
 'description': 'Mesa 23.2.1-1ubuntu3.1~22.04.2 (LLVM 15.0.7)',
 'device': 'llvmpipe (LLVM 15.0.7, 256 bits)',
 'device_id': 0,
 'vendor': 'llvmpipe',
 'vendor_id': 65541}
{'adapter_type': 'Unknown',
 'architecture': '',
 'backend_type': 'OpenGL',
 'description': '',
 'device': 'Tesla T4/PCIe/SSE2',
 'device_id': 0,
 'vendor': '',
 'vendor_id': 4318}
```
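If you want to make that check scriptable, here's a small plain-Python sketch; the helper name is hypothetical (not part of wgpu's API), and the dict keys follow the adapter-info dicts shown above:

```python
def pick_hardware_adapter(adapter_infos):
    """Return the first adapter-info dict that looks like a hardware
    Vulkan adapter (i.e. not llvmpipe/CPU); fall back to the first entry."""
    for info in adapter_infos:
        if info.get("backend_type") == "Vulkan" and info.get("adapter_type") != "CPU":
            return info
    return adapter_infos[0] if adapter_infos else None

# Example with abbreviated adapter infos like the ones above:
infos = [
    {"adapter_type": "CPU", "backend_type": "Vulkan", "device": "llvmpipe"},
    {"adapter_type": "DiscreteGPU", "backend_type": "Vulkan", "device": "Tesla T4"},
]
print(pick_hardware_adapter(infos)["device"])  # Tesla T4
```

Note that this prefers the hardware adapter even when llvmpipe comes first in the list, which is the failure mode you want to catch.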
(The next release of fastplotlib and the current `fastplotlib@main` will also display all adapters and indicate the default adapter on import.)
Update: works very well on Code Ocean and Lambda Labs, with high performance via jupyter-rfb. Also works on Google Cloud, but the rfb performance there makes it unusable.
Couldn't get it working on AWS SageMaker; I even tried installing `kde-plasma-desktop`. Also tried installing `wgpu` from pip and conda. `nvidia-smi` worked, so the nvidia drivers were installed. IDK ¯\_(ツ)_/¯ .
I'll add general guidance to the docs that you need <...> system packages installed, but your mileage may vary.
@jsiegle I forgot to mention, you'd want these apt packages as well, which make a huge difference in the rfb performance: `libjpeg-turbo8-dev` and `libturbojpeg0-dev`. And `simplejpeg` via pip.
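Putting that together (a sketch assuming `apt-get`, `sudo`, and `pip` are available; package names are the ones mentioned above):

```shell
# Turbo-JPEG dev packages speed up jupyter-rfb frame encoding noticeably.
sudo apt-get install -y libjpeg-turbo8-dev libturbojpeg0-dev
# simplejpeg lets jupyter-rfb use the faster JPEG codec.
pip install simplejpeg
```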