buaavrcg/BakedAvatar

Inference FPS is only ~10 on a 4090 GPU


Hello, I have finished training and evaluating on the subject 1 dataset on a 4090 GPU. The visual quality is similar to that reported in the paper. However, the inference FPS is only ~10, much slower than the results reported in the paper. Is there any potential reason for this? Thanks!


Hi, to measure the FPS you need to use the web demo, as it gives the correct speed under the WebGL implementation specified in the paper. You can follow the instructions in the readme to run the web demo, and to test speed at different resolutions, you can tweak these lines to match the resolution reported in the paper.

The Python inference script is mainly intended for quality evaluation and metric calculation. Since it uses nvdiffrast to produce images, which rasterizes in software via CUDA, it introduces much more overhead than the WebGL implementation.

Thank you for your reply. I have run the web demo. I noticed that the FPS always stays around 60. I guess this is because of my monitor's refresh rate (my monitor is 4K/60 FPS)? Also, I tried to increase the rendering resolution, but then the rendered content fills up the whole monitor, so I cannot observe the FPS. Could you please let me know how I can reproduce the actual rendering resolution you reported in the paper?

Also, if I want to render without using the web, how should I do it? Should I use OpenGL/pyrender for that purpose?


You can launch Chrome with the following command-line arguments to disable vsync and the frame-rate cap:

--disable-gpu-vsync --disable-frame-rate-limit

You can use a native OpenGL implementation to achieve the same results as the web demo. The rendering process should be the same as in WebGL, i.e., render the multi-layered meshes from the inner layer to the outer layer, and for each layer, compute vertex positions with FLAME weights in the vertex shader and blend textures with weights from the appearance decoder MLP in the fragment shader.
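As a rough sketch of the draw order described above (all function names here are hypothetical, and the actual GL calls are shown only as comments since they need a live context):

```javascript
// Hypothetical sketch of the layered draw loop described above.
// Layers are indexed 0 (innermost) .. numLayers-1 (outermost); alpha
// blending composites each outer layer over the layers already drawn.
function layerDrawOrder(numLayers) {
  const order = [];
  for (let layer = 0; layer < numLayers; layer++) order.push(layer);
  return order; // inner to outer
}

function renderFrame(numLayers) {
  for (const layer of layerDrawOrder(numLayers)) {
    // For each layer (actual WebGL/OpenGL calls omitted):
    //   vertex shader:   compute vertex positions from FLAME weights
    //   fragment shader: blend the layer's textures with weights from the
    //                    appearance decoder MLP, then alpha-blend the
    //                    result over the previously drawn (inner) layers
  }
}
```

The same loop structure applies whether the backend is WebGL or native OpenGL; only the API calls inside the loop differ.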

Thank you for your reply. It works! I have another question: how do you measure the FPS in the web-based demo, and is there any way I can record the FPS values instead of just observing them on the webpage?

I think you might need to add some JavaScript code to record the FPS values. The FPS calculation logic is around here. It should be straightforward to store the values in an array and print them out.
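A minimal sketch of such a recorder, assuming you hook it into the demo's existing `requestAnimationFrame` loop near its FPS calculation logic (`FpsRecorder` and its methods are hypothetical names, not part of the demo):

```javascript
// Records instantaneous FPS per frame instead of only displaying it.
class FpsRecorder {
  constructor() {
    this.lastTime = null;
    this.samples = []; // instantaneous FPS, one entry per frame
  }

  // Call once per frame with a timestamp in milliseconds, e.g. the
  // argument requestAnimationFrame passes to its callback.
  tick(nowMs) {
    if (this.lastTime !== null) {
      const dt = nowMs - this.lastTime; // frame time in ms
      if (dt > 0) this.samples.push(1000 / dt);
    }
    this.lastTime = nowMs;
  }

  // Mean FPS over all recorded frames.
  average() {
    if (this.samples.length === 0) return 0;
    return this.samples.reduce((a, b) => a + b, 0) / this.samples.length;
  }

  // Dump the raw samples, e.g. to copy into an analysis script.
  dump() {
    console.log(JSON.stringify(this.samples));
  }
}
```

In the render loop you would call `recorder.tick(performance.now())` every frame and `recorder.dump()` (or `recorder.average()`) once the measurement is done.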