wahn/rs_pbrt

anim-bluespheres.pbrt

wahn opened this issue · 17 comments

wahn commented

Compare C++ vs. Rust render times (on a Purism laptop):

$ time ~/builds/pbrt/release/pbrt anim-bluespheres.pbrt
pbrt version 3 (built Apr  1 2019 at 17:45:44) [Detected 4 cores]
Copyright (c)1998-2018 Matt Pharr, Greg Humphreys, and Wenzel Jakob.
The source code to pbrt (but *not* the book contents) is covered by the BSD License.
See the file LICENSE.txt for the conditions of the license.
Rendering: [++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++]  (897.6s)
Statistics:
...
real    14m58.087s
user    52m55.853s
sys     0m0.932s

So, the C++ implementation needs about 15 minutes on a 4-core Linux laptop.

wahn commented
$ time ~/git/self_hosted/Rust/pbrt/target/release/rs_pbrt -i anim-bluespheres.pbrt
pbrt version 0.7.1 [Detected 4 cores]
Copyright (c) 2016-2019 Jan Douglas Bert Walter.
Rust code based on C++ code by Matt Pharr, Greg Humphreys, and Wenzel Jakob.
Film "image"
  "string filename" ["anim-bluespheres.exr"]
  "integer xresolution" [800]
  "integer yresolution" [400]
Sampler "sobol"
  "integer pixelsamples" [1024]
Integrator "path"
Rendering with 4 thread(s) ...
1250 / 1250 [=========================================================================] 100.00 % 0.87/s
Writing image "pbrt.png" with bounds Bounds2 { p_min: Point2 { x: 0, y: 0 }, p_max: Point2 { x: 800, y: 400 } }
Writing image "pbrt_rust.exr" with bounds Bounds2 { p_min: Point2 { x: 0, y: 0 }, p_max: Point2 { x: 800, y: 400 } }

real    24m7.235s
user    83m40.080s
sys     0m4.741s

So, the Rust implementation needs more than 24 minutes on a 4-core Linux laptop.
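
To put a number on the gap, the two `real` timings above can be compared directly. This is only a back-of-the-envelope sketch; the helper names `to_seconds` and `slowdown` are made up for illustration and are not part of rs_pbrt.

```rust
/// Convert a `time`-style reading such as 14m58.087s into seconds.
/// (Hypothetical helper, for illustration only.)
fn to_seconds(minutes: u32, seconds: f64) -> f64 {
    f64::from(minutes) * 60.0 + seconds
}

/// How many times longer the Rust run took compared to the C++ run.
fn slowdown(cpp_seconds: f64, rust_seconds: f64) -> f64 {
    rust_seconds / cpp_seconds
}

fn main() {
    let cpp = to_seconds(14, 58.087); // wall-clock time of the C++ render
    let rust = to_seconds(24, 7.235); // wall-clock time of the Rust render
    println!(
        "C++: {:.3}s, Rust: {:.3}s, Rust/C++ ratio: {:.2}",
        cpp,
        rust,
        slowdown(cpp, rust)
    );
}
```

With these timings the Rust render takes roughly 1.6 times as long as the C++ one.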

wahn commented

@matklad suggested using heaptrack ...

wahn commented

Here is the difference. The heaptrack result for the C++ code:

heaptrack pbrt

And the Rust counterpart:

heaptrack rs_pbrt

I wonder why allocations are happening in the rendering loop at all? It seems like only scene construction should allocate. I think this is orthogonal to arenas: if you don't allocate, you don't need an arena either.

wahn commented

I started to describe the problem here:

https://www.rs-pbrt.org/blog/arena-based-allocation/

wahn commented

Instead of implementing MemoryArena I tried to replace dynamic dispatch with an enum for two traits:

  1. Fresnel trait -> Fresnel enum
  2. Bxdf trait -> Bxdf enum

This did not really improve performance, but it might still be worth trying this for other traits ...
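
For readers unfamiliar with the pattern, here is a minimal sketch of what replacing a boxed trait object with an enum can look like. The types `NoOp` and `Constant` and the functions `shade_dyn` and `shade_enum` are simplified, hypothetical stand-ins; the actual Fresnel and Bxdf code in rs_pbrt has more variants and methods.

```rust
// Hypothetical, simplified stand-ins for illustration; not the rs_pbrt types.
trait FresnelTrait {
    fn evaluate(&self, cos_theta_i: f32) -> f32;
}

struct NoOp;
struct Constant {
    value: f32,
}

impl FresnelTrait for NoOp {
    fn evaluate(&self, _cos_theta_i: f32) -> f32 {
        1.0
    }
}

impl FresnelTrait for Constant {
    fn evaluate(&self, _cos_theta_i: f32) -> f32 {
        self.value
    }
}

// Before: dynamic dispatch through a boxed trait object.
// Every call to this function performs a heap allocation.
fn shade_dyn(cos_theta_i: f32) -> f32 {
    let fresnel: Box<dyn FresnelTrait> = Box::new(Constant { value: 0.5 });
    fresnel.evaluate(cos_theta_i)
}

// After: a closed set of variants stored inline in an enum.
enum Fresnel {
    NoOp(NoOp),
    Constant(Constant),
}

impl Fresnel {
    fn evaluate(&self, cos_theta_i: f32) -> f32 {
        // Static dispatch: a match instead of a vtable lookup.
        match self {
            Fresnel::NoOp(f) => f.evaluate(cos_theta_i),
            Fresnel::Constant(f) => f.evaluate(cos_theta_i),
        }
    }
}

// The enum value lives on the stack, so no heap allocation is needed here.
fn shade_enum(cos_theta_i: f32) -> f32 {
    let fresnel = Fresnel::Constant(Constant { value: 0.5 });
    fresnel.evaluate(cos_theta_i)
}

fn main() {
    println!("dyn: {}, enum: {}", shade_dyn(0.3), shade_enum(0.3));
    println!("no-op: {}", Fresnel::NoOp(NoOp).evaluate(0.3));
}
```

The trade-off is that the enum is a closed set: adding a new variant means touching every `match`, whereas a trait stays open for extension.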

Did this actually reduce the number of allocations? Obviously it should have, but it's always helpful to double-check.

wahn commented
$ more heaptrack_enum.txt 
heaptrack stats:
        allocations:            259771340
        leaked allocations:     15
        temporary allocations:  76797682

$ more heaptrack_trait.txt 
heaptrack stats:
        allocations:            265922570
        leaked allocations:     200
        temporary allocations:  77808309

So it looks like those traits were far from the largest source of allocations?
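
The two heaptrack summaries above can be compared with a little arithmetic. This is a sketch only; the function name `relative_reduction` is made up for illustration.

```rust
/// Fraction by which a counter dropped going from `before` to `after`.
/// (Hypothetical helper, for illustration only.)
fn relative_reduction(before: u64, after: u64) -> f64 {
    (before - after) as f64 / before as f64
}

fn main() {
    // Numbers copied from heaptrack_trait.txt (before) and heaptrack_enum.txt (after).
    let allocations = (265_922_570_u64, 259_771_340_u64);
    let temporary = (77_808_309_u64, 76_797_682_u64);

    println!(
        "allocations: {} fewer ({:.2} %)",
        allocations.0 - allocations.1,
        100.0 * relative_reduction(allocations.0, allocations.1)
    );
    println!(
        "temporary allocations: {} fewer ({:.2} %)",
        temporary.0 - temporary.1,
        100.0 * relative_reduction(temporary.0, temporary.1)
    );
}
```

So the enum change removed about 6.2 million allocations, which is only around 2.3 % of the total. The bulk of the allocations must come from somewhere else.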

wahn commented

I think a big chunk of the allocations will go to SurfaceInteraction, because it is created dynamically during rendering whenever a ray hits a shape/geometry, and it is then used for shading ...

wahn commented

heaptrack

Aha, so I think this is the main problem:

Some(Arc::new(self.clone())),

So, what the C++ version does is pre-allocate all the shapes:

https://github.com/mmp/pbrt-v3/blob/9adaee88487772059b65bc322fb2b03ad033d9fc/src/shapes/triangle.cpp#L106-L108

This allows it to store a raw pointer in the Intersection. I think directly translating that to Rust should change shape: Option<Arc<dyn Shape + Send + Sync>> to shape: Option<&'a (dyn Shape + Send + Sync)>. This adds a lifetime, but presumably, because the C++ code works, the lifetimes should actually work out.
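
A minimal sketch of that suggestion, under simplifying assumptions: the `Shape` trait, `Sphere`, `OwnedHit`, `BorrowedHit` and both `intersect_*` methods below are hypothetical stand-ins, not the rs_pbrt definitions. Note that Rust requires parentheses around the trait-object bounds behind a reference, i.e. `&'a (dyn Shape + Send + Sync)`.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-ins for illustration; not the rs_pbrt types.
trait Shape {
    fn area(&self) -> f32;
}

#[derive(Clone)]
struct Sphere {
    radius: f32,
}

impl Shape for Sphere {
    fn area(&self) -> f32 {
        4.0 * std::f32::consts::PI * self.radius * self.radius
    }
}

// Before: the hit record owns a freshly allocated copy of the shape.
struct OwnedHit {
    shape: Option<Arc<dyn Shape + Send + Sync>>,
}

// After: the hit record borrows the shape, which already lives in the scene.
struct BorrowedHit<'a> {
    shape: Option<&'a (dyn Shape + Send + Sync)>,
}

impl Sphere {
    // Clones the shape and heap-allocates an Arc on every intersection.
    fn intersect_owned(&self) -> OwnedHit {
        let shape: Arc<dyn Shape + Send + Sync> = Arc::new(self.clone());
        OwnedHit { shape: Some(shape) }
    }

    // No clone and no allocation: the lifetime ties the hit to the shape.
    fn intersect_borrowed(&self) -> BorrowedHit<'_> {
        let shape: &(dyn Shape + Send + Sync) = self;
        BorrowedHit { shape: Some(shape) }
    }
}

fn main() {
    let sphere = Sphere { radius: 1.0 };
    let owned = sphere.intersect_owned();
    let borrowed = sphere.intersect_borrowed();
    println!("owned area: {}", owned.shape.unwrap().area());
    println!("borrowed area: {}", borrowed.shape.unwrap().area());
}
```

The borrowed variant only compiles as long as the hit record does not outlive the shape, which is the same invariant the C++ code relies on implicitly when it stores a raw pointer.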

wahn commented

After commit 6db4dac the render time went down to 21m36s. Still too long ...

wahn commented

Commit e7bc14b fixes a bug which affects the acceleration structures (like the BVH). But the render time is still 22m37s. The fix is more likely to show an effect in scenes with heavy geometry. Let's run such a test on another machine (with more threads) ...

wahn commented

This is a long-term project and has nothing to do with the anim-bluespheres.pbrt test scene. Closing for now ...