wahn/rs_pbrt

Variable Stack Allocation

wahn opened this issue · 5 comments

wahn commented

See PBRT book ...

> rg -tcpp "ALLOCA\("
core/pbrt.h
91:#define ALLOCA(TYPE, COUNT) (TYPE *) alloca((COUNT) * sizeof(TYPE))

shapes/loopsubdiv.cpp
429:    Point3f *pRing = ALLOCA(Point3f, valence);
459:    Point3f *pRing = ALLOCA(Point3f, valence);

shapes/nurbs.cpp
79:    Homogeneous3 *cpWork = ALLOCA(Homogeneous3, order);
120:    Homogeneous3 *iso = ALLOCA(Homogeneous3, std::max(uOrder, vOrder));

core/film.h
137:        int *ifx = ALLOCA(int, p1.x - p0.x);
143:        int *ify = ALLOCA(int, p1.y - p0.y);

core/reflection.cpp
322:    Float *ak = ALLOCA(Float, bsdfTable.mMax * bsdfTable.nChannels);
543:    Float *ak = ALLOCA(Float, bsdfTable.mMax * bsdfTable.nChannels);
613:    Float *ak = ALLOCA(Float, bsdfTable.mMax);

... and question on stackoverflow.com ...

wahn commented

E.g. FourierBSDF uses ALLOCA:

Spectrum FourierBSDF::f(const Vector3f &wo, const Vector3f &wi) const {
...
    Float *ak = ALLOCA(Float, bsdfTable.mMax * bsdfTable.nChannels);
...
}
Spectrum FourierBSDF::Sample_f(const Vector3f &wo, Vector3f *wi,
                               const Point2f &u, Float *pdf,
                               BxDFType *sampledType) const {
...
    Float *ak = ALLOCA(Float, bsdfTable.mMax * bsdfTable.nChannels);
...
}
Float FourierBSDF::Pdf(const Vector3f &wo, const Vector3f &wi) const {
...
    Float *ak = ALLOCA(Float, bsdfTable.mMax);
...
}

Example scene, using FourierBSDF, being rendered by the C++ version:

> time ~/builds/pbrt/release/pbrt dof-dragons.pbrt
...
8107.088u 8.688s 17:24.15 777.2%	0+0k 856+2736io 5pf+0w

[image: dof-dragons]

wahn commented

> time ~/git/github/rs_pbrt/target/release/rs_pbrt -i dof-dragons_no_exrs.pbrt
pbrt version 0.7.2 [Detected 8 cores]
Copyright (c) 2016-2019 Jan Douglas Bert Walter.
Rust code based on C++ code by Matt Pharr, Greg Humphreys, and Wenzel Jakob.
Film "image"
  "string filename" ["dof-dragons.exr"]
  "integer xresolution" [1000]
  "integer yresolution" [424]
Sampler "sobol"
  "integer pixelsamples" [1024]
reading "bsdfs/roughglass_alpha_0.2.bsdf" returns true
reading "bsdfs/roughgold_alpha_0.2.bsdf" returns true
Rendering with 8 thread(s) ...
1701 / 1701 [========================================================================] 100.00 % 0.87/s  
Writing image "pbrt.png" with bounds Bounds2 { p_min: Point2 { x: 0, y: 0 }, p_max: Point2 { x: 1000, y: 424 } }
15257.524u 80.165s 32:56.53 775.9%	0+0k 5048+704io 22pf+0w

[image: pbrt]
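
For a rough comparison of the two runs, the elapsed (wall-clock) field of each `time` line can be converted to seconds and divided. The sketch below does only that; the helper names `parse_elapsed` and `slowdown` are made up for illustration. The two runs used slightly different scene files (dof-dragons.pbrt and dof-dragons_no_exrs.pbrt), so the ratio is only approximate.

```rust
/// Parse an elapsed time such as "17:24.15" (MM:SS.ss) or "1:02:03" (H:MM:SS)
/// into seconds. Returns None if any field is not a number.
fn parse_elapsed(s: &str) -> Option<f64> {
    let mut total = 0.0;
    for part in s.split(':') {
        let value: f64 = part.parse().ok()?;
        // Each further field is one unit smaller by a factor of 60.
        total = total * 60.0 + value;
    }
    Some(total)
}

/// Ratio of the Rust elapsed time to the C++ elapsed time.
/// (Hypothetical helper, not part of rs_pbrt.)
fn slowdown(cpp_elapsed: &str, rust_elapsed: &str) -> Option<f64> {
    Some(parse_elapsed(rust_elapsed)? / parse_elapsed(cpp_elapsed)?)
}
```

With the elapsed times from the two runs, `slowdown("17:24.15", "32:56.53")` comes out at roughly 1.9, so the Rust render took nearly twice as long as the C++ one.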

wahn commented

Can we use smallvec for Rust?
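
The idea, as a minimal std-only sketch: keep the scratch buffer inline (on the stack) while it is small and fall back to the heap only when it is not. The smallvec crate packages this pattern as `SmallVec<[T; N]>`; the code below imitates it by hand so that it needs no dependencies. `Scratch`, `INLINE_CAP` and `sum_coefficients` are hypothetical names, and the capacity of 128 is an arbitrary choice for illustration, not a value taken from rs_pbrt.

```rust
/// Inline capacity, chosen arbitrarily for this sketch.
const INLINE_CAP: usize = 128;

/// Scratch storage that stays inline when `len <= INLINE_CAP`
/// and uses a heap-allocated Vec otherwise.
enum Scratch {
    Inline { buf: [f32; INLINE_CAP], len: usize },
    Heap(Vec<f32>),
}

impl Scratch {
    /// Create a zero-filled buffer of `len` elements.
    fn zeroed(len: usize) -> Self {
        if len <= INLINE_CAP {
            Scratch::Inline { buf: [0.0; INLINE_CAP], len }
        } else {
            Scratch::Heap(vec![0.0; len])
        }
    }

    /// View the used part of the buffer as a mutable slice.
    fn as_mut_slice(&mut self) -> &mut [f32] {
        match self {
            Scratch::Inline { buf, len } => &mut buf[..*len],
            Scratch::Heap(v) => v.as_mut_slice(),
        }
    }

    /// True if no heap allocation was needed.
    fn is_inline(&self) -> bool {
        matches!(self, Scratch::Inline { .. })
    }
}

/// Hypothetical stand-in for code that needs an `ak`-style buffer of
/// `m_max * n_channels` floats: fill it with 0, 1, 2, ... and sum it.
fn sum_coefficients(m_max: usize, n_channels: usize) -> f32 {
    let mut ak = Scratch::zeroed(m_max * n_channels);
    let slice = ak.as_mut_slice();
    for (i, a) in slice.iter_mut().enumerate() {
        *a = i as f32;
    }
    slice.iter().sum()
}
```

Unlike alloca, the inline capacity has to be fixed at compile time, so the trade-off is between wasting stack space on a large `INLINE_CAP` and spilling to the heap too often with a small one.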

wahn commented

Using heaptrack before using smallvec:

[image: before]

And afterwards:

[image: afterwards]

The first and seventh columns are affected by these two commits (7b7df2e and 60568aa).

wahn commented

Hard to measure any performance improvements though ... Closing for now.