Small optimization
nsd4fr opened this issue · 7 comments
You can optimize this even further by adding in just a couple more lines.
for (size_t i=0; i<vector.size(); i++)
evaluates vector.size() on every loop. This essentially tacks on linear search through the vector every time you increment i. Change to the following:
size_t last = vector.size(); for (size_t i=0; i<last; i++)
or even better, to not add any additional lines:
for (auto it=vector.begin(); it!=vector.end(); it++)
I understand that your goal was not to be perfectly optimal, but readable. However, this second suggestion is almost just as readable as the original, while making your program significantly faster, especially as the number of spheres/objects in the scene grows.
Have you measured the performance changes? I tend to be prudent about such optimizations because the copiler is not dumb.
I agree. The compiler is usually pretty good. And by declaring the vector as const in all your functions, you may be allowing the compiler to automatically perform the first optimization I am describing.
I ran some tests with the timer available in this repo. Specifically, timer.cpp and timer.h.
Here are my results:
compiled as
g++ tinyraytracer.cpp timer.cpp -O2
Old denotes using the way it currently is, for (size_t i...
New denotes using the way I suggested, for (auto it = vector.begin(); it !=vector.end(); ...
as well as the appropriate changes in vector access ( vector[i].method()
to it->method()
)
Results:
4 Spheres:
Old: 1.97684s
New: 1.823s
12 Spheres:
Old: 4.789s
New: 4.489s
28 Spheres:
Old: 8.399s
New: 7.586s
Discussion
I admit this is a very minor change, but this change obviously has more impact as the sphere size grows.
Attached is the edited files, with my optimizations turned on. I went ahead and attached your files as well, just for a quick download and compilation.
using .size() on a vector doesn't do a linear search. the size is tracked internally. the main overhead would be that of the function call to retrieve size.. but that should be optimized/inlined by any sane compiler, no?
If you're seeing a speed increase, I would posit that the change to access via the iterator -> is what is giving you an improvement, vs the indexed referencing. Of course the most optimal way would be to take a pointer to the first element, and the last element, and loop while ptr != lastptr. Has the advantage of not pulling in any external iterator support libraries, etc.
Anyway.. don't want to come across negative., but what you describe runs counter to my intuition.. would be interesting to see a deeper dive into this!
you can also preserve the original style by typing for (size_t i=0, end=vector.size(); i < end; i++)
@manthrax Thanks for the perspective! I've recently gotten a book on optimizing code, and one of the optimizations discussed was not using container.size()
type methods because they can add linear run times. I should have checked the spec for std::vector, because you are correct that it is a constant time method. It also makes sense that direct content access via the iterator is responsible for the change.