Adjoint solver performance bottleneck
CharlesDove opened this issue · 1 comment
Hi folks, thanks for the great package! While trying out the adjoint solver, I noticed what appears to be a significant performance bottleneck. Specifically, in filter_source.py, starting on line 40:
    self.bf = [
        lambda t, i=i: 0
        if t > self.T
        else (
            self.nuttall(t, self.center_frequencies)
            / (self.dt / np.sqrt(2 * np.pi))
        )[i]
        for i in range(len(self.center_frequencies))
    ]
produces a list of Python lambdas representing the sources. This list is eventually passed into the C++ engine, which calls back into the Python lambdas at each time step. Removing this call seems to yield a ~10x speed improvement. Perhaps this functionality should be pushed into the C++ engine or otherwise compiled?
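For anyone who wants to gauge the per-call cost in isolation, here is a minimal sketch of the same pattern. It assumes a Hann-shaped stand-in (`window`) in place of the real `self.nuttall`, and the names `make_basis`, `freqs`, `T`, and `dt` are hypothetical, so the absolute timing will differ from what Meep sees.

```python
import timeit

import numpy as np


def window(t, freqs, T):
    """Hypothetical stand-in for self.nuttall: returns one value per frequency."""
    freqs = np.asarray(freqs, dtype=float)
    envelope = 0.5 * (1.0 - np.cos(2.0 * np.pi * t / T))  # Hann, not Nuttall
    return envelope * np.exp(-2j * np.pi * freqs * t)


def make_basis(freqs, T, dt):
    # Same shape as the snippet above: one lambda per frequency. The default
    # argument i=i binds the index at creation time, so each lambda keeps its own i.
    return [
        lambda t, i=i: 0
        if t > T
        else (window(t, freqs, T) / (dt / np.sqrt(2 * np.pi)))[i]
        for i in range(len(freqs))
    ]


freqs = np.linspace(0.5, 0.7, 10)  # illustrative values only
T, dt = 100.0, 0.01
bf = make_basis(freqs, T, dt)

# Time a single callback. Note that every call evaluates the window for all
# frequencies and then keeps only element i.
n_calls = 2000
per_call = timeit.timeit(lambda: bf[3](12.5), number=n_calls) / n_calls
print(f"approx. {per_call * 1e6:.1f} us per callback call")
```

Each call recomputes the full array over all center frequencies before indexing, so the per-call cost grows with the number of frequencies.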
Hi @CharlesDove, thanks for reaching out! Handling the adjoint sources is rather tricky (as we discuss in our paper), and I think the behavior you're observing is a red herring. Let me try to walk through things:
This piece of the code simply computes the different basis functions needed to properly compute a broadband source. Evaluating it (that for loop) is relatively cheap and only happens once up front. After that, all the sources' spatial profiles are cached in C++, and we call the time profile like any other custom time profile. In other words, this isn't in a "hot loop."
I suspect your speedup comes from getting rid of the long run times that come with these narrow basis functions (we talk a lot about this in the paper). Moving this to C++ shouldn't really change anything (unless your forward and adjoint simulations are really, really fast, and the overhead is dominated by this?)
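One rough way to check that last caveat is a back-of-the-envelope estimate of how much of the run could be spent in the callbacks. The sketch below is only arithmetic: `callback_fraction` and all of its parameters are hypothetical names, the numbers are illustrative rather than measured, and how many times the engine actually invokes each time profile per step is an assumption (`calls_per_step`).

```python
def callback_fraction(n_steps, n_basis, per_call_s, total_runtime_s, calls_per_step=1):
    """Estimate the fraction of wall time spent in Python time-profile callbacks.

    calls_per_step is an assumption: the number of times each basis function
    is invoked per time step.
    """
    if total_runtime_s <= 0:
        raise ValueError("total_runtime_s must be positive")
    callback_s = n_steps * n_basis * calls_per_step * per_call_s
    return callback_s / total_runtime_s


# Illustrative numbers only, not measurements from any real simulation.
frac = callback_fraction(
    n_steps=50_000,        # time steps in one forward or adjoint run
    n_basis=10,            # number of basis functions (center frequencies)
    per_call_s=5e-6,       # measured cost of one callback, in seconds
    total_runtime_s=60.0,  # measured wall time of the run, in seconds
)
print(f"estimated callback share of runtime: {frac:.1%}")
```

If the estimated share is small, the callbacks are unlikely to explain a large speedup, and the shorter run time from dropping the narrow basis functions is the more plausible cause.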