mitsuba-renderer/drjit

In backward mode, the function block_sum() is not supported for attached arrays

Theo-Wu opened this issue · 4 comments

I am adding two arrays of different shapes with manual broadcasting using dr.repeat() and dr.tile(). Because dr.sum() does not support summing along an axis, I use dr.block_sum().
For example:

a = mi.Int([1, 2])
b = mi.Float([3, 4, 5])
a = dr.repeat(a, len(b)) # [1, 1, 1, 2, 2, 2]
b = dr.tile(b, len(a))   # [3, 4, 5, 3, 4, 5]
result = dr.block_sum(a * b, len(b)) # [3*1 + 4*1 + 5*1 + 3*2 + 4*2 + 5*2]

In forward rendering it performs as expected, but calling dr.backward(loss) throws an exception:

Exception: block_sum_(): not supported for attached arrays!

What's the suggested way to achieve this goal? I'd appreciate it if you could help me with this.

Hi @Theo-Wu

I'm going to assume you meant the following:

import mitsuba as mi
import drjit as dr
mi.set_variant('cuda_ad_rgb')

a = mi.Int([1, 2])
b = mi.Float([3, 4, 5])

a = dr.repeat(a, len(b)) # [1, 1, 1, 2, 2, 2]
b = dr.tile(b, 2)        # [3, 4, 5, 3, 4, 5]

result = a * b # [3, 4, 5, 6, 8, 10]
result = dr.block_sum(result, 3) # [12, 24]

You have a few different options:
If the block size matches the width of an existing type (here, 3 matches mi.Point3f), you can use dr.unravel() followed by dr.sum():

result = dr.unravel(mi.Point3f, a * b) # [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]
result = dr.sum(result) # [12, 24]

If the width doesn't match an existing type, I'm afraid this will require a normal Python loop and some simple gather/scatter operations, as sketched below. This is not ideal in terms of performance (it depends on the loop size), but it is at least differentiable.
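For example, here is a minimal sketch of such a loop-based fallback, assuming a block size n that does not correspond to any built-in type. The helper name block_sum_loop is hypothetical; the key point is that dr.gather is differentiable, so gradients propagate through the accumulation:

import drjit as dr
import mitsuba as mi
mi.set_variant('cuda_ad_rgb')

def block_sum_loop(value, n):
    # Sum consecutive blocks of size n using a short Python loop.
    num_blocks = dr.width(value) // n
    index = dr.arange(mi.UInt32, num_blocks)
    result = dr.zeros(mi.Float, num_blocks)
    for i in range(n):
        # dr.gather is differentiable, unlike dr.block_sum
        result += dr.gather(mi.Float, value, index * n + i)
    return result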

I encountered the same issue when trying to implement a neural network using only Dr.Jit. Is there any recommended way to implement neural networks that are differentiable?

I'll close this as the original question has been answered.


Is there any recommended way to implement neural networks that are differentiable?

In general, in its current state, Dr.Jit is not well-suited for this. We do not use all of the hardware features available to make neural networks efficient, and the system was not built with this use case in mind. However, it should still be possible. My recommendation would be to do it from C++, where the type system isn't as limited as it is from Python, simply because you can define new types as you need them.

I encountered the same issue when trying to implement a neural network using only Dr.Jit. Is there any recommended way to implement neural networks that are differentiable?

I ran into the exact same issue, but it turns out that you can use dr.scatter_reduce to achieve the same result as dr.block_sum while remaining differentiable. See mitsuba-renderer/mitsuba3#579 (reply in thread) for an example.
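For reference, here is a minimal sketch of that approach, reusing the a * b values from the example above (6 elements, blocks of size 3); the exact call in the linked thread may differ:

import drjit as dr
import mitsuba as mi
mi.set_variant('cuda_ad_rgb')

value = mi.Float([3, 4, 5, 6, 8, 10])
dr.enable_grad(value)

block_size = 3
num_blocks = dr.width(value) // block_size

# Element i belongs to block i // block_size
index = dr.arange(mi.UInt32, dr.width(value)) // block_size

result = dr.zeros(mi.Float, num_blocks)
dr.scatter_reduce(dr.ReduceOp.Add, result, value, index) # [12, 24]

# Unlike dr.block_sum, this works on attached arrays:
dr.backward(dr.sum(result))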