dmlc/mshadow

Question about `broadcast_with_axis`

jermainewang opened this issue · 4 comments

Why is `broadcast_with_axis` designed to create a tensor of `ndim + 1` dimensions? This is a little annoying sometimes. For example, if I need to broadcast a [1, 200] tensor to a [100, 200] tensor, I first have to broadcast it to a [1, 100, 200] tensor and then reshape it to get rid of the redundant dim (sketched below). Is there any special concern behind this design? I think a keepdim broadcast would be more convenient (i.e., [1, 200] directly to [100, 200]).
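
For concreteness, here is a minimal sketch of the two-step workaround, assuming mshadow's usual CPU `Tensor` API (`NewTensor`, `reshape`); the exact `broadcast_with_axis` overload may differ across versions:

```cpp
// Sketch of the two-step workaround: broadcast to ndim+1, then reshape.
// Assumes mshadow's CPU API; exact signatures may vary by version.
#include "mshadow/tensor.h"

using namespace mshadow;
using namespace mshadow::expr;

int main() {
  InitTensorEngine<cpu>();

  // Source: a [1, 200] tensor we want broadcast to [100, 200].
  Tensor<cpu, 2, float> src = NewTensor<cpu>(Shape2(1, 200), 1.0f);
  Tensor<cpu, 3, float> mid = NewTensor<cpu>(Shape3(1, 100, 200), 0.0f);
  Tensor<cpu, 2, float> dst = NewTensor<cpu>(Shape2(100, 200), 0.0f);

  // Step 1: broadcast_with_axis yields an ndim+1 result, [1, 100, 200].
  mid = broadcast_with_axis(src, 0, 100);
  // Step 2: reshape away the redundant leading dim to get [100, 200].
  dst = reshape(mid, Shape2(100, 200));

  FreeSpace(&src);
  FreeSpace(&mid);
  FreeSpace(&dst);
  ShutdownTensorEngine<cpu>();
  return 0;
}
```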

Because `reduce_with_axis` eliminates the dim; `broadcast_with_axis` is designed to reverse that.
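
As a rough sketch of the intended pairing (assuming `reduce_with_axis` and `red::sum` from mshadow's extensions; the template arguments may differ by version):

```cpp
// Sketch: reduce_with_axis drops an axis, broadcast_with_axis adds one back.
// Assumes the same setup as the sketch above; signatures may vary by version.
Tensor<cpu, 2, float> x = NewTensor<cpu>(Shape2(100, 200), 1.0f);
Tensor<cpu, 1, float> r = NewTensor<cpu>(Shape1(200), 0.0f);

// Reducing over axis 0 eliminates that dim: (100, 200) -> (200,).
r = reduce_with_axis<red::sum, false>(x, 0);
// broadcast_with_axis then re-inserts a size-100 axis, undoing the
// dimensionality change of the reduction.
```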

Hmm, but broadcasting is different. For example, if I have a (200,) tensor and call `broadcast_with_axis(t, 0, 100)`, I can only get a (200, 100) tensor. What if I want a (100, 200) tensor? A minimal repro is sketched below.
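
```cpp
// Minimal repro sketch; assumes the same mshadow setup as above.
Tensor<cpu, 1, float> t = NewTensor<cpu>(Shape1(200), 1.0f);
Tensor<cpu, 2, float> out = NewTensor<cpu>(Shape2(200, 100), 0.0f);

// With axis = 0 the new dimension is appended: (200,) -> (200, 100).
// I see no axis value that directly produces (100, 200).
out = broadcast_with_axis(t, 0, 100);
```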

I think it can prepend; trying either 0 or -1 should do it.
I think it should be 0.

I tried `broadcast_with_axis` with axis 0, and it appends.
And -1 causes the program to hang.