gorgonia/tensor

Weird behavior when Iterating over tensors

Opened this issue · 3 comments

Whilst working with the tensor.Iterator functionality I found some strange behavior, when using iterator in a for loop like follows:

it = t.Iterator()  // t is a tensor.Dense
for i, err := it.Start(); err == nil; i, err = it.Next() {
	fmt.Printf("i = %v, coord = %v\n", i, it.Coord())
}
  1. When using the iterator as stated above, the 'first' element of the tensor is always last in the iteration. This is not really a big issue if you want to iterate over all elements and order does not matter, but it is weird nonetheless. As an example consider the tensor [1, 2, 3, 4] with shape (2, 2). It will visit the elements in order:
    i = 0, coord = [0 1]
    i = 1, coord = [1 0]
    i = 2, coord = [1 1]
    i = 3, coord = [0 0]  <------------ should be first
    
  2. When iterating over a vector or a scalar, the indices are off by one. Again same tensor, but as a vector (shape (4,)):
    i = 0, coord = [1]
    i = 1, coord = [2]
    i = 2, coord = [3]
    i = 3, coord = [4]  <----------- gives index error when used
    
    The same thing happens with 'scalar-like' tensors, like: [1] with shape (1, 1, 1):
    i = 0, coord = [1 0 0]  <--------- also index error, should be [0 0 0] always for scalars
    
    Interestingly, when the tensor is a vector/scalar View of a tensor that is not a vector/scalar (i.e. obtained by slicing), the issue does not happen.

What I found to work properly is to use the following for loop instead, but I don't think this is the intended way of iterating.

it = t.Iterator()
for it.Reset(); !it.Done(); it.Next() {
	...
}

So, the .Coord() method is meant to be read before use. The docs do mention that .Coord() returns the next coordinate. This is because an Iterator is a data structure that stores the state. So when you call Next or Start you are changing the state.

I know this is really not ideal. We would love some ideas on how to make this more user friendly. One idea that I had earlier in the design was to not only return the i but also the coords. But it turns out for most activities you only need the index.

So, the .Coord() method is meant to be read before use. The docs do mention that .Coord() returns the next coordinate. This is because an Iterator is a data structure that stores the state. So when you call Next or Start you are changing the state.

Ah okay, that is interesting. I guess I must have missed this part of the documentation. Maybe it would be good to then change the example of how to use Iterator (example) to something that is intended?

I know this is really not ideal. We would love some ideas on how to make this more user friendly. One idea that I had earlier in the design was to not only return the i but also the coords. But it turns out for most activities you only need the index.

Perhaps an easy solution would be to add a method which converts the index i to a coordinate instead? Then it would be more an opt-in feature and doesn't hurt those that do not need it. Of course, it is possible to compute this yourself using the Shape but this would be a lot more convenient.

Having some fix for this would be incredibly useful!