Documentation of Array Creation Unclear
robertmaxton42 opened this issue · 7 comments
Right now, it's not immediately obvious from the documentation what making an Array with a given shape and offset actually does: whether it allocates product(shape) * dsize elements and then skips the first offset, or (what I believe to be the actual implementation) allocates offset + product(shape) * dsize before skipping. It would also be nice to note whether setting nbytes zeroes the full array before use. I assume it does, but it's always good to be sure.
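To make the two readings concrete, here is a minimal sketch in plain Python. It does not use Reikna at all: `make_view` is a hypothetical helper, and a `bytearray` stands in for device memory. It only shows why the second reading needs offset + product(shape) * dsize bytes.

```python
import struct
from math import prod


def make_view(buffer, shape, offset, fmt="f"):
    """Hypothetical helper: typed view of `buffer` starting `offset` bytes in."""
    needed = prod(shape) * struct.calcsize(fmt)  # product(shape) * dsize
    if offset + needed > len(buffer):
        raise ValueError(f"need {offset + needed} bytes, buffer has {len(buffer)}")
    # Slice off the skipped prefix, then reinterpret the bytes as a typed N-D view.
    return memoryview(buffer)[offset:offset + needed].cast(fmt, shape)


shape, offset = (4, 3), 40
data_bytes = prod(shape) * struct.calcsize("f")

# Reading 1: allocate only product(shape) * dsize bytes, then skip `offset`.
# The data no longer fits behind the skipped prefix.
try:
    make_view(bytearray(data_bytes), shape, offset)
    reading_1_fits = True
except ValueError:
    reading_1_fits = False

# Reading 2: allocate offset + product(shape) * dsize bytes, then skip `offset`.
view = make_view(bytearray(offset + data_bytes), shape, offset)
```

Under these assumptions only the second reading leaves room for every element, which is why it matters that the documentation says which one applies.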
(While I'm at it, it would also be useful to document how connecting a Transformation works outside the tutorial. In particular, it's non-obvious that we both can and must make up some new parameter name, output_prime or whatever, to define the name of the new computation output, for example.)
Addendum: Apparently, passing an array object as base to another array passes the pointer at which real data actually starts, not the beginning of the originally allocated memory. So if the first array had offset 40, say, the viewing "array" would actually have to be declared with offset -40 in order to start at the same place.
I actually don't think that's an unreasonable design decision, but it should probably be documented in Thread.array.
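The pointer arithmetic behind the "offset negative 40" remark can be sketched with `ctypes` on host memory. This is only an illustration of the convention described above, not Reikna code; the buffer size and the names `raw`, `first`, `alloc_start` and `data_start` are made up for the example.

```python
import ctypes

OFFSET = 40  # bytes skipped before the first element of the first array
COUNT = 12   # number of float elements in the first array

# One raw allocation that holds the skipped prefix plus the data.
raw = (ctypes.c_char * (OFFSET + COUNT * ctypes.sizeof(ctypes.c_float)))()
alloc_start = ctypes.addressof(raw)

# The "first array": a typed view that starts OFFSET bytes into the allocation.
first = (ctypes.c_float * COUNT).from_buffer(raw, OFFSET)
data_start = ctypes.addressof(first)

# If a second array receives `data_start` as its base pointer, the offset it
# must declare to reach the start of the allocation is negative.
offset_to_alloc_start = alloc_start - data_start
```

Here `data_start - alloc_start` equals `OFFSET`, so a view based on the data pointer needs an offset of `-OFFSET` to get back to the beginning of the allocation.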
> Addendum: Apparently, passing an array object as base to another array passes the pointer at which real data actually starts, not the beginning of originally allocated memory
Actually, it was different for CUDA and OpenCL. The behavior you described was happening for CUDA, while OpenCL got the beginning of the allocated memory. I had to make both of them use the beginning of the allocated memory, because in the OpenCL case I couldn't find a way to construct a (pointer + offset) from a given buffer (get_sub_region is supposed to do that, but it just crashes the execution for me).
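As a rough model of the unified convention described in this reply, the sketch below uses plain integers as addresses. `ToyArray` and `view_on` are hypothetical names for illustration only and do not correspond to anything in Reikna, PyCUDA or PyOpenCL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyArray:
    alloc_start: int  # address of the beginning of the underlying allocation
    offset: int       # bytes from alloc_start to the first element

    @property
    def data_start(self) -> int:
        return self.alloc_start + self.offset


def view_on(base: ToyArray, offset: int) -> ToyArray:
    # Assumed convention after the change: the base contributes the beginning
    # of its allocation, regardless of the base array's own offset.
    return ToyArray(base.alloc_start, offset)


first = ToyArray(alloc_start=1000, offset=40)
same_data = view_on(first, offset=40)  # same offset reaches the same data
from_start = view_on(first, offset=0)  # offset 0 is the allocation start
```

With this convention a view reuses the base array's offset to see the same data, and no negative offsets are needed.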
> While I'm at it, it would also be useful to document the working of connecting a Transformation outside the tutorial
Well, the docs say "new or old computation parameters"... Of course, you cannot use the connector name (I added a note about that), and if you use an old name, you have to make sure the shapes match, since that is not currently checked.
So, how are the current docs looking?
Ok, closing for now.
Sorry, was going to get to this today. Yeah, it's better! I'd say you might want to mention that you're only talking about the kernel parameters themselves in your Note, since as far as I can tell ${k_param.idxs}$ still handle the offsets for you in raw kernels, but otherwise it looks good!
EDIT: Whoops, also there's a typo in Computation.connect - "object beloning to tr, or a string with its name."
Thanks, hopefully it is more clear now.