Typed array matrices
sloisel opened this issue · 9 comments
How is a typed array matrix specified? Is it an Array of typed Arrays? Or is it a single flat typed array? Is it reasonable to take a flat Float64Array A which represents a matrix and create one Float64Array view per row of A? Can numeric.dot() be made to handle such Array[Float64Array] matrices? Would it be useful to have "flat" versions of numeric.dot()? What about numeric.svd() and numeric.eig()?
I'm not sure how to do this within numeric.js, but I've been experimenting with this concept recently. Here are some sketches of how this might work:
https://github.com/mikolalysenko/ndarray
https://github.com/mikolalysenko/ndarray-experiments
I think the approach of using a typed array with custom accessors seems promising, and I've been working on building a generic cache oblivious map/reduce compiler on top of this to implement component-wise operations (though it is not done yet).
For numeric, the most painless path forward is probably to use the array of contiguous typed arrays approach when initializing objects, but it is by no means the best solution. Long term, the better approach is probably to do everything with typed arrays, since it reduce the amount of extra memory by a factor of n^(d-1) and the cost of each look up by O(d) memory accesses. However making these changes in the short term will probably be very painful... :(
Recently, typed arrays are allowed for the vectorized functions such as numeric.add.
It would be logical to add typed array support for numeric.dot.
In order to support typed arrays for the rest of the library, it is more difficult.
Right, but only for 1D typed arrays. What if you have a multidimensional strided array? Keeping track of how everything matches up is not really possible at the moment.
The philosophy right now is to use bare JS types as much as possible so typed arrays would have to be used in that way as well and hence I think the only concern is 1d typed arrays. Striding becomes important when you do matrix multiplication, which is not currently implemented for typed arrays, but it would make sense to implement numeric.dot, numeric.solve, etc... for 1d typed arrays. This hasn't been done yet.
It will be hard to adapt all the higher level codes such as solveLP and uncmin. The only way I was able to imagine doing this was to add typed array support to the type T and then cast everything to a T when you enter these higher-level functions, so that the higher-level functions are polymorphic and able to work on typed and generic arrays.
Thanks,
Sébastien Loisel
That could work. Also, regarding operations on strided arrays, here is a library I put together for doing pointwise/map-reduce operations on arrays with different stridings:
https://github.com/mikolalysenko/cwise
It works pretty well. For arrays with compatible stridings it performs about as well as manually unrolled for loops, and for arrays with incompatible ordering it uses a cache oblivious recursive sweep that comes within a factor of two of the linearly unrolled ordering.
I've been experimenting with hybrid typed arrays: the 1D array is a typed array, and the 2D matrix is an array of typed subarrays. This is the arrays of arrays part of the nice ndarray article.
Here's a simple example: A typed array is created with a specified rows & cols length and filled with consecutive numbers. Then it is wrapped in a javascript array of subarrays, giving the appearance of being a 2D array, but having the backing store, so to speak, of a typed array. This is particularly useful for pixel manipulation in Canvas, and for webGL use.
Although TAs are more limited than JS arrays, there are a few useful capabilities. The set() method is a very fast way to populate a subset of an array, or in this example, rows.
-- Owen
var rows = 3, cols = 5
var ta = new Float32Array(rows*cols);
for (var i = 0; i < ta.length; i++) {
ta[i] = i;
}
var matrix = []
for (var row = 0; row < rows; row++) {
matrix[row] = ta.subarray(row*cols, (row+1)*cols)
}
console.log( matrix[0][3] ) // 3
console.log( matrix[1][2] ) // 7
console.log( ta )
console.log( matrix )
@backspaces: I actually tried this very experiment myself and you can see the results here:
https://github.com/mikolalysenko/ndarray-experiments
And here is the relevant file:
https://github.com/mikolalysenko/ndarray-experiments/blob/master/multi_cont.js
The short story is that it is kind of break even between doing this and arrays of arrays. For some situations arrays of typed array views are faster, while for others they are much slower.
I like very much the cwise.
I also like the Array of typed Arrays -- I see you are careful to allocate once and create views into it; this is better than allocating lots of typed arrays.
@sloisel: Thanks! I'm glad you like it. I actually wrote a few articles on ndarrays and cwise:
http://0fps.wordpress.com/2013/05/22/implementing-multidimensional-arrays-in-javascript/
http://0fps.wordpress.com/2013/05/28/cache-oblivious-array-operations/
http://0fps.wordpress.com/2013/06/01/ndarray-modules/
I haven't yet pushed this idea as far as you've gone with numeric.js and still haven't created any tools for doing dense linear algebra. However I think that for image processing and volume graphics this approach seems promising.