[PLAN] Improve performance with dimension compaction and indexer

Question

sonots opened this issue 6 years ago · 0 comments

Element-wise kernels (binary ops) were already done in #64.
Reduction and other kernels, such as store_from, are not done yet.
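
For readers unfamiliar with the terms in the title, here is a rough sketch of what dimension compaction and an indexer usually mean for strided arrays. It is written in Python for illustration only and is not cumo's actual code. The names `compact_dims` and `offset_of` are hypothetical, and the merge rule (fold an axis into its outer neighbour when the outer stride equals inner length times inner stride) is an assumption about the general technique.

```python
def compact_dims(shape, strides):
    """Merge adjacent axes that can be walked with a single stride.

    Hypothetical helper for illustration; strides are in bytes.
    """
    # Length-1 axes never contribute to the offset, so drop them first.
    dims = [(n, s) for n, s in zip(shape, strides) if n != 1]
    if not dims:
        return (), ()
    merged = [dims[0]]
    for n, s in dims[1:]:
        outer_n, outer_s = merged[-1]
        if outer_s == n * s:
            # One step on the outer axis equals a full pass over the
            # inner axis, so both fold into one longer axis.
            merged[-1] = (outer_n * n, s)
        else:
            merged.append((n, s))
    new_shape, new_strides = zip(*merged)
    return tuple(new_shape), tuple(new_strides)


def offset_of(i, shape, strides):
    """Indexer: map a flat element index to a byte offset."""
    off = 0
    # Peel off one axis per iteration, innermost first.
    for n, s in zip(reversed(shape), reversed(strides)):
        i, r = divmod(i, n)
        off += r * s
    return off


# C-contiguous float32 array of shape (2, 3, 4): collapses to one axis.
print(compact_dims((2, 3, 4), (48, 16, 4)))   # ((24,), (4,))
# View that skips rows on axis 0: only the two inner axes merge.
print(compact_dims((2, 3, 4), (96, 16, 4)))   # ((2, 12), (96, 4))
```

The indexer does one division and modulo per axis for every element, so a kernel that runs on the compacted shape does less index arithmetic per element while addressing the same memory.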

Without these optimizations, cumo (and red-chainer) cannot compete with cupy (and chainer).

Current performance comparison on a K80 machine: