GD06/cuda-convnet2

program will crash because of line 1473 in nvmatrix/src/nvmatrix.cu

Opened this issue · 5 comments

What steps will reproduce the problem?
1. I can reproduce it if I am lucky.
2.
3.

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?
state of the art 

Please provide any additional information below.

This is because the type of cudaTextureObject_t is not a pointer.

Original issue reported on code.google.com by ily15283...@gmail.com on 30 Jul 2014 at 2:28

This confuses me. cudaTextureObject_t is an unsigned long long, so the 
comparison with zero should be fine. I'll need more details to reproduce this. 
I've never seen it myself. 

Original comment by akrizhev...@gmail.com on 4 Aug 2014 at 6:40

I can't upload a screenshot, so I list the related code as follows:



1458 cudaTextureObject_t NVMatrix::getTextureObject() {
1459    if (_texObj == 0) {
1460        assert(isContiguous());
1461        //size_t memFree, memTotal;
1462
1463        struct cudaResourceDesc resDesc;
1464        memset(&resDesc, 0, sizeof(resDesc));
1465        resDesc.resType = cudaResourceTypeLinear;
1466        resDesc.res.linear.devPtr = getDevData();
1467        resDesc.res.linear.sizeInBytes = getNumDataBytes();
1468        resDesc.res.linear.desc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
1469        struct cudaTextureDesc texDesc;
1470        memset(&texDesc, 0, sizeof(texDesc));
1471        checkCudaErrors(cudaCreateTextureObject(&_texObj, &resDesc, &texDesc, NULL));
1472    }
1473    assert(_texObj != 0);
1474    return _texObj;
1475 }



It is fine for the _texObj set on line 1471 to be zero, but a zero value makes 
the assert on line 1473 fail. 

Original comment by ily15283...@gmail.com on 5 Aug 2014 at 1:41

Oh, so you're saying that 0 is a valid value for _texObj that might be set by 
cudaCreateTextureObject. I didn't realize this. I'll have to work around that 
somehow then. Thanks.

Original comment by akrizhev...@gmail.com on 11 Aug 2014 at 6:30
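
One possible workaround, sketched here in host-only C++ so it can be compiled without a GPU: track whether the texture object has been created with a separate boolean instead of treating 0 as "not created". Everything in this sketch is an assumption made for illustration: TextureHandle stands in for cudaTextureObject_t, createTexture is a hypothetical stand-in for cudaCreateTextureObject that deliberately hands out 0 first, and Matrix, _texObjValid and hasTextureObject are invented names, not part of cuda-convnet2. It is not necessarily the fix that was adopted in the project.

```cpp
#include <cassert>

// Stand-in for cudaTextureObject_t (an unsigned long long in CUDA).
using TextureHandle = unsigned long long;

// Hypothetical stand-in for cudaCreateTextureObject. It hands out handles
// starting at 0 to model the case where 0 is a valid texture object.
static TextureHandle g_nextHandle = 0;
static void createTexture(TextureHandle* out) { *out = g_nextHandle++; }

// Hypothetical class, not the real NVMatrix.
class Matrix {
public:
    TextureHandle getTextureObject() {
        // Use a flag rather than comparing the handle with 0,
        // because 0 may be a legitimate handle value.
        if (!_texObjValid) {
            createTexture(&_texObj);
            _texObjValid = true;
        }
        assert(_texObjValid);
        return _texObj;
    }

    bool hasTextureObject() const { return _texObjValid; }

private:
    TextureHandle _texObj = 0;
    bool _texObjValid = false;  // true once _texObj has been created
};
```

With this layout, a handle value of 0 no longer trips the assertion, and a second call to getTextureObject() on the same object returns the cached handle instead of creating a new one.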

I think the reason you use cudaTextureObject_t is that you want to 
utilize the read-only cache in GK110. Another way to use the read-only cache 
is a const __restrict__ pointer, such as const float* __restrict__ 
images. That would solve this bug, remove the memory size limitation 
of textures, and make the code look cleaner.

I hope that information helps.

BTW, I have worked at Baidu for six months. My boss is Ren Wu; 
he says you are his friend :).

On Tuesday, 2014/8/12 at 2:30, cuda-convnet2@googlecode.com wrote:

Original comment by ily15283...@gmail.com on 12 Aug 2014 at 12:48

Texture memory is (for mysterious reasons) still pretty noticeably faster than 
__restrict__ pointers in the cases where I use it, but I'll keep this in mind, 
thanks. 

Original comment by akrizhev...@gmail.com on 12 Aug 2014 at 6:27