CERN/TIGRE

GPU module error

h-zhou15 opened this issue · 16 comments

Expected Behavior

I have three GPUs in my computer: two Quadro GP100s and one Quadro P600. Only the P600 works correctly. The two GP100s always fail when I use the TIGRE tools, although both work normally in other toolkits (MATLAB, Python, ...).

Actual Behavior

When I set "gpuids = GpuIds('Quadro GP100');", Ax_mex always reports an error.

MATLAB error:

kernel fail
Error using Ax_mex
invalid device symbol

Error in Ax (line 71)
projections=Ax_mex(img,geo,angles,ptype, gpuids.devices);

Error in d04_SimpleReconstruction (line 37)
projections=Ax(head,geo,angles,'Siddon','gpuids',gpuids);

Code to reproduce the problem (If applicable)

demo04_SimpleReconstruction.m

%% Geometry
geo=defaultGeometry('nVoxel',[128;128;128]);                     

%% Load data and generate projections 
% define angles
angles=linspace(0,2*pi,100);
% Load head phantom data
head=headPhantom(geo.nVoxel);
% generate projections
gpuids = GpuIds('Quadro GP100');
projections=Ax(head,geo,angles,'Siddon','gpuids',gpuids);
% add noise
noise_projections=addCTnoise(projections);

Specifications

  • MATLAB/python version: MATLAB R2021b
  • OS: Win 10
  • CUDA version: 11.4

Thanks for the report. This is strange indeed.

Can you print what gpuids contains in the script you showed?

Please run d22_ListGPUs or just getGpuNames and you will find the GPU name to set. I guess it would be "NVIDIA Quadro GP100" or something like that.
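To see what each name resolves to, a minimal sketch along these lines may help. It uses only getGpuNames and GpuIds as they appear in this thread, and assumes that GpuIds exposes the name and devices properties.

```matlab
% List every distinct GPU name TIGRE reports and the device IDs it maps to.
% Assumes getGpuNames returns a cell array of character vectors.
names = unique(getGpuNames());
for k = 1:numel(names)
    g = GpuIds(names{k});
    % devices holds 0-based CUDA device indices
    fprintf('%s -> devices %s\n', g.name, mat2str(double(g.devices)));
end
```

If a name maps to an empty device list, the string passed to GpuIds does not match what the driver reports.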

3 GPUs:
(image: Gpus)

Here is the output of getGpuNames:

listGpuNames
{'Quadro GP100'} {'Quadro GP100'} {'Quadro P600'}

Thanks!

Can you print what gpuids contains in the script you showed?

I meant in the python code, can you print(gpuids). I want to understand what this variable returns in your case when you give it a GPU name.

@h-zhou15, my environment, which works as expected, is similar to yours. Results on MATLAB R2022a, FYI:

>> getGpuNames
ans =
  1×3 cell array
    {'NVIDIA GeForce RTX 2080 Ti'}    {'NVIDIA GeForce RTX 2080 Ti'}    {'NVIDIA GeForce GTX 1070'}
>> GpuIds('NVIDIA GeForce RTX 2080 Ti')
ans = 
  GpuIds with properties:

       name: 'NVIDIA GeForce RTX 2080 Ti'
    devices: [0 1]

I am not sure the GPU IDs are the cause of the "invalid device symbol" error.

Thanks. When I set the GpuIds like this:

gpuids = GpuIds('Quadro GP100');
projections=Ax(head,geo,angles,'Siddon','gpuids',gpuids);

gpuids contains this:
(image)
The corresponding device IDs are [0,1], because the PC has two identical GPUs.

At first I also suspected that using two GPU IDs caused the error. I tried modifying the getGpuIds function so that only one GPU is selected, but it still didn't work.

function gpuids = getGpuIds(gpuname)
    deviceCount = getGpuCount_mex();
    gpuids = int32(0:(deviceCount-1));
    for idx = 1:deviceCount
        name = getGpuName_mex(gpuids(idx));
        if ~strcmp(name, gpuname)
            gpuids(idx) = -1;
        end
    end
    gpuids = gpuids(gpuids>=0);
    
    % We modify here to select only one gpuid. 
    if length(gpuids) > 1
        gpuids = gpuids(end);
    end
end

@tsadakane I have also tested on other PCs with different GPUs, such as an NVIDIA TITAN Xp and a 1650. All of them work well except the Quadro GP100.

I am not very familiar with CUDA programming, but the current behavior seems to indicate that TIGRE does not support the Quadro GP100.

@h-zhou15 I never tested the Quadro GP100, but there is no apparent reason for TIGRE not to support it, so this is not the cause of the problem. As another test to verify, what if you do:

listGpuNames = getGpuNames();
gpuids = GpuIds(listGpuNames{1});

Does it give the same error?

@AnderBiguri I tested it, but it still gives the same error:

%% Load data and generate projections 
% define angles
angles=linspace(0,2*pi,100);
% Load head phantom data
head=headPhantom(geo.nVoxel);
% generate projections
% gpuids = GpuIds('Quadro GP100');
listGpuNames = getGpuNames();
gpuids = GpuIds(listGpuNames{1});
projections=Ax(head,geo,angles,'Siddon','gpuids',gpuids);
% add noise
noise_projections=addCTnoise(projections);
>> d04_SimpleReconstruction
kernel fail
Error using Ax_mex
invalid device symbol

Error in Ax (line 71)
projections=Ax_mex(img,geo,angles,ptype, gpuids.devices);

Error in d04_SimpleReconstruction (line 39)
projections=Ax(head,geo,angles,'Siddon','gpuids',gpuids);

This could be something else.

You are compiling with one of the xml files, e.g. for MSVC 2019, this one: https://github.com/CERN/TIGRE/blob/master/MATLAB/mex_CUDA_win64_MSV2019.xml#L36

The highlighted line has a long string, but it does not contain -gencode=arch=compute_60,code=sm_60. Can you add it there (or to the relevant xml file), run Compile.m, and try again?
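For context on why this flag is the suspect: the Quadro GP100 is a compute capability 6.0 device, whereas the Quadro P600 and TITAN Xp are 6.1, so a build without sm_60 code could plausibly fail on the GP100 only. A minimal sketch for checking which architectures the installed cards need, assuming Parallel Computing Toolbox is available (gpuDeviceTable exists from R2021a onward):

```matlab
% Show each GPU's compute capability as MATLAB sees it.
% Note: MATLAB indexes GPUs from 1, while TIGRE's device IDs start at 0,
% and the ordering is not guaranteed to match.
t = gpuDeviceTable;
disp(t(:, {'Index', 'Name', 'ComputeCapability'}));

% Build the matching nvcc flags, e.g. "6.0" -> compute_60 / sm_60.
cc = unique(erase(string(t.ComputeCapability), '.'));
flags = "-gencode=arch=compute_" + cc + ",code=sm_" + cc;
disp(flags);
```

Each printed flag should appear in the nvcc flags string of the xml file used for compilation. This only identifies the required architectures; it does not confirm that a missing flag is the cause of the error.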

OK, thanks. I reran the compilation with mex_CUDA_win64_MSV2019.xml (I have Visual Studio 2019 installed on my PC) after adding -gencode=arch=compute_60,code=sm_60 to the xml file, but the 'Quadro GP100' GPU still does not work and the same error persists. During compilation, MATLAB gives some warnings. I'm not sure whether they are relevant to my problem.

>> Compile
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
MEX configured to use 'NVIDIA CUDA Compiler' for CUDA language compilation.
Compiling TIGRE source...
This may take a couple of minutes....
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 65) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Ax_mex.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 66) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Atb_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Atb_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Atb_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Atb_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\Atb_mex.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 67) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\minTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\minTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\minTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\minTV.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 68) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AwminTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AwminTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AwminTV.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AwminTV.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 69) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\tvDenoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\tvDenoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\tvDenoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\tvDenoise.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 70) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AddNoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AddNoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AddNoise.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\AddNoise.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 71) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\mexReadXim.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\mexReadXim.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 72) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\getGpuName_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\getGpuName_mex.exp

MEX completed successfully.
Warning: Selected compiler 'NVIDIA CUDA Compiler' is not supported and no other supported compiler was found. For options, visit
https://www.mathworks.com/support/compilers. 
> In Compile (line 73) 
Renamed options file 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64.xml' to 'C:\Users\hao\AppData\Roaming\MathWorks\MATLAB\R2021b\mex_CUDA_win64_backup.xml'.
Building with 'NVIDIA CUDA Compiler'.
������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\getGpuCount_mex.exp

������ H:\hao\Project\TIGRE-master\MATLAB\Mex_files\win64\getGpuCount_mex.exp

MEX completed successfully.
Compilation complete

������ means "cannot find the file". My MATLAB has a display problem with these characters.

No, that's OK, these messages are standard. Your compilation was successful, meaning this didn't solve it... I'll think about it a bit more...

Well, thank you for your help. If you find a new solution, please @ me or send me an email at zhou-h19@mails.tsinghua.edu.cn

@h-zhou15 just to verify: in the previous test, you renamed the file before compiling, right?

@AnderBiguri Yep. Following the tutorial, I renamed 'mex_CUDA_win64_MSV2019.xml' to 'mex_CUDA_win64.xml' and then recompiled.
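One thing that may be worth ruling out is that the options file MEX actually picked up lacks the new flag. A minimal sketch, assuming it is run from the TIGRE MATLAB folder (where Compile.m lives) and that the active options file sits in prefdir, as the paths in the Compile log suggest:

```matlab
% Check whether the sm_60 flag is present in both copies of the options file.
% Both locations are assumptions based on the Compile log in this thread.
flag = 'arch=compute_60,code=sm_60';
candidates = {fullfile(pwd, 'mex_CUDA_win64.xml'), ...
              fullfile(prefdir, 'mex_CUDA_win64.xml')};
for k = 1:numel(candidates)
    f = candidates{k};
    if isfile(f)
        hasFlag = contains(fileread(f), flag);
        fprintf('%s\n  contains %s: %d\n', f, flag, hasFlag);
    else
        fprintf('%s\n  not found\n', f);
    end
end
```

If either copy reports 0, the edit may have gone into a different xml file than the one used for the build.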

Closing due to inactivity. Feel free to reopen it if you still have issues.