Opencl Intercept layer not working (segmentation fault)
duttasankha opened this issue · 18 comments
Hello
I have built and installed the OpenCL Intercept Layer on my Linux platform (kernel version 4.13.0), but I couldn't get it to work. I will describe the steps I have taken in detail in the hope of getting it working. I am trying to get some metrics out of my OpenCL application. First, I downloaded the whole opencl-intercept-layer repo into a folder and followed the CMake instructions for a release-mode build. I also enabled the flags ENABLE_CLIPROF, ENABLE_MDAPI, and ENABLE_CLILOADER, then ran make and make install.
The OpenCL library was created successfully inside the folder where I downloaded the repo. I am trying to use it for one particular application rather than doing a global install, and I believe I set the environment up properly.
Specifically, the environment setting steps are as follows:
export LD_LIBRARY_PATH=/home/user/Desktop/OpenclIntercept/:$LD_LIBRARY_PATH
export CLI_DLLName=/opt/intel/opencl/SDK/lib64/libOpenCL.so
Setting other environment variables like:
export CLI_CallLogging=1 CLI_DumpProgramSource=1 CLI_DevicePerfCounterCustom=ComputeBasic CLI_DevicePerfCounterTiming=1
After setting all of this, I get a segmentation fault when I run my application through either cliloader or cliprof.
CliLoader details:
sudo ./cliloader --debug --call-logging --dump-source /path/to/application/Application_itself
Output:
[cliloader debug] full path to executable is: /home/User/Desktop/opencl-intercept-layer/cliloader/cliloader
[cliloader debug] pProcessName is non-NULL: /cliloader
[cliloader debug] process directory is /home/User/Desktop/opencl-intercept-layer/cliloader
[cliloader debug] New LD_PRELOAD is /home/User/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
[cliloader debug] New LD_LIBRARY_PATH is /home/User/Desktop/opencl-intercept-layer/cliloader/..
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
CLIntercept (64-bit) is loading...
CLintercept file location: /home/User/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
CLIntercept URL: https://github.com/intel/opencl-intercept-layer
CLIntercept git description: v2.2.1-111-g331ce28
CLIntercept git refspec: refs/heads/master
CLInterecpt git hash: 331ce28
CLIntercept optional features:
cliloader(supported)
cliprof(supported)
kernel overrides(NOT supported)
ITT tracing(NOT supported)
MDAPI(supported)
CLIntercept environment variable prefix: CLI_
CLIntercept config file: clintercept.conf
Trying to load dispatch from: ./real_libOpenCL.so
Couldn't load library from: ./real_libOpenCL.so
Trying to load dispatch from: /usr/lib/x86_64-linux-gnu/libOpenCL.so
Couldn't get exported function pointer to: clSetProgramReleaseCallback
Couldn't get exported function pointer to: clSetProgramSpecializationConstant
... success!
CallLogging is set to a non-default value!
ReportToStderr is set to a non-default value!
DumpProgramSource is set to a non-default value!
Timer Started!
... loading complete.
clGetPlatformIDs
<<<< clGetPlatformIDs -> CL_SUCCESS
Number of Platforms: 2
clGetPlatformIDs
<<<< clGetPlatformIDs -> CL_SUCCESS
clGetDeviceIDs: platform = [ Intel(R) OpenCL HD Graphics ], device_type = CL_DEVICE_TYPE_GPU (4)
<<<< clGetDeviceIDs -> CL_SUCCESS
clGetDeviceIDs: platform = [ Intel(R) OpenCL HD Graphics ], device_type = CL_DEVICE_TYPE_GPU (4)
<<<< clGetDeviceIDs -> CL_SUCCESS
clCreateContext: properties = [ NULL ], num_devices = 1, devices = [ Intel(R) Gen9 HD Graphics NEO (CL_DEVICE_TYPE_GPU) ]
<<<< clCreateContext: returned 0x1c7ee50 -> CL_SUCCESS
Context created successfully
Segmentation fault (core dumped)
cliprof details:
sudo ./cliprof --debug --verbose /path/to/application/Application_itself
Output:
[cliprof debug] full path to executable is: /home/user/Desktop/opencl-intercept-layer/cliprof/cliprof
[cliprof debug] pProcessName is non-NULL: /cliprof
[cliprof debug] process directory is /home/user/Desktop/opencl-intercept-layer/cliprof
[cliprof debug] New LD_PRELOAD is /home/user/Desktop/opencl-intercept-layer/cliprof/../libOpenCL.so
[cliprof debug] New LD_LIBRARY_PATH is /home/user/Desktop/opencl-intercept-layer/cliprof/..
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
CLIntercept (64-bit) is loading...
CLintercept file location: /home/user/Desktop/opencl-intercept-layer/cliprof/../libOpenCL.so
CLIntercept URL: https://github.com/intel/opencl-intercept-layer
CLIntercept git description: v2.2.1-111-g331ce28
CLIntercept git refspec: refs/heads/master
CLInterecpt git hash: 331ce28
CLIntercept optional features:
cliloader(supported)
cliprof(supported)
kernel overrides(NOT supported)
ITT tracing(NOT supported)
MDAPI(supported)
CLIntercept environment variable prefix: CLI_
CLIntercept config file: clintercept.conf
Trying to load dispatch from: ./real_libOpenCL.so
Couldn't load library from: ./real_libOpenCL.so
Trying to load dispatch from: /usr/lib/x86_64-linux-gnu/libOpenCL.so
Couldn't get exported function pointer to: clSetProgramReleaseCallback
Couldn't get exported function pointer to: clSetProgramSpecializationConstant
... success!
ReportToStderr is set to a non-default value!
DevicePerformanceTiming is set to a non-default value!
Timer Started!
... loading complete.
Number of Platforms: 2
Context created successfully
Segmentation fault (core dumped)
I also get a segmentation fault when I launch the application directly after setting up the environment variables. I don't know whether the CLIntercept_Dump directory is created automatically, so I created it manually, but I didn't see any dump inside it even after enabling logging and dumping. So I am not sure how to make this work or what I should do. I have one more question: how can I use the control APIs as mentioned in here?
I would really appreciate it if someone could help me out with the errors.
Thank you.
Just a wild idea. If you haven't specified -DCMAKE_BUILD_TYPE=Release, the default value for the build type must be debug. Thus, you can run cliloader/cliprof under gdb and get a backtrace.
Also, what is your linux distro (Ubuntu 18.04, Debian 9.7, etc)? Do you have latest and greatest drivers? Have you tried running another application (Like conformance test basic or some kind of hello world)?
I am asking because I use the intercept layer on daily basis both on Windows and Linux and I don't see any crashes.
Regarding your control api question. You have 2 options for linux. Either create a ~/clintercept.conf file or use environmental variables. As I've mentioned, I use intercept layer on daily basis, so I stick with the first option. So, you have to put your options to the file like this:
DumpProgramSource = 1
LogToFile = 1
AppendPid = 1
CallLogging = 1
AFAIK, for CallLogging to save logs to file you must set LogToFile.
If you don't want to create a config file, you can specify the env variables like this:
export CLI_LogToFile=1
export CLI_CallLogging=1
Hi, thank you for the very detailed bug report! I'm not 100% sure what the problem is, but here are a few things to try:
- The environment variables are case sensitive, so it's not picking up your "real" libOpenCL.so correctly. You can fix this with (note the lowercase 'L' in 'DllName'):
export CLI_DllName=/opt/intel/opencl/SDK/lib64/libOpenCL.so
It still looks like it's picking up a system libOpenCL.so in /usr/lib/x86_64-linux-gnu/libOpenCL.so and finding the OpenCL symbols correctly, so this may not be necessary, but it's still something I'd check.
- Do you need to run your program via sudo? If so, you'll probably want to pass the -E option to preserve your environment variables. Admittedly I haven't done a lot of testing with cliloader and sudo, so I'd recommend running as a regular user if possible (the intercept layer does not require root access).
- If I look at your log:
>>>> clCreateContext: properties = [ NULL ], num_devices = 1, devices = [ Intel(R) Gen9 HD Graphics NEO (CL_DEVICE_TYPE_GPU) ]
<<<< clCreateContext: returned 0x1c7ee50 -> CL_SUCCESS
Context created successfully
Segmentation fault (core dumped)
It's hard to tell if the segfault is occurring in the intercept layer or in your app. The output "Context created successfully" is coming from the application. Do you know what it is trying to do after this? Can you try running in a debugger to see where the segfault is occurring?
- Could you try running a different app, such as clinfo?
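To illustrate the case-sensitivity point, here is a minimal sketch in C of how a lookup of a CLI_-prefixed variable behaves. The helper get_cli_control is hypothetical and is not the intercept layer's actual code; it relies only on the standard getenv, which matches names exactly, so CLI_DLLName and CLI_DllName are two unrelated variables.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the intercept layer): look up the
 * environment variable "CLI_<name>" and return its value, or NULL if
 * it is not set. getenv() compares names byte for byte, so "DLLName"
 * and "DllName" refer to different variables. */
const char *get_cli_control(const char *name)
{
    char full_name[256];
    int n = snprintf(full_name, sizeof full_name, "CLI_%s", name);
    if (n < 0 || (size_t)n >= sizeof full_name) {
        return NULL; /* control name too long for this sketch */
    }
    return getenv(full_name);
}
```

With only CLI_DLLName exported, get_cli_control("DllName") returns NULL, so a loader written this way would fall back to its default search paths.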
@alexbatashev thank you so much. I will address your questions one by one.
If you haven't specified -DCMAKE_BUILD_TYPE=Release, the default value for the build type must be debug. Thus, you can run cliloader/cliprof under gdb and get a backtrace.
I have built it in the release mode. I will keep your suggestion in mind and try to re-make it as debug and run it under gdb.
Also, what is your linux distro (Ubuntu 18.04, Debian 9.7, etc)?
I am using Ubuntu 16.04 with kernel version 4.13.
If you don't want to create a config file, you can specify the env variables like this:
export CLI_LogToFile=1
export CLI_CallLogging=1
I don't want to create a .conf file and so I have specified several environment variables that I have mentioned in the original post.
Thank you as well, Ben, for getting back to me.
The environment variables are case sensitive, so it's not picking up your "real" libOpenCL.so correctly. You can fix this with (note the lowercase 'L' in 'DllName'):
export CLI_DllName=/opt/intel/opencl/SDK/lib64/libOpenCL.so
I haven't tried that yet, but I will. However, /opt/intel/opencl/SDK/lib64/libOpenCL.so is symbolically linked with /usr/lib/x86_64-linux-gnu/libOpenCL.so, so I am not sure it will make any difference. I will try it anyway and report back here.
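One way to check whether two library paths really end up at the same file is to compare device and inode numbers after following symlinks. This is a minimal sketch, assuming a POSIX system; same_file is a hypothetical helper and not part of the intercept layer.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: returns true if both paths resolve, following
 * symlinks, to the same file on disk. Returns false if either path
 * cannot be resolved. */
bool same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;
    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0) {
        return false; /* at least one path could not be resolved */
    }
    /* Same device and same inode means the same underlying file. */
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Calling same_file with the two libOpenCL.so paths returns true only when the symlink chain ends at the same file, which would mean that switching between them cannot change which library gets loaded.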
Do you need to run your program via sudo? If so, you'll probably want to pass the -E option to preserve your environment variables. Admittedly I haven't done a lot of testing with cliloader and sudo, so I'd recommend running as a regular user if possible (the intercept layer does not require root access).
I might not need sudo for the intercept layer, but I do need sudo to run my application, so I have been running with sudo until now. I haven't been passing the -E option to retain the environment variables, though. That is a good point and I will try that as well.
>>>> clCreateContext: properties = [ NULL ], num_devices = 1, devices = [ Intel(R) Gen9 HD Graphics NEO (CL_DEVICE_TYPE_GPU) ]
<<<< clCreateContext: returned 0x1c7ee50 -> CL_SUCCESS
Context created successfully
Segmentation fault (core dumped)
It's hard to tell if the segfault is occurring in the intercept layer or in your app. The output "Context created successfully" is coming from the application. Do you know what it is trying to do after this? Can you try running in a debugger to see where the segfault is occurring?
FILE *fpHandleGPU;
err = clGetPlatformIDs(0,NULL,&num_platform);
platform = (cl_platform_id *)malloc(num_platform*sizeof(cl_platform_id));
err = clGetPlatformIDs(num_platform, platform, NULL);
/* Determine number of connected devices */
err = clGetDeviceIDs(platform[0], CL_DEVICE_TYPE_GPU,0,NULL,&num_devices);
/* Access connected devices */
devices = (cl_device_id*) malloc(sizeof(cl_device_id) * num_devices);
err = clGetDeviceIDs(platform[0], CL_DEVICE_TYPE_GPU,num_devices,devices,NULL);
ctx = clCreateContext(NULL,1,&devices[0],NULL,NULL,&err);
printf("Context created successfully\n");
fpHandleGPU = fopen("oenclkernel.cl","r");
fseek(fpHandleGPU,0,SEEK_END);
////THE SEGFAULT IS HAPPENING AT THIS POINT WHILE EXECUTING fseek////
progSize=ftell(fpHandleGPU);
rewind(fpHandleGPU);
progSource = (char *)malloc(progSize*sizeof(char)+1);
progSource[progSize] = '\0';
fread(progSource,sizeof(char),progSize,fpHandleGPU);
fclose(fpHandleGPU);
program = clCreateProgramWithSource(ctx,1,(const char **)&progSource,&progSize,&err);
free(progSource);
char *options = (char *)malloc(100*sizeof(char));
strcpy(options,"-cl-opt-disable -cl-std=CL2.0"); //
err = clBuildProgram(program,1,&devices[0],options,NULL,NULL);
if(err!=CL_SUCCESS){
    clGetProgramBuildInfo(program,devices[0],CL_PROGRAM_BUILD_LOG,0,NULL,&logSize);
    log=(char *)malloc(logSize+1);
    log[logSize]='\0';
    clGetProgramBuildInfo(program,devices[0],CL_PROGRAM_BUILD_LOG,logSize+1,log,NULL);
    printf("\n==========ERROR=========\n%s\n=======================\n",log);
    free(log);
    exit(EXIT_FAILURE);
}
else
    printf("Program build successfully\n");
cQ = clCreateCommandQueue(ctx,devices[0],CL_QUEUE_PROFILING_ENABLE,&err);
Could you try running a different app, such as clinfo?
clinfo runs fine normally, but I haven't tried it with the intercept layer. None of my applications that run fine normally run with the intercept layer. I will try clinfo with the intercept layer as well.
So I made the following changes:
I changed "DLLName" to "DllName" and set CLI_DllName=/opt/intel/opencl/SDK/lib64/libOpenCL.so
However, I was still getting a segmentation fault, so I then pointed it at the libOpenCL.so inside the intercept layer folder instead. The setting became:
CLI_DllName=/home/user/Desktop/opencl-intercept-layer/libOpenCL.so
Then I set the environment variables I am interested in. The output of env | grep CL_ is:
CLI_DllName=/home/duttasankha/Desktop/SANKHA_ALL/INTEL_OPENCL_INTERCEPT_LAYER/opencl-intercept-layer/libOpenCL.so
CLI_CallLogging=1
CLI_DevicePerfCounterCustom=ComputeBasic
CLI_DevicePerfCounterTiming=1
CLI_DumpProgramSource=1
Then when I try to run the application in a different folder I am getting the following output
sudo -E ./cliloader --debug --mdapi-counters --dump-source --call-logging -ddiag /home/user/Desktop/App Dir/App_itself
[cliloader debug] full path to executable is: /home/user/Desktop/opencl-intercept-layer/cliloader/cliloader
[cliloader debug] pProcessName is non-NULL: /cliloader
[cliloader debug] process directory is /home/user/Desktop/opencl-intercept-layer/cliloader
[cliloader debug] New LD_PRELOAD is /home/user/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
[cliloader debug] New LD_LIBRARY_PATH is /home/user/Desktop/opencl-intercept-layer/cliloader/..
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
CLIntercept (64-bit) is loading...
CLintercept file location: /home/user/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
CLIntercept URL: https://github.com/intel/opencl-intercept-layer
CLIntercept git description: v2.2.1-111-g331ce28
CLIntercept git refspec: refs/heads/master
CLInterecpt git hash: 331ce28
CLIntercept optional features:
cliloader(supported)
cliprof(supported)
kernel overrides(NOT supported)
ITT tracing(NOT supported)
MDAPI(supported)
CLIntercept environment variable prefix: CLI_
CLIntercept config file: clintercept.conf
Read DLL name from user parameters: /home/user/Desktop/opencl-intercept-layer/libOpenCL.so
Trying to load dispatch from: /home/user/Desktop/opencl-intercept-layer/libOpenCL.so
... success!
CallLogging is set to a non-default value!
ContextCallbackLogging is set to a non-default value!
ContextHintLevel is set to a non-default value!
ReportToStderr is set to a non-default value!
DevicePerfCounterCustom is set to a non-default value!
DevicePerfCounterTiming is set to a non-default value!
DumpProgramSource is set to a non-default value!
Metric Discovery failed to initialize.
Timer Started!
... loading complete.
clGetPlatformIDs
clGetPlatformIDs
clGetPlatformIDs
.
.
.
clGetPlatformIDs
clGetPlatformIDs
clGetPlatformIDs
clGetPlatformIDs
clGetPlatformIDs
clGetPlatformIDs
Segmentation fault (core dumped)
I can see that Metric Discovery failed to initialize.
However, I am not sure how to initialize it. I enabled ENABLE_MDAPI while building, so I am not sure what else I should do to get the metrics. Thanks.
If you set DllName to the intercept layer libOpenCL.so then it basically creates an infinite loop where the intercept layer loads the intercept layer which loads the intercept layer, etc... so don't do that. 😃
You want to set it to your "real" libOpenCL.so, if it's not found automatically. Since it's finding most of the OpenCL entry points and some OpenCL calls are succeeding I don't think this is the problem. Let's try a few other things first.
- Can you try a different app such as clinfo? I'm interested to see if this works, both with and without sudo.
- Can you try without cliloader, as a sanity check? Either with the "targeted usage" or "global install" instructions?
- Can you confirm that the file is opening correctly (fpHandleGPU = fopen("oenclkernel.cl","r");)?
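On the last point: a relative path such as "oenclkernel.cl" is resolved against the process's current working directory, so fopen may return NULL when the application is launched from a different folder (for example from the cliloader directory), and passing that NULL to fseek can crash. Here is a minimal sketch of a checked file read using standard C only; read_text_file is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a newly allocated,
 * NUL-terminated buffer. Returns NULL on any failure and prints the
 * reason for an open failure. The caller frees the result. */
char *read_text_file(const char *path, size_t *size_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        perror(path); /* e.g. "No such file or directory" */
        return NULL;
    }

    char *buf = NULL;
    long size = -1;
    /* Determine the file size, then allocate room for it plus a NUL. */
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0) {
        rewind(fp);
        buf = malloc((size_t)size + 1);
    }
    if (buf != NULL) {
        size_t got = fread(buf, 1, (size_t)size, fp);
        buf[got] = '\0';
        if (size_out != NULL) {
            *size_out = got;
        }
    }
    fclose(fp);
    return buf;
}
```

In the application, calling progSource = read_text_file("oenclkernel.cl", &progSize); and checking the result for NULL would turn a crash at fseek into a readable error message.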
You mentioned that you are using kernel 4.13.0. Is there anything else noteworthy about your Linux install?
If you set DllName to the intercept layer libOpenCL.so then it basically creates an infinite loop where the intercept layer loads the intercept layer which loads the intercept layer, etc... so don't do that.
Oh, okay. But if I set DllName to the "real" libOpenCL.so, the output is the same as it was before. To clarify, by "real" libOpenCL.so do you mean the library located at /opt/intel/opencl/SDK/lib64/libOpenCL.so? So the output if I set the DllName to the above path and the other environment variables as I mentioned above is:
Can you try a different app such as clinfo? I'm interested to see if this works, both with and without sudo.
I have tried clinfo. It executed properly and generated 3 files in the /CLIntercept_Dump/clinfo folder.
These are CLI_0000_0585D671_source.cl, CLI_0001_0585D671_source.cl, and clintercept_report.txt.
I executed it like this: sudo -E ./cliloader --debug --call-logging --dump-source --mdapi-counters clinfo
However, clintercept_report.txt came out empty. Previously this did not work and gave a segmentation fault, so I guess CLI_DllName and preserving the environment variables were the issue.
Can you try without cliloader, as a sanity check? Either with the "targeted usage" or "global install" instructions?
Can you confirm that the file is opening correctly (fpHandleGPU = fopen("oenclkernel.cl","r");)?
I tried several things and finally found something that succeeded, which I would like to describe. My source files were in another folder; I copied them to the cliloader folder, built the application there, and executed it with and without cliloader.
sudo -E ./cliloader --debug --call-logging --dump-source --mdapi-counters test
Output:
[cliloader debug] full path to executable is: /home/USER/Desktop/opencl-intercept-layer/cliloader/cliloader
[cliloader debug] pProcessName is non-NULL: /cliloader
[cliloader debug] process directory is /home/USER/Desktop/opencl-intercept-layer/cliloader
[cliloader debug] New LD_PRELOAD is /home/USER/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
[cliloader debug] New LD_LIBRARY_PATH is /home/USER/Desktop/opencl-intercept-layer/cliloader/..
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
CLIntercept (64-bit) is loading...
CLintercept file location: /home/USER/Desktop/opencl-intercept-layer/cliloader/../libOpenCL.so
CLIntercept URL: https://github.com/intel/opencl-intercept-layer
CLIntercept git description: v2.2.1-111-g331ce28
CLIntercept git refspec: refs/heads/master
CLInterecpt git hash: 331ce28ff998ea83620317aa1dfc64b0c06f6587
CLIntercept optional features:
cliloader(supported)
cliprof(supported)
kernel overrides(NOT supported)
ITT tracing(NOT supported)
MDAPI(supported)
CLIntercept environment variable prefix: CLI_
CLIntercept config file: clintercept.conf
Trying to load dispatch from: ./real_libOpenCL.so
Couldn't load library from: ./real_libOpenCL.so
Trying to load dispatch from: /usr/lib/x86_64-linux-gnu/libOpenCL.so
Couldn't get exported function pointer to: clSetProgramReleaseCallback
Couldn't get exported function pointer to: clSetProgramSpecializationConstant
... success!
CallLogging is set to a non-default value!
ReportToStderr is set to a non-default value!
DevicePerfCounterCustom is set to a non-default value!
DevicePerfCounterTiming is set to a non-default value!
DumpProgramSource is set to a non-default value!
Metric Discovery failed to initialize.
Timer Started!
... loading complete.
And when I just executed the binary without the cliloader it executed as expected.
This time it executed without any error and generated a folder inside the /CLIntercept_Dump folder (/CLIntercept_Dump/app_name), containing only a clintercept_report.txt file. But the file is empty both times. There is no metric output inside it, even though the cliloader flags are provided as required and the environment variables are set. env | grep CLI_ gives:
CLI_DLLName=/opt/intel/opencl/SDK/lib64/libOpenCL.so
CLI_CallLogging=1
CLI_DevicePerfCounterCustom=ComputeBasic
CLI_DevicePerfCounterTiming=1
CLI_DumpProgramSource=1
In the cliloader output we can see Metric Discovery failed to initialize.
So as the application is running now as expected, I guess there are some other issues that might be better tracked....
And when I just executed the binary without the cliloader it executed as expected.
This time it executed without any error and generated a folder inside the /CLIntercept_Dump folder (/CLIntercept_Dump/app_name), containing only a clintercept_report.txt file. But the file is empty both times.
This is actually what I faced recently. However, I think this is related to my OpenCL setup. I can't reproduce the issue on another machine. Anyway, you can grab an older commit and give it a try. If you happen to bisect the faulty commit, please let us know, this would help a lot.
@bashbaug I will investigate a little further tomorrow and report back to you if the issue is not resolved with a simple reboot.
In the cliloader output we can see Metric Discovery failed to initialize.
So as the application is running now as expected, I guess there are some other issues that might be better tracked....
Do you really need MDAPI? Try to disable it (i.e. don't set ENABLE_MDAPI). It is marked for internal use only. Also, you don't need ENABLE_CLIPROF. The new cliloader does the same.
@alexbatashev
Anyway, you can grab an older commit and give it a try.
Can you tell me which earlier commit of the intercept layer I can use?
Do you really need MDAPI? Try to disable it (i.e. don't set ENABLE_MDAPI). It is marked for internal use only. Also, you don't need ENABLE_CLIPROF. The new cliloader does the same.
My purpose is to get different metrics from my OpenCL application as mentioned in here. I don't know exactly which metrics I am going to use, but they would mostly be related to L3, LLC and EU. Previously I tried to use the metrics discovery library, but then I thought of using the OpenCL intercept layer instead, and assumed that enabling ENABLE_MDAPI would give me the metrics. Would I be able to get the metrics I require without enabling ENABLE_MDAPI?
@duttasankha this commit does a good job for me on the machine where I see a similar issue. This is most likely not the last known good commit; I just had it pre-built.
Various metrics can be obtained with VTune Amplifier, which has OpenCL support. You can get it for free with System Studio. If you are a student or teacher you can also request an educational license for Parallel Studio. Personally, I don't come across GPUs that often, so @bashbaug can tell you more about the GPU and its metrics.
@alexbatashev
Thank you for providing the commit. I actually started with VTune to get my GPU metrics, but nothing GPU-related worked in VTune, which led me to metrics discovery and in turn to the intercept layer. I tried several things with VTune and followed the installation instructions as closely as I could (installing the driver and changing the kernel configuration), and it still didn't work. VTune also has several components that I don't need, and it requires studying the documentation to get the metrics of interest, which I would like to avoid. So I was thinking that the intercept layer would be easier to use. If I can get the same metrics through the intercept layer that I could get through VTune, I would like to stick with it.
Metric Discovery failed to initialize.
I've heard from others that MDAPI through the Intercept Layer isn't working on Linux but I haven't had a chance to debug it yet. Would you mind filing an issue specifically for this?
I've verified that MDAPI is working on Windows recently, if that's an option for you.
Is there any additional action required for this segfault issue?
I'll update build instructions since ENABLE_MDAPI is no longer for internal use only, at the very least. Thanks!
@bashbaug
Using MDAPI on Windows is not really an option in my case. I will create a separate issue regarding MDAPI through the intercept layer on Linux. Is there a way to get GPU metrics other than VTune, or is VTune the only option? I also tried to use the MD library directly, but there is no proper usage documentation, so I am not sure whether it can be used independently. So how can I get GPU metrics?
@bashbaug I checked one more time. Although I set my build to Win64, cmake gui generated 32 bit version for some reason. I don't know why. Switching to 64 bit helped me.
@duttasankha VTune is the preferred way to analyze app performance. As far as I know, the instructions you followed are not related to OpenCL. Basically, you just start VTune, create a new project, select an app, choose OpenCL on the right panel, and run the test. This guide also says you need to set ENABLE_JITPROFILING=1 and CL_CONFIG_USE_VTUNE=True. This is true for the OpenCL CPU runtime; however, it looks like the GPU doesn't need it. Or you can wait until @bashbaug updates the instructions.
@alexbatashev
Thanks to you and bashbaug for guiding me through the problem. I really appreciate it. This is the problem I have faced with VTune: there is no single set of installation instructions that gets VTune working and collecting GPU metrics. There are disjointed instructions from different sources, and I got tired of trying different instructions and still not getting it to work, which discouraged me from using VTune. A single source with instructions for different platform configurations would have been best. I will wait for further instructions from @bashbaug before I go back to VTune.
Hi @duttasankha , thank you for your patience. I've been in contact with the driver team and unfortunately the MDAPI support required by the intercept layer is not currently enabled in the NEO open source Linux driver. Could you please file an issue in their github so the MDAPI request gets on their radar?
https://github.com/intel/compute-runtime/issues
Short-term, I'll definitely update the documentation for the intercept layer so it is clear that this restriction exists. Longer-term, I'll also investigate supporting MDAPI time based sampling via the intercept layer, which is how VTune collects MDAPI data and does not require Linux driver support. Sound reasonable?
I wanted to provide a small update regarding this issue. Work is underway to enable MDAPI via the Linux driver. See the latest comment here: intel/compute-runtime#182 (comment).
I've also made some progress enabling MDAPI time based sampling, but it's not quite working yet and I haven't been able to devote as much time to it as I would have liked to. Still, I'm optimistic that it's not too far off.
Thanks!
Hi @duttasankha, I've added support for MDAPI time based sampling to the intercept layer - thank you for your patience and sorry for the (very long) delay. I've also added a comprehensive doc describing what MDAPI support is available and how to use it, which I hope is helpful.
Since I've made all of the changes I am planning to make I think I am going to close this issue, but feel free to re-open it (or open a new issue) if something still appears to be missing.
Also, I'd encourage you to follow intel/compute-runtime#182 (comment) to track progress enabling event profiling on Linux, if you haven't already. As soon as this is available I will test that it works using the intercept layer.
Thanks again!
Hi @duttasankha , I'm happy to say that MDAPI event based sampling is working on Linux now, too. If you want to give this a try please grab the very latest OpenCL drivers here and a small workaround I just added to the intercept layer (#133). Thanks!