[System-Error] application hang when launching a simple example (HelloWorld)
sunneo opened this issue · 6 comments
https://github.com/rsnemmen/OpenCL-examples/tree/master/Hello_World
To make this program launch,
I change the SIZE to 16
and let local become 4 because the workgroupSize 12 cannot divide 1024
after build with gcc -o hello.exe hello.c -lOpenCL
the test program stuck at launch kernel.
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x5da5e4, const char* "clIcdGetPlatformIDsKHR")
[VC4CL]( hello.exe): get extension function address: clIcdGetPlatformIDsKHR
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x5da5e4, const char* "clGetPlatformInfo")
[VC4CL]( hello.exe): get extension function address: clGetPlatformInfo
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 0, cl_platform_id* 0, cl_uint* 0xbee72d48)
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 1, cl_platform_id* 0x5d56b8, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x5da5e4, cl_platform_info 2308, size_t 0, void* 0, size_t* 0xbee72ce0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x5da5e4, cl_platform_info 2308, size_t 226, void* 0x5e6458, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x5da5e4, cl_platform_info 2336, size_t 0, void* 0, size_t* 0xbee72ce0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x5da5e4, cl_platform_info 2336, size_t 6, void* 0x5da630, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetDeviceIDs(cl_platform_id 0x5da5e4, cl_device_type 4, cl_uint 1, cl_device_id* 0xbee73634, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_context clCreateContext(const cl_context_properties* 0, cl_uint 1, const cl_device_id* 0xbee73634, void(CL_CALLBACK*)(const char* errinfo, const void* private_info, size_t cb, void* user_data) 0xbee72d18, void* 0, cl_int* 0xbee75640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x5e587c (cl_context)
[VC4CL]( hello.exe): API call: cl_command_queue clCreateCommandQueue(cl_context 0x5e587c, cl_device_id 0x5da5f8, cl_command_queue_properties 0, cl_int* 0xbee75640)
[VC4CL]( hello.exe): Starting queue handler thread...
[VC4CL]( hello.exe): Tracking live-time of object: 0x5e441c (cl_command_queue)
[VC4CL]( hello.exe): API call: cl_program clCreateProgramWithSource(cl_context 0x5e587c, cl_uint 1, const char** 0x22080, const size_t* 0, cl_int* 0xbee75640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x5a7744 (cl_program)
[VC4CL]( hello.exe): API call: cl_int clBuildProgram(cl_program 0x5a7744, cl_uint 0, const cl_device_id* 0, const char* (null), void(CL_CALLBACK*)(cl_program program, void* user_data) 0xbee72e00, void* 0)
[VC4CL]( hello.exe): Precompiling source with:
Dumping program sources to /tmp/vc4cl-source-386839851.cl
[VC4CL]( hello.exe): Dumping program IR to /tmp/vc4cl-ir-771476364.ll
[VC4CL]( hello.exe): Precompilation complete with status: 0
[VC4CL]( hello.exe): [VC4CL] base=0x3fc00000, mem=0xb6f54000
[VC4CL]( hello.exe): [VC4CL] V3D base: 0xb6f54000
[VC4CL]( hello.exe): Compiling source with:
[VC4CL]( hello.exe): Compilation complete with status: 0
Dumping program binaries to /tmp/vc4cl-binary-942724790.bin
[VC4CL]( hello.exe): API call: cl_kernel clCreateKernel(cl_program 0x5a7744, const char* "square", cl_int* 0xbee75640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x5a8324 (cl_kernel)
[VC4CL]( hello.exe): API call: cl_mem clCreateBuffer(cl_context 0x5e587c, cl_mem_flags 4, size_t 4096, void* 0, cl_int* 0)
[VC4CL]( hello.exe): [VC4CL] Mailbox file descriptor opened: 4
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00030012 00000008 00000004 00000001 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00030012 00000008 80000004 80000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00010006 00000008 00000000 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00010006 00000008 80000008 38000000 08000000 00000000
[VC4CL]( hello.exe): Mailbox request: succeeded
[VC4CL]( hello.exe): Tracking live-time of object: 0x5da69c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00001000 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 00000004 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 00000004 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be8f8000 00000000 00000000
[VC4CL]( hello.exe): [VC4CL] base=0x3e8f8000, mem=0xb6f53000
[VC4CL]( hello.exe): Allocated 4096 bytes of buffer: handle 4, device address 0xbe8f8000, host address 0xb6f53000
[VC4CL]( hello.exe): API call: cl_mem clCreateBuffer(cl_context 0x5e587c, cl_mem_flags 2, size_t 4096, void* 0, cl_int* 0)
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00010006 00000008 00000000 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00010006 00000008 80000008 38000000 08000000 00000000
[VC4CL]( hello.exe): Mailbox request: succeeded
[VC4CL]( hello.exe): Tracking live-time of object: 0x5da97c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00001000 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 00000014 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 00000014 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be8f4000 00000000 00000000
[VC4CL]( hello.exe): [VC4CL] base=0x3e8f4000, mem=0xb6f52000
[VC4CL]( hello.exe): Allocated 4096 bytes of buffer: handle 20, device address 0xbe8f4000, host address 0xb6f52000
[VC4CL]( hello.exe): API call: cl_int clEnqueueWriteBuffer(cl_command_queue 0x5e441c, cl_mem 0x5da69c, cl_bool 1, size_t 0, size_t 4096, void* 0xbee74640, cl_uint 0, const cl_event* 0, cl_event* 0)
[VC4CL]( hello.exe): Tracking live-time of object: 0x5d08cc (cl_event)
[VC4CL]( hello.exe): Releasing live-time of object: 0x5d08cc (cl_event)
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x5a8324, cl_uint 0, size_t 4, const void* 0xbee73630)
[VC4CL]( hello.exe): Set kernel arg 0 for kernel 'square' to 0xbee73630 (6137500) with size 4
[VC4CL]( hello.exe): Kernel arg 0 for kernel 'square' is float* 'input' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 0 to pointer 0x0x5da690
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x5a8324, cl_uint 1, size_t 4, const void* 0xbee7362c)
[VC4CL]( hello.exe): Set kernel arg 1 for kernel 'square' to 0xbee7362c (6138236) with size 4
[VC4CL]( hello.exe): Kernel arg 1 for kernel 'square' is float* 'output' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 1 to pointer 0x0x5da970
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x5a8324, cl_uint 2, size_t 4, const void* 0xbee73628)
[VC4CL]( hello.exe): Set kernel arg 2 for kernel 'square' to 0xbee73628 (1024) with size 4
[VC4CL]( hello.exe): Kernel arg 2 for kernel 'square' is uint 'count' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 2 to scalar 1024
[VC4CL]( hello.exe): API call: cl_int clGetKernelWorkGroupInfo(cl_kernel 0x5a8324, cl_device_id 0x5da5f8, cl_kernel_work_group_info 4528, size_t 4, void* 0xbee73638, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clEnqueueNDRangeKernel(cl_command_queue 0x5e441c, cl_kernel 0x5a8324, cl_uint 1, const size_t* 0, const size_t* 0xbee7363c, const size_t* 0xbee73638, cl_uint 0, const cl_event* 0, cl_event* 0)
[VC4CL]( hello.exe): Tracking live-time of object: 0x5d08cc (cl_event)
[VC4CL]( hello.exe): API call: cl_int clFinish(cl_command_queue 0x5e441c)
[VC4CL](VC4CL Queue Han): Running kernel 'square' with 341 instructions...
Local sizes: 4 1 1 -> 4 QPUs
Global sizes: 1024 1 1 -> 256 work-groups (all at once)
[VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00001000 00001000 0000000c 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 0000000f 00001000 0000000c 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 0000000f 00000000 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be8f2000 00000000 00000000
[VC4CL](VC4CL Queue Han): [VC4CL] base=0x3e8f2000, mem=0xb6f51000
[VC4CL](VC4CL Queue Han): Allocated 4096 bytes of buffer: handle 15, device address 0xbe8f2000, host address 0xb6f51000
[VC4CL](VC4CL Queue Han): Reserving space for 12 stack-frames of 0 bytes each
[VC4CL](VC4CL Queue Han): Copied 2728 bytes of kernel code to device buffer
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 0(1024), 0(1), 0(1)
Local IDs (sizes): 0(4), 0(1), 0(1)
Group IDs (sizes): 0(256), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe8f8000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe8f4000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 1024
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 1(1024), 0(1), 0(1)
Local IDs (sizes): 1(4), 0(1), 0(1)
Group IDs (sizes): 0(256), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe8f8000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe8f4000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 1024
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 2(1024), 0(1), 0(1)
Local IDs (sizes): 2(4), 0(1), 0(1)
Group IDs (sizes): 0(256), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe8f8000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe8f4000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 1024
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 3(1024), 0(1), 0(1)
Local IDs (sizes): 3(4), 0(1), 0(1)
Group IDs (sizes): 0(256), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe8f8000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe8f4000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 1024
[VC4CL](VC4CL Queue Han): 10 parameters set.
[VC4CL](VC4CL Queue Han): Dumping kernel buffer to /tmp/vc4cl-dump-square-1833488263.bin
[VC4CL](VC4CL Queue Han): Running work-group 0, 0, 0
^C
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x1ec5e4, const char* "clIcdGetPlatformIDsKHR")
[VC4CL]( hello.exe): get extension function address: clIcdGetPlatformIDsKHR
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x1ec5e4, const char* "clGetPlatformInfo")
[VC4CL]( hello.exe): get extension function address: clGetPlatformInfo
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 0, cl_platform_id* 0, cl_uint* 0xbe9fdcc8)
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 1, cl_platform_id* 0x1e76b8, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x1ec5e4, cl_platform_info 2308, size_t 0, void* 0, size_t* 0xbe9fdc60)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x1ec5e4, cl_platform_info 2308, size_t 226, void* 0x1f8458, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x1ec5e4, cl_platform_info 2336, size_t 0, void* 0, size_t* 0xbe9fdc60)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x1ec5e4, cl_platform_info 2336, size_t 6, void* 0x1ec630, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetDeviceIDs(cl_platform_id 0x1ec5e4, cl_device_type 4, cl_uint 1, cl_device_id* 0xbe9fe5b4, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_context clCreateContext(const cl_context_properties* 0, cl_uint 1, const cl_device_id* 0xbe9fe5b4, void(CL_CALLBACK*)(const char* errinfo, const void* private_info, size_t cb, void* user_data) 0xbe9fdc98, void* 0, cl_int* 0xbe9fe640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x1f787c (cl_context)
[VC4CL]( hello.exe): API call: cl_command_queue clCreateCommandQueue(cl_context 0x1f787c, cl_device_id 0x1ec5f8, cl_command_queue_properties 0, cl_int* 0xbe9fe640)
[VC4CL]( hello.exe): Starting queue handler thread...
[VC4CL]( hello.exe): Tracking live-time of object: 0x1f641c (cl_command_queue)
[VC4CL]( hello.exe): API call: cl_program clCreateProgramWithSource(cl_context 0x1f787c, cl_uint 1, const char** 0x22080, const size_t* 0, cl_int* 0xbe9fe640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x1b9744 (cl_program)
[VC4CL]( hello.exe): API call: cl_int clBuildProgram(cl_program 0x1b9744, cl_uint 0, const cl_device_id* 0, const char* (null), void(CL_CALLBACK*)(cl_program program, void* user_data) 0xbe9fdd80, void* 0)
[VC4CL]( hello.exe): Precompiling source with:
Dumping program sources to /tmp/vc4cl-source-1365180540.cl
[VC4CL]( hello.exe): Dumping program IR to /tmp/vc4cl-ir-1540383426.ll
[VC4CL]( hello.exe): Precompilation complete with status: 0
[VC4CL]( hello.exe): [VC4CL] base=0x3fc00000, mem=0xb6fcf000
[VC4CL]( hello.exe): [VC4CL] V3D base: 0xb6fcf000
[VC4CL]( hello.exe): Compiling source with:
[VC4CL]( hello.exe): Compilation complete with status: 0
Dumping program binaries to /tmp/vc4cl-binary-304089172.bin
[VC4CL]( hello.exe): API call: cl_kernel clCreateKernel(cl_program 0x1b9744, const char* "square", cl_int* 0xbe9fe640)
[VC4CL]( hello.exe): Tracking live-time of object: 0x1ba324 (cl_kernel)
[VC4CL]( hello.exe): API call: cl_mem clCreateBuffer(cl_context 0x1f787c, cl_mem_flags 4, size_t 64, void* 0, cl_int* 0)
[VC4CL]( hello.exe): [VC4CL] Mailbox file descriptor opened: 4
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00030012 00000008 00000004 00000001 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00030012 00000008 80000004 80000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00010006 00000008 00000000 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00010006 00000008 80000008 38000000 08000000 00000000
[VC4CL]( hello.exe): Mailbox request: succeeded
[VC4CL]( hello.exe): Tracking live-time of object: 0x1ec69c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00000040 00001000 0000000c 00000000
[VC4CL]( hello.exe): ioctl_set_msg failed: -1
[VC4CL] Error in mbox_property: Connection timed out
terminate called after throwing an instance of 'std::system_error'
what(): Failed to set mailbox property: Connection timed out
terminate called recursively
- lsb_release
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 10 (buster)
Release: 10
Codename: buster
- uname -a
Linux raspberrypi 5.4.70-v7+ #1 SMP Fri Oct 9 20:59:56 CST 2020 armv7l GNU/Linux
- clinfo
pi@raspberrypi:~ $ sudo clinfo
Number of platforms 1
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Platform Vendor doe300
Platform Version OpenCL 1.2 VC4CL 0.4.9999 (b7fef0a)
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_vc4cl_performance_counters
Platform Extensions function suffix VC4CL
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices 1
Device Name VideoCore IV GPU
Device Vendor Broadcom
Device Vendor ID 0x14e4
Device Version OpenCL 1.2 VC4CL 0.4.9999 (b7fef0a)
Driver Version 0.4.9999
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile EMBEDDED_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 1
Max clock frequency 300MHz
Core Temperature (Altera) 41 C
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 12x12x12
Max work group size 12
Preferred work group size multiple 1
Preferred / native vector sizes
char 16 / 16
short 16 / 16
int 16 / 16
long 0 / 0
half 0 / 0 (n/a)
float 16 / 16
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 134217728 (128MiB)
Error Correction support No
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 64 bytes
Alignment of base address 512 bits (64 bytes)
Global Memory cache type Read/Write
Global Memory cache size 32768 (32KiB)
Global Memory cache line size 64 bytes
Image support No
Local memory type Global
Local memory size 134217728 (128MiB)
Max number of constant args 32
Max constant buffer size 134217728 (128MiB)
Max size of kernel argument 256
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
IL version SPIR-V_1.5 SPIR_1.2
SPIR versions 1.2
printf() buffer size 0
Built-in kernels (n/a)
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_nv_pragma_unroll cl_arm_core_id cl_ext_atomic_counters_32 cl_khr_initialize_memory cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16 cl_arm_integer_dot_product_accumulate_saturate_int8 cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_vc4cl_performance_counters
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) OpenCL for the Raspberry Pi VideoCore IV GPU
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [VC4CL]
clCreateContext(NULL, ...) [default] Success [VC4CL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.12
I change the SIZE to 16
Which size did you set to 16?
Also, how long did you approximately wait before terminating the process?
I change the SIZE to 16
Which size did you set to 16?
#define DATA_SIZE (16)
Also, how long did you approximately wait before terminating the process?
it's about 1 minute, because it takes too long.
I've rebooted and re-run it, and I got execution failed
this time I didn't terminate it. and I just wait for execution finish.
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x194b5e4, const char* "clIcdGetPlatformIDsKHR")
[VC4CL]( hello.exe): get extension function address: clIcdGetPlatformIDsKHR
[VC4CL]( hello.exe): API call: void* clGetExtensionFunctionAddressForPlatform(cl_platform_id 0x194b5e4, const char* "clGetPlatformInfo")
[VC4CL]( hello.exe): get extension function address: clGetPlatformInfo
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 0, cl_platform_id* 0, cl_uint* 0xbeb57c88)
[VC4CL]( hello.exe): API call: cl_int clIcdGetPlatformIDsKHR(cl_uint 1, cl_platform_id* 0x19466b8, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x194b5e4, cl_platform_info 2308, size_t 0, void* 0, size_t* 0xbeb57c20)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x194b5e4, cl_platform_info 2308, size_t 226, void* 0x1957458, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x194b5e4, cl_platform_info 2336, size_t 0, void* 0, size_t* 0xbeb57c20)
[VC4CL]( hello.exe): API call: cl_int clGetPlatformInfo(cl_platform_id 0x194b5e4, cl_platform_info 2336, size_t 6, void* 0x194b630, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clGetDeviceIDs(cl_platform_id 0x194b5e4, cl_device_type 4, cl_uint 1, cl_device_id* 0xbeb58574, cl_uint* 0)
[VC4CL]( hello.exe): API call: cl_context clCreateContext(const cl_context_properties* 0, cl_uint 1, const cl_device_id* 0xbeb58574, void(CL_CALLBACK*)(const char* errinfo, const void* private_info, size_t cb, void* user_data) 0xbeb57c58, void* 0, cl_int* 0xbeb58600)
[VC4CL]( hello.exe): Tracking live-time of object: 0x195687c (cl_context)
[VC4CL]( hello.exe): API call: cl_command_queue clCreateCommandQueue(cl_context 0x195687c, cl_device_id 0x194b5f8, cl_command_queue_properties 0, cl_int* 0xbeb58600)
[VC4CL]( hello.exe): Starting queue handler thread...
[VC4CL]( hello.exe): Tracking live-time of object: 0x195541c (cl_command_queue)
[VC4CL]( hello.exe): API call: cl_program clCreateProgramWithSource(cl_context 0x195687c, cl_uint 1, const char** 0x22080, const size_t* 0, cl_int* 0xbeb58600)
[VC4CL]( hello.exe): Tracking live-time of object: 0x1918744 (cl_program)
[VC4CL]( hello.exe): API call: cl_int clBuildProgram(cl_program 0x1918744, cl_uint 0, const cl_device_id* 0, const char* (null), void(CL_CALLBACK*)(cl_program program, void* user_data) 0xbeb57d40, void* 0)
[VC4CL]( hello.exe): Precompiling source with:
Dumping program sources to /tmp/vc4cl-source-1365180540.cl
[VC4CL]( hello.exe): Dumping program IR to /tmp/vc4cl-ir-1540383426.ll
[VC4CL]( hello.exe): Precompilation complete with status: 0
[VC4CL]( hello.exe): [VC4CL] base=0x3fc00000, mem=0xb6f85000
[VC4CL]( hello.exe): [VC4CL] V3D base: 0xb6f85000
[VC4CL]( hello.exe): Compiling source with:
[VC4CL]( hello.exe): Compilation complete with status: 0
Dumping program binaries to /tmp/vc4cl-binary-304089172.bin
[VC4CL]( hello.exe): API call: cl_kernel clCreateKernel(cl_program 0x1918744, const char* "square", cl_int* 0xbeb58600)
[VC4CL]( hello.exe): Tracking live-time of object: 0x1919324 (cl_kernel)
[VC4CL]( hello.exe): API call: cl_mem clCreateBuffer(cl_context 0x195687c, cl_mem_flags 4, size_t 64, void* 0, cl_int* 0)
[VC4CL]( hello.exe): [VC4CL] Mailbox file descriptor opened: 4
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00030012 00000008 00000004 00000001 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00030012 00000008 80000004 80000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00010006 00000008 00000000 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00010006 00000008 80000008 38000000 08000000 00000000
[VC4CL]( hello.exe): Mailbox request: succeeded
[VC4CL]( hello.exe): Tracking live-time of object: 0x194b69c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00000040 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 00000012 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 00000012 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be6f9000 00000000 00000000
[VC4CL]( hello.exe): [VC4CL] base=0x3e6f9000, mem=0xb6f84000
[VC4CL]( hello.exe): Allocated 64 bytes of buffer: handle 18, device address 0xbe6f9000, host address 0xb6f84000
[VC4CL]( hello.exe): API call: cl_mem clCreateBuffer(cl_context 0x195687c, cl_mem_flags 2, size_t 64, void* 0, cl_int* 0)
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00010006 00000008 00000000 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00010006 00000008 80000008 38000000 08000000 00000000
[VC4CL]( hello.exe): Mailbox request: succeeded
[VC4CL]( hello.exe): Tracking live-time of object: 0x194b97c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00000040 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 00000014 00001000 0000000c 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 00000014 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be6f8000 00000000 00000000
[VC4CL]( hello.exe): [VC4CL] base=0x3e6f8000, mem=0xb6f83000
[VC4CL]( hello.exe): Allocated 64 bytes of buffer: handle 20, device address 0xbe6f8000, host address 0xb6f83000
[VC4CL]( hello.exe): API call: cl_int clEnqueueWriteBuffer(cl_command_queue 0x195541c, cl_mem 0x194b69c, cl_bool 1, size_t 0, size_t 64, void* 0xbeb585c0, cl_uint 0, const cl_event* 0, cl_event* 0)
[VC4CL]( hello.exe): Tracking live-time of object: 0x19418cc (cl_event)
[VC4CL]( hello.exe): Releasing live-time of object: 0x19418cc (cl_event)
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x1919324, cl_uint 0, size_t 4, const void* 0xbeb58570)
[VC4CL]( hello.exe): Set kernel arg 0 for kernel 'square' to 0xbeb58570 (26523292) with size 4
[VC4CL]( hello.exe): Kernel arg 0 for kernel 'square' is float* 'input' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 0 to pointer 0x0x194b690
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x1919324, cl_uint 1, size_t 4, const void* 0xbeb5856c)
[VC4CL]( hello.exe): Set kernel arg 1 for kernel 'square' to 0xbeb5856c (26524028) with size 4
[VC4CL]( hello.exe): Kernel arg 1 for kernel 'square' is float* 'output' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 1 to pointer 0x0x194b970
[VC4CL]( hello.exe): API call: cl_int clSetKernelArg(cl_kernel 0x1919324, cl_uint 2, size_t 4, const void* 0xbeb58568)
[VC4CL]( hello.exe): Set kernel arg 2 for kernel 'square' to 0xbeb58568 (16) with size 4
[VC4CL]( hello.exe): Kernel arg 2 for kernel 'square' is uint 'count' with size 4
[VC4CL]( hello.exe): Setting kernel-argument 2 to scalar 16
[VC4CL]( hello.exe): API call: cl_int clGetKernelWorkGroupInfo(cl_kernel 0x1919324, cl_device_id 0x194b5f8, cl_kernel_work_group_info 4528, size_t 4, void* 0xbeb58578, size_t* 0)
[VC4CL]( hello.exe): API call: cl_int clEnqueueNDRangeKernel(cl_command_queue 0x195541c, cl_kernel 0x1919324, cl_uint 1, const size_t* 0, const size_t* 0xbeb5857c, const size_t* 0xbeb58578, cl_uint 0, const cl_event* 0, cl_event* 0)
[VC4CL]( hello.exe): Tracking live-time of object: 0x19418cc (cl_event)
[VC4CL]( hello.exe): API call: cl_int clFinish(cl_command_queue 0x195541c)
[VC4CL](VC4CL Queue Han): Running kernel 'square' with 341 instructions...
Local sizes: 4 1 1 -> 4 QPUs
Global sizes: 16 1 1 -> 4 work-groups (all at once)
[VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000024 00000000 0003000c 0000000c 0000000c 00001000 00001000 0000000c 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000024 80000000 0003000c 0000000c 80000004 00000013 00001000 0000000c 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000020 00000000 0003000d 00000008 00000004 00000013 00000000 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000020 80000000 0003000d 00000008 80000004 be6f4000 00000000 00000000
[VC4CL](VC4CL Queue Han): [VC4CL] base=0x3e6f4000, mem=0xb6f82000
[VC4CL](VC4CL Queue Han): Allocated 4096 bytes of buffer: handle 19, device address 0xbe6f4000, host address 0xb6f82000
[VC4CL](VC4CL Queue Han): Reserving space for 12 stack-frames of 0 bytes each
[VC4CL](VC4CL Queue Han): Copied 2728 bytes of kernel code to device buffer
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 0(16), 0(1), 0(1)
Local IDs (sizes): 0(4), 0(1), 0(1)
Group IDs (sizes): 0(4), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe6f9000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe6f8000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 16
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 1(16), 0(1), 0(1)
Local IDs (sizes): 1(4), 0(1), 0(1)
Group IDs (sizes): 0(4), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe6f9000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe6f8000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 16
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 2(16), 0(1), 0(1)
Local IDs (sizes): 2(4), 0(1), 0(1)
Group IDs (sizes): 0(4), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe6f9000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe6f8000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 16
[VC4CL](VC4CL Queue Han): Setting work-item infos:
1 dimensions with offsets: 0, 0, 0
Global IDs (sizes): 3(16), 0(1), 0(1)
Local IDs (sizes): 3(4), 0(1), 0(1)
Group IDs (sizes): 0(4), 0(1), 0(1)
[VC4CL](VC4CL Queue Han): Setting parameter 7 to buffer 0xbe6f9000
[VC4CL](VC4CL Queue Han): Setting parameter 8 to buffer 0xbe6f8000
[VC4CL](VC4CL Queue Han): Setting parameter 9 to scalar 16
[VC4CL](VC4CL Queue Han): 10 parameters set.
[VC4CL](VC4CL Queue Han): Dumping kernel buffer to /tmp/vc4cl-dump-square-1303455736.bin
[VC4CL](VC4CL Queue Han): Running work-group 0, 0, 0
[VC4CL](VC4CL Queue Han): Execution: failed
[VC4CL](VC4CL Queue Han): [VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000020 00000000 0003000e 00000008 00000004 00000013 00000000 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000020 80000000 0003000e 00000008 80000004 00000000 00000000 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer before: 00000020 00000000 0003000f 00000008 00000004 00000013 00000000 00000000
[VC4CL](VC4CL Queue Han): Mailbox buffer after: 00000020 80000000 0003000f 00000008 80000004 00000000 00000000 00000000
[VC4CL](VC4CL Queue Han): Deallocated 4096 bytes of buffer: handle 19, device address 0xbe6f4000, host address 0xb6f82000
[VC4CL]( hello.exe): Error in '/home/pi/opencl/VC4CL/src/CommandQueue.cpp:113', returning status -5
[VC4CL]( hello.exe): Releasing live-time of object: 0x19418cc (cl_event)
[VC4CL]( hello.exe): API call: cl_int clEnqueueReadBuffer(cl_command_queue 0x195541c, cl_mem 0x194b97c, cl_bool 1, size_t 0, size_t 64, void* 0xbeb58580, cl_uint 0, const cl_event* 0, cl_event* 0)
[VC4CL]( hello.exe): Tracking live-time of object: 0x19418cc (cl_event)
[VC4CL]( hello.exe): Releasing live-time of object: 0x19418cc (cl_event)
Computed '0/16' correct values!
[VC4CL]( hello.exe): API call: cl_int clReleaseMemObject(cl_mem 0x194b69c)
[VC4CL]( hello.exe): Releasing live-time of object: 0x194b69c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000e 00000008 00000004 00000012 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000e 00000008 80000004 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000f 00000008 00000004 00000012 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000f 00000008 80000004 00000000 00000000 00000000
[VC4CL]( hello.exe): Deallocated 64 bytes of buffer: handle 18, device address 0xbe6f9000, host address 0xb6f84000
[VC4CL]( hello.exe): API call: cl_int clReleaseMemObject(cl_mem 0x194b97c)
[VC4CL]( hello.exe): Releasing live-time of object: 0x194b97c (cl_mem)
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000e 00000008 00000004 00000014 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000e 00000008 80000004 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 0003000f 00000008 00000004 00000014 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 0003000f 00000008 80000004 00000000 00000000 00000000
[VC4CL]( hello.exe): Deallocated 64 bytes of buffer: handle 20, device address 0xbe6f8000, host address 0xb6f83000
[VC4CL]( hello.exe): API call: cl_int clReleaseProgram(cl_program 0x1918744)
[VC4CL]( hello.exe): API call: cl_int clReleaseKernel(cl_kernel 0x1919324)
[VC4CL]( hello.exe): Releasing live-time of object: 0x1919324 (cl_kernel)
[VC4CL]( hello.exe): Releasing live-time of object: 0x1918744 (cl_program)
[VC4CL]( hello.exe): API call: cl_int clReleaseCommandQueue(cl_command_queue 0x195541c)
[VC4CL]( hello.exe): Releasing live-time of object: 0x195541c (cl_command_queue)
[VC4CL]( hello.exe): API call: cl_int clReleaseContext(cl_context 0x195687c)
[VC4CL]( hello.exe): Releasing live-time of object: 0x195687c (cl_context)
[VC4CL]( hello.exe): Mailbox buffer before: 00000020 00000000 00030012 00000008 00000004 00000000 00000000 00000000
[VC4CL]( hello.exe): Mailbox buffer after: 00000020 80000000 00030012 00000008 80000004 80000000 00000000 00000000
[VC4CL]( hello.exe): [VC4CL] Mailbox file descriptor closed: 4
[VC4CL]( hello.exe): Stopping queue handler thread...
[VC4CL](VC4CL Queue Han): Queue handler thread stopped
It looks like the compiler generates invalid code here which hangs/runs in some infinte loop on the VC4 hardware. I will have to check the generated code to see, what goes wrong there.
I've upload compressed file to mega,
in which includes generated files (ll, cl, bin, dot) comes from /tmp/vc4*
YES, You got me, thanks
I realized that my raspberry pi was configured with raspi-config to use FKMS (Fake-KMS) for Gaming.
After I switch it to legacy one (without GL), this problem won't occurred.
I'll close this issue