juglab/labkit-pixel-classification

Compatibility with Intel GPUs

haesleinhuepf opened this issue · 4 comments

Hi @maarzt ,

I'm just testing on Windows 10 / Intel UHD 620 and found some issues in this benchmark:
https://github.com/maarzt/imglib2-trainable-segmentation/blob/clij-benchmark/src/test/java/net/imglib2/trainable_segmention/performance/GpuCpuComparisonBenchmark.java

It crashes on my GPU with this error:

2:570:35: error: access qualifier can only be used for pipe and image type
__kernel void separable_operation(OUTPUT_IMAGE_PARAMETER, INPUT_IMAGE_PARAMETER, __constant float* kernelValues)
                                  ^
2:518:33: note: expanded from macro 'OUTPUT_IMAGE_PARAMETER'
#define OUTPUT_IMAGE_PARAMETER  __write_only __global float* output
                                ^
2:570:59: error: access qualifier can only be used for pipe and image type
__kernel void separable_operation(OUTPUT_IMAGE_PARAMETER, INPUT_IMAGE_PARAMETER, __constant float* kernelValues)
                                                          ^
2:555:32: note: expanded from macro 'INPUT_IMAGE_PARAMETER'
#define INPUT_IMAGE_PARAMETER   __read_only __global float* input
                                ^
Error when trying to create kernel separable_operation
net.haesleinhuepf.clij.clearcl.exceptions.OpenCLException: OpenCL error: -45 -> CL_INVALID_PROGRAM_EXECUTABLE
	at net.haesleinhuepf.clij.clearcl.backend.BackendUtils.checkOpenCLErrorCode(BackendUtils.java:352)
	at net.haesleinhuepf.clij.clearcl.backend.jocl.ClearCLBackendJOCL.lambda$getKernelPeerPointer$19(ClearCLBackendJOCL.java:601)
	at net.haesleinhuepf.clij.clearcl.backend.BackendUtils.checkExceptions(BackendUtils.java:156)
	at net.haesleinhuepf.clij.clearcl.backend.jocl.ClearCLBackendJOCL.getKernelPeerPointer(ClearCLBackendJOCL.java:593)
	at net.haesleinhuepf.clij.clearcl.ClearCLCompiledProgram.createKernel(ClearCLCompiledProgram.java:137)
	at net.haesleinhuepf.clij.clearcl.ClearCLProgram.createKernel(ClearCLProgram.java:685)
	at net.haesleinhuepf.clij.clearcl.util.CLKernelExecutor.getKernel(CLKernelExecutor.java:353)
	at net.haesleinhuepf.clij.clearcl.util.CLKernelExecutor.enqueue(CLKernelExecutor.java:229)
	at net.haesleinhuepf.clij2.CLIJ2.lambda$executeSubsequently$0(CLIJ2.java:236)
	at net.haesleinhuepf.clij.clearcl.util.ElapsedTime.measure(ElapsedTime.java:97)
	at net.haesleinhuepf.clij.clearcl.util.ElapsedTime.measure(ElapsedTime.java:28)
	at net.haesleinhuepf.clij2.CLIJ2.executeSubsequently(CLIJ2.java:227)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.lambda$execute$1(DefaultGpuApi.java:57)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.handleOutOfMemoryException(DefaultGpuApi.java:65)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.execute(DefaultGpuApi.java:56)
	at net.imglib2.trainable_segmention.gpu.api.GpuScope.execute(GpuScope.java:42)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuSeparableOperation.run(GpuSeparableOperation.java:76)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuKernelConvolution.convolve(GpuKernelConvolution.java:61)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuKernelConvolution.apply(GpuKernelConvolution.java:47)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuConcatenatedNeighborhoodOperation.apply(GpuConcatenatedNeighborhoodOperation.java:49)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuGaussContent.load(GpuGaussContent.java:54)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuComputeCache$CacheEntry.get(GpuComputeCache.java:90)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuComputeCache.get(GpuComputeCache.java:51)
	at net.imglib2.trainable_segmention.gpu.GpuFeatureInput.gauss(GpuFeatureInput.java:41)
	at net.imglib2.trainable_segmention.pixel_feature.filter.gauss.SingleGaussianBlurFeature.apply(SingleGaussianBlurFeature.java:51)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.lambda$apply$2(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.genericApply(FeatureJoiner.java:71)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.apply(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.AbstractGroupFeatureOp.apply(AbstractGroupFeatureOp.java:57)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.lambda$apply$2(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.genericApply(FeatureJoiner.java:71)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.apply(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.calculator.FeatureCalculator.applyUseGpu(FeatureCalculator.java:152)
	at net.imglib2.trainable_segmention.classification.Segmenter.segmentGpu(Segmenter.java:129)
	at net.imglib2.trainable_segmention.classification.Segmenter.segment(Segmenter.java:109)
	at net.imglib2.trainable_segmention.performance.ParallelSegmentationTask.lambda$run$0(ParallelSegmentationTask.java:44)
	at net.imglib2.cache.img.LoadedCellCacheLoader.get(LoadedCellCacheLoader.java:91)
	at net.imglib2.cache.img.LoadedCellCacheLoader.get(LoadedCellCacheLoader.java:51)
	at net.imglib2.cache.ref.SoftRefLoaderCache.get(SoftRefLoaderCache.java:101)
	at net.imglib2.cache.util.LoaderCacheAsCacheAdapter.get(LoaderCacheAsCacheAdapter.java:30)
	at net.imglib2.cache.util.CacheAsUncheckedCacheAdapter.get(CacheAsUncheckedCacheAdapter.java:28)
	at net.imglib2.img.cell.LazyCellImg$LazyCells.get(LazyCellImg.java:104)
	at net.imglib2.img.list.AbstractLongListImg$LongListRandomAccess.get(AbstractLongListImg.java:274)
	at net.imglib2.img.cell.CellRandomAccess.getCell(CellRandomAccess.java:136)
	at net.imglib2.img.cell.CellRandomAccess.updatePosition(CellRandomAccess.java:474)
	at net.imglib2.img.cell.CellRandomAccess.<init>(CellRandomAccess.java:130)
	at net.imglib2.img.cell.AbstractCellImg.randomAccess(AbstractCellImg.java:104)
	at net.imglib2.trainable_segmention.Utils.lambda$populateCellImg$7(Utils.java:355)
	at preview.net.imglib2.parallel.DefaultTaskExecutor.lambda$forEach$0(DefaultTaskExecutor.java:101)
	at java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1424)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

When I remove the __write_only and __read_only qualifiers from these macros, the benchmark hangs at this point for quite some time (minutes):

net.imglib2.trainable_segmention.performance.GpuCpuComparisonBenchmark
[WARNING] 1 exceptions occurred during plugin discovery.
log4j:WARN No appenders could be found for logger (org.bushe.swing.event.EventService).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
GPU: Intel(R) UHD Graphics 620
Time copying: 107.304 ms

Edit: It's not "hanging" - it just takes a minute or so before it continues...
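
For context on the compiler error above: in OpenCL C, the access qualifiers __read_only and __write_only are only valid on image (and pipe) types, which is why they are rejected on the `__global float*` buffer parameters. A minimal sketch of how the macro definitions could be generated per storage kind follows. The class, enum, and method names are hypothetical and not part of the repository, and `const` on the input buffer is an optional way to express read-only intent, not something the original macros contain.

```java
public class KernelParameterMacros {

    /** Storage kinds a kernel argument may use (illustrative names). */
    public enum Storage { BUFFER, IMAGE2D }

    /**
     * Builds the "#define" line for the output parameter.
     * The access qualifier is emitted only for the image type;
     * a buffer gets a plain __global pointer.
     */
    public static String outputParameter(Storage storage) {
        switch (storage) {
            case IMAGE2D:
                return "#define OUTPUT_IMAGE_PARAMETER __write_only image2d_t output";
            default:
                return "#define OUTPUT_IMAGE_PARAMETER __global float* output";
        }
    }

    /**
     * Builds the "#define" line for the input parameter.
     * For a buffer, "const" marks the data as read-only without
     * using an image-only access qualifier.
     */
    public static String inputParameter(Storage storage) {
        switch (storage) {
            case IMAGE2D:
                return "#define INPUT_IMAGE_PARAMETER __read_only image2d_t input";
            default:
                return "#define INPUT_IMAGE_PARAMETER __global const float* input";
        }
    }

    public static void main(String[] args) {
        // Print the buffer variants, which avoid the qualifiers the compiler rejected.
        System.out.println(outputParameter(Storage.BUFFER));
        System.out.println(inputParameter(Storage.BUFFER));
    }
}
```

Keeping the qualifier decision in one place like this means the same kernel source could be compiled for either buffers or images without the qualifiers leaking onto pointer parameters.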

Note, I had to make the OPENCL_DEVICE_NAME configurable in order to test this. It might make sense to make this available to end users as well:
https://github.com/haesleinhuepf/imglib2-trainable-segmentation/commit/d32ab1af07d0bd48ab9bcac28736f55231b3f91a
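
One possible way to make the device name configurable without a UI is a Java system property. The sketch below is only an illustration and is not what the linked commit does: the property name `labkit.opencl.device`, the class, and the method names are all assumptions, and the device list in `main` is made-up sample data (only the Intel name appears in the log above).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class OpenClDeviceSelection {

    /** Hypothetical system property name, chosen for this sketch only. */
    public static final String PROPERTY = "labkit.opencl.device";

    /** Returns the configured name fragment, or "" (meaning "any device") if unset. */
    public static String configuredDeviceName() {
        return System.getProperty(PROPERTY, "").trim();
    }

    /**
     * Picks the first available device whose name contains the fragment,
     * ignoring case. An empty fragment matches the first device in the list.
     */
    public static Optional<String> pickDevice(List<String> available, String fragment) {
        String needle = fragment.toLowerCase();
        return available.stream()
                .filter(name -> name.toLowerCase().contains(needle))
                .findFirst();
    }

    public static void main(String[] args) {
        // Sample device names for illustration; a real list would come from the OpenCL runtime.
        List<String> devices = Arrays.asList("GeForce RTX 2070", "Intel(R) UHD Graphics 620");
        System.out.println(pickDevice(devices, configuredDeviceName()).orElse("no matching device"));
    }
}
```

With this sketch, starting the JVM with `-Dlabkit.opencl.device=UHD` would select the Intel device from the sample list, and the chosen name could then be passed to whatever call creates the GPU context.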

Let me know if I can help with further information.

Cheers,
Robert

THANK YOU for testing this!

  • I removed the __read_only and __write_only flags as you suggested.
  • The "hanging" sometimes occurs for me too; not for a minute, but still noticeably. I will try to fix that first, and then I will ask you whether the problem still occurs.
  • Making the device name configurable would be great. Ideally it should be configurable in the UI. I'm undecided: What's the best place for such settings in ImageJ / Labkit?

Does Labkit have its own config dialog? It would fit there, right? Next to the "Use GPU" checkbox ;-)

No, Labkit doesn't have a config dialog yet. Is there an "ImageJ / scijava way" for config dialogs?

@maarzt wrote:

Is there an "ImageJ / scijava way" for config dialogs?

Yes, there are OptionsPlugins. See this one in TrackMate for a simple example.