
OpenCL memory abstraction library for memory transfer optimization

Primary language: C. License: MIT.


dlmCl is a C++ host-side library for OpenCL designed to optimize CPU-GPU data transfers by exploiting features of modern hardware memory architectures.
Its API provides cross-platform runtime architecture detection and a dedicated memory abstraction.
dlmCl is a support library rather than a complete computing framework, so it exposes low-level primitives that are meant to be used together with the raw OpenCL API.
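
To make the idea of architecture-dependent transfers concrete, here is a minimal sketch of the kind of decision such a library has to make. It is not dlmCl's implementation: `TransferStrategy` and `chooseStrategy` are hypothetical names, and the only input is the boolean that raw OpenCL 1.x reports through `clGetDeviceInfo` with `CL_DEVICE_HOST_UNIFIED_MEMORY`.

```cpp
// Hypothetical illustration only; these names are not part of the dlmCl API.
enum class TransferStrategy {
    // Device shares memory with the host: allocate with CL_MEM_ALLOC_HOST_PTR
    // and access the buffer via clEnqueueMapBuffer to avoid an extra copy.
    MapHostVisibleBuffer,
    // Device has its own memory: move data explicitly with
    // clEnqueueWriteBuffer / clEnqueueReadBuffer.
    ExplicitCopy
};

// hostUnifiedMemory is assumed to come from
// clGetDeviceInfo(device, CL_DEVICE_HOST_UNIFIED_MEMORY, ...).
TransferStrategy chooseStrategy(bool hostUnifiedMemory) {
    return hostUnifiedMemory ? TransferStrategy::MapHostVisibleBuffer
                             : TransferStrategy::ExplicitCopy;
}
```

A real detection routine would likely consider more than this single flag (vendor, device type, driver behaviour), so treat the sketch as a simplified model of the decision rather than a description of what dlmCl does internally.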


  • C++11 compiler
  • OpenCL 1.2 SDK
  • CMake 2.6


mkdir build && cd ./build
cmake ../ && make

Experimental results

Performance counters

  • Gramian matrix (computation-intensive task)
    • small task sizes: x1.3 speedup
    • large task sizes: ~x1.0 (no measurable speedup)
  • Element-wise array processing (data-intensive task)
    • uniform x1.7 speedup across task sizes
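
One plausible reading of these numbers is that transfer optimization matters most when little arithmetic is done per element moved between host and device. The sketch below is a rough model of that ratio, not a measurement from the benchmark. It assumes the Gramian task takes n vectors of dimension d (about n*n*d multiply-adds, n*d inputs and n*n outputs) and that the element-wise task performs one operation per element read and written. The function names are made up for illustration.

```cpp
#include <cstddef>

// Multiply-adds per element transferred for a Gramian matrix of n vectors
// of dimension d. Assumed model: n*n*d operations, n*d inputs, n*n outputs.
double gramOpsPerElement(std::size_t n, std::size_t d) {
    const double ops = static_cast<double>(n) * n * d;
    const double moved = static_cast<double>(n) * d + static_cast<double>(n) * n;
    return ops / moved;
}

// Element-wise processing: one operation per element, with each element
// transferred twice (once to the device, once back).
double elementwiseOpsPerElement() {
    return 1.0 / 2.0;
}
```

Under this model the ratio for the Gramian task grows with the problem size (2 for n = d = 4, 512 for n = d = 1024), so computation increasingly dominates and faster transfers have less effect, while the element-wise ratio stays constant at 0.5.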


// OpenCL objects and kernel params
size_t loc, n;
cl_device_id cl_device;
cl_command_queue queue;
cl_kernel kernel;

// initialize device & memory object
dlmcl::Device dev(cl_device);
dlmcl::Memobj mem = dlmcl::Memobj::getOptimal(dev, n, CL_MEM_READ_WRITE);

// fill input data through the host-side pointer
void* mem_ptr = mem->getHostMemory();

// run kernel
cl_mem mem_dev = mem->getDeviceMemory();
clSetKernelArg(kernel, 0, sizeof mem_dev, &mem_dev);
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, &loc, 0, NULL, NULL);

// gather output data
void* ptr = mem->getHostMemory();
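
The example above passes `n` as the global work size and `loc` as the local work size. Under OpenCL 1.2, `clEnqueueNDRangeKernel` fails with `CL_INVALID_WORK_GROUP_SIZE` when an explicit local size does not evenly divide the global size. A common workaround, sketched here with the hypothetical helper `roundUpToMultiple` (not part of dlmCl), is to pad the global size:

```cpp
#include <cstddef>

// Round n up to the nearest multiple of loc so that it can be used as a
// global work size together with an explicit local work size.
// Returns n unchanged when loc is zero (no local size to align to).
std::size_t roundUpToMultiple(std::size_t n, std::size_t loc) {
    if (loc == 0) {
        return n;
    }
    const std::size_t rem = n % loc;
    return rem == 0 ? n : n + (loc - rem);
}
```

With a padded global size, the kernel itself needs a bounds check such as `if (get_global_id(0) < n)` so that the extra work-items do not touch memory beyond the buffer.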