OpenCL bindings for Chicken Scheme
These bindings target OpenCL version 1.1 for wide support. It should be fairly easy to add support to newer OpenCL APIs.
Hosted here.
Most Scheme procedures map 1-1 to OpenCL C procedures, with some
naming conventions changed. All procedures start with the type of
object they work on. For example, clCreateBuffer
becomes
buffer-create
.
All OpenCL objects are wrapped in a record structure. All finalizer
keyword arguments default to using set-finalizer!
to call the
appropriate *-release!
procedure when the object is freed by the
CHICKEN GC. This automatic memory management should be sufficient for
many use cases.
For a large number of OpenCL objects it may be a good idea to call the
*-release!
procedures explicitly. All the *-release!
procedures
are idempotent, so it is even safe to use them on objects that have
finalizers attached. Also, (gc #t)
can sometimes be used to free
ObjecCL objects no longer needed by the application, freeing OpenCL
resources sooner.
If any of the OpenCL C procedures return an error code other than
CL_SUCCESS
, an error is signaled, hopefully with a meaningful error
message.
You must have a working OpenCL installed on your system for this egg
to work. clinfo
is a useful tool to diagnose and verify
this. Environment variables like OCL_ICD_VENDORS
may be useful to
out-source the device-picking process:
~> csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">
#<cl_platform "Clover">
#<cl_platform "Portable Computing Language">)
~> env OCL_ICD_VENDORS=intel.icd csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">)
There is unfortunately a bit boilerplate to get started, but luckily it's less than the equivalent C code. This code snippet might be useful:
(import srfi-4 opencl)
(define device (car (flatten (map platform-devices (platforms)))))
(define context (context-create device))
(define cq (command-queue-create context device))
(define buffer (buffer-create (f32vector -1 -1 -1 -1 -1) cq))
(define kernel (kernel-create (program-build (program-create context "
__kernel void foo(__global float *out) {
out[get_global_id(0)] = get_global_id(0) * 10;
}
") device) "foo"))
(kernel-arg-set! kernel 0 buffer)
(kernel-enqueue kernel cq (list 4))
(print (buffer-read buffer cq))
When run, it should output #f32(0.0 10.0 20.0 30.0 -1.0)
. The last
-1
is leftovers from buffer creation. More examples can be found in
the examples folder.
[procedure] (platforms) => (list-of cl_platform)
List all the OpenCL platforms on the system.
[procedure] (platform-extensions platform) => string
[procedure] (platform-name platform) => string
[procedure] (platform-profile platform) => string
[procedure] (platform-vendor platform) => string
[procedure] (platform-version platform) => string
Get various information for a given platform. See clGetPlatformIDs
[procedure] (platform-devices platform #!optional device-type) => (list-of cl_device)
List the available devices for platform
. device-type
can
optionally be supplied as a filter, and symbol among cpu
gpu
all
accelerator
default
.
[procedure] (device-info device) => alist
[procedure] (device-address-bits device) => number
[procedure] (device-name device) => string
[procedure] (device-vendor device) => string
[procedure] (device-version device) => string
Get various information of device
. See this
page.
[procedure] (context-create devices #!key (finalizer #t)) => cl_context
[procedure] (context-release!) => void
Create or release an OpenCL context. See this
page. Unfurtunately,
currently devices
must be a single cl_device
. In the future, a
list of cl_device
s might be implemented.
[procedure] (command-queue-create context device #!key out-of-order profile (finalizer #t)) => cl_command_queue
[procedure] (command-queue-release! command_queue) => void
Create or release a cl_command-queue
. out-of-order
is a boolean
flag and is #f
by default. See this
page.
[procedure] (command-queue-context context) => cl_context
Get back the cl_context
passed into command-queue-create
.
[procedure] (buffer-create s c #!key (type #f) (flags 0) (finalizer #t)) => cl_mem
[procedure] (buffer-release! buffer) => void
Create or release an OpenCL buffer object, a memory chunk that is usually stored on the target device. See this page.
If c
is a cl_context
, s
(s for source) must be an integer
representing the buffer size in bytes.
If c
is a cl_command-queue
, s
must be a srfi-4 vector or a
blob. The content of s
is copied to the newly created buffer
buffer
using buffer-write
. The size in bytes of this buffer and
the size in bytes of s
will be equal.
type
defaults to the type of s
for srfi-4 vectors, or blob
if
s
is an integer. If s
is an integer, type
can be explicitly
supplied. See buffer-type
for valid values.
Support for flags
is currently limited, and must be a number
corresponding to the C API. The default is CL_MEM_READ_WRITE
.
[procedure] (buffer-write buffer cq src #!key (offset 0)) => cl_mem
Copy the content of src
to buffer
. See this
page. Only
blocking writes are supported.
cq
must be a valid cl_command-queue
. src
must be a srfi-4 vector
or a blob. offset
is the offset in bytes in the buffer object to
write to.
[procedure] (buffer-read buffer cq #!key type (dst #f) (byte-offset 0) (bytes #f)) => srfi-4-vector
Copy the content of buffer
into host memory. The returned object is
dst
if supplied, otherwise a newly allocated srfi-4 vector or
blob. Which vector type depends on type
, which defaults to
(buffer-type buffer)
. If supplied, dst
must be a srfi-4 vector or
blob and its content will be overwritten with data from buffer
.
If supplied, bytes
must be the number of bytes to copy. If supplied,
byte-offset
must be the number of bytes to skip in buffer
.
[procedure] (buffer-size buffer) => integer
Return the size of buffer
in bytes.
[procedure] (buffer-type buffer) => symbol
Return the type tag symbol. This is stored in the cl_mem
Scheme
record, not in the OpenCL cl_mem
object. It is used by buffer-read
to construct a suitable srfi-4 vector. Valid types are u8
s8
u16
s16
u32
s32
u64
s64
f32
f64
.
[procedure] (program-create context source #!key (finalizer #t)) => cl_program
[procedure] (program-release! program) => void
Create or release a OpenCL with clCreateProgramWithSource.
context
must be a valid cl_context
. source
must be OpenCL C
source code as a string. Note that the returned program
must be
built before it can be used to create kernels.
[procedure] (program-build program devices #!key options) => program
devices
must be a valid cl_device
. Unfortunately, only a single
device is currently supported. If the build process fails, an error is
signaled with the content of (program-build-log program)
which will
hopefully contain useful compiler errors.
If supplied, options
must be a string as specified in
clBuildProgram.
[procedure] (program-build-log program) => string
[procedure] (program-build-options program) => string
Retrieve various information about program
, see this
page.
[procedure] (kernel-create program name #!key (finalizer #t)) => cl_kernel
[procedure] (kernel-release! kernel) => void
Create or release a cl_kernel
. See this page.
name
must be a string, the name of a __kernel
function inside
program
.
[procedure] (kernel-arg-set! kernel idx value) => void
Calls
clSetKernelArg
. idx
must be an integer, where 0 is the first kernel argument. value
must
be a srfi-4 vector for by-value argument types, or cl_buffer
for
arguments pointers.
The type of the kernel argument (ie. float2
) must match the srfi-4
vector type (ie. (f32vector 1 1)
). It is unfortunately not possible
to enforce checks for this. The srfi-4 vector's length must match the
kernel argument. For example, the arguments to __kernel foo(long4 a, float *result)
could be initialized like this:
(kernel-arg-set! kernel 0 (s64vector 1 2 3 4))
(kernel-arg-set! kernel 1 (buffer-create (* 4 1000) ctx type: 'f32))
I don't know if you must call this for all arguments each time.
[procedure] (kernel-enqueue kernel cq global-work-sizes #!key event wait global-work-offsets local-work-sizes) => (or cl_event #f)
Enqueues kernel
for execution on cq
as described
here.
global-work-sizes
must be a list of integers, each representing a
work size for each dimension. On most platforms, the maximum number of
dimensions (and thus the length of global-work-sizes
) is 3. The
minimum is always 1.
The numbers supplied here will be reflected inside kernel
as
get_global_size(n)
. Note that total number of worker-items executed should be (foldl * 1 global-work-sizes)
.
Unlike most of the other procedures, kernel-enqueue
is non-blocking
and may return before kernel
is finished executing. If event
is
supplied and is #t
or a cl_event
, the returned event
can be used
to query the kernel execution status. If event
is not supplied,
kernel-enqueue
returns #f
.
The event API is a bit experimental and difficult to test.
[procedure] (event-create context) => cl_event
[procedure] (event-release!) => void
Create or release a user-event cl_event
, see this
page.
[procedure] (event-complete! event #!optional (status 'complete)) => cl_event
Set the execution status of event
, see
clSetUserEventStatus.
[procedure] (event-status event) => symbol
Retrieve event
status, one of: queued
submitted
running
complete
.
[procedure] (cl_command-queue? x) => bool
[procedure] (cl_context? x) => bool
[procedure] (cl_device? x) => bool
[procedure] (cl_event? x) => bool
[procedure] (cl_kernel? x) => bool
[procedure] (cl_mem? x) => bool
[procedure] (cl_platform? x) => bool
[procedure] (cl_program? x) => bool
Predicates for record types. buffer
object are cl_mem?
records.
- automatic memory management of OpenCL objects
- list platforms and devices
- create context and command queue
- create, compile and run programs from source-code
- create, compile and run programs from IR/Binary
- build options (in
clBuildProgram
) - launch kernels (global-size, local-size and global-offset)
- kernel name
-
clCreateKernelsInProgram
- create, read and write buffer objects
- set read-only and write-only buffer flags
- create, read and write image objects
- copy buffers
- copy images
- samplers
- sub-buffers
- non-blocking API
- event support for kernel and user (test fails on my Clover platform)
- event support for buffer commands
- event callbacks
- event blocking (
clWaitForEvents
) - OpenCL <-> OpenGL interop
So far everything is blocking. Implementing the non-blocking API would complicate things as the Chicken GC might get in the way.