current example of R2R FFT?
tlambert03 opened this issue · 7 comments
From what I understand, FFT
only does C2C... which is fine. I'm looking to create a function that mimics the api of scipy.fft.rfft in terms of inputs and outputs (even if the actual FFT operates on complex). I found this example transformation from 2013: #9 (comment) ... and can work off that, but I'm wondering if you have any more recent tips or examples on how you'd accomplish that?
thanks!
edit: I suspect I can use split_complex
transformation ... but still learning the patterns.
This example should work: https://github.com/fjarri/reikna/blob/develop/examples/demo_real_to_complex_fft.py . It only describes how to make real-to-complex FFT, but making real-to-real by analogy should be relatively straightforward (if it isn't, do comment here - I know Reikna's API may be unfriendly sometimes).
As for split_complex
, it does not discard the complex part; you would want to use it if you wanted to switch from interleaved representation to keeping real and complex parts in separate arrays. I think the cast
transformation should work here, both ways (real to complex and complex to real)
Thanks for the prompt reply @fjarri! much appreciated...
I'm still struggling to connect the parameters correctly (sorry :/). here's what I've tried:
this works great for R2C, as shown in the example:
def tform_r2c(arr):
complex_dtype = cluda.dtypes.complex_for(arr.dtype)
return Transformation(
[
Parameter("output", Annotation(Type(complex_dtype, arr.shape), "o")),
Parameter("input", Annotation(arr, "i")),
],
"""
${output.store_same}(
COMPLEX_CTR(${output.ctype})(
${input.load_same},
0));
""",
)
r2c = tform_r2c(arr) # arr is float32
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
# good to go
btw, if I try to use cast
instead of tform_r2c
above, I get a stride mismatch (I suspect i haven't done it correctly):
r2c = cast(arr, cluda.dtypes.complex_for(arr.dtype))
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:477: in <listcomp>
self._get_kernel_argobject(name, self.root_parameters[name].annotation)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:463: in _get_kernel_argobject
store_idx, store_same, store_combined_idx = self._get_connection_modules(
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation.py:440: in _get_connection_modules
m_idx = module_leaf_macro(output, param)
../../../miniconda3/envs/clij/lib/python3.8/site-packages/reikna/core/transformation_modules.py:162: in module_leaf_macro
index_expr=flat_index_expr(param)))
E ValueError: Some of the strides (4096, 4)are not multiples of the itemsize8
On the output, I tried this C2R transform:
def tform_c2r(arr):
"""Transform a complex array to a real one by discarding the imaginary part."""
real_dtype = cluda.dtypes.real_for(arr.dtype)
return Transformation(
[
Parameter("output", Annotation(Type(real_dtype, arr.shape), "o")),
Parameter("input", Annotation(arr, "i")),
],
"""
${output.store_same}(${input.load_same}.x);
""",
)
and I suspect I'm not connecting it right:
r2c = tform_r2c(arr)
plan = FFT(r2c.output, axes=axes)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
c2r = tform_c2r(plan.parameter.output)
plan.parameter.output.connect(c2r, c2r.output, new_output=c2r.input)
ValueError: Incompatible types of the transformation parameter 'output' (Type(float32, shape=(128, 128))) and the node 'output' (Type(complex64, shape=(128, 128)))
(intuitively it feels like I've got the output connection backwards... but I can't figure out the syntax to fix it)
Thanks for trying out cast()
, I think you discovered a bug there. Or, rather, there's a bug in Type.with_dtype()
, which does not change the strides when the itemsize changes. By the way, sometimes during a debug it may be convenient to print out the .signature
property of a transformation or a computation, to see what they actually expect as arguments.
You were on the right track with the manual transformation, except you got the connection bit wrong. It should be:
plan.parameter.output.connect(c2r, c2r.input, new_output=c2r.output)
That is, you're connecting c2r.input
to plan.output
.
thanks a lot! got it working... full R2R example in details below for anyone who stumbles on this:
from reikna.transformations import Annotation, Parameter, Transformation
import reikna.cluda as cluda
from reikna.core import Type
from reikna.fft import FFT
import numpy as np
api = cluda.ocl_api()
thr = api.Thread.create()
def tform_r2c(arr):
complex_dtype = cluda.dtypes.complex_for(arr.dtype)
return Transformation(
[
Parameter("output", Annotation(Type(complex_dtype, arr.shape), "o")),
Parameter("input", Annotation(arr, "i")),
],
"""
${output.store_same}(
COMPLEX_CTR(${output.ctype})(
${input.load_same},
0));
""",
)
def tform_c2r(arr):
"""Transform a complex array to a real one by discarding the imaginary part."""
real_dtype = cluda.dtypes.real_for(arr.dtype)
return Transformation(
[
Parameter("output", Annotation(Type(real_dtype, arr.shape), "o")),
Parameter("input", Annotation(arr, "i")),
],
"""
${output.store_same}(${input.load_same}.x);
""",
)
arr = np.random.rand(8).astype(np.float32)
r2c = tform_r2c(arr)
plan = FFT(r2c.output)
plan.parameter.input.connect(r2c, r2c.output, new_input=r2c.input)
c2r = tform_c2r(plan.parameter.output)
plan.parameter.output.connect(c2r, c2r.input, new_output=c2r.output)
planc = plan.compile(thread=thr)
arr_dev = thr.to_device(arr)
out_dev = thr.array(arr.shape, np.float32)
planc(out_dev, arr_dev)
assert np.allclose(out_dev.get(), np.fft.fft(arr).real)
<\details>
I'll close this issue ... but if I may ask you one more question:
as far as you know, are there any other python opencl FFT implementations out there? I have everything i need in reikna, so I'm not looking for an alternative... just want to confirm my perception that reikna's OCL implementation is still the "go-to" source for a GPU FFT without using CUDA)
The only one I know of is https://github.com/geggo/gpyfft, which is based on https://github.com/clMathLibraries/clFFT. I have not used it myself though, so I don't know how the performance compares (but I wouldn't be surprised if it's better).
thanks again for your help