Complex function names are no longer being transformed to the expected c names.
dham opened this issue · 8 comments
This commit: 64a8a8c
makes the following change:
if dtype.kind == "c" or name in ["real", "imag", "abs"]:
becomes:
if dtype.kind == "c":
The effect of this is that real(foo)
is no longer transformed to creal(foo)
in the generated code, which breaks Firedrake.
Hi David!
I just checked the generated code for loopy@main and it looks like it does transform it to creal
, etc.
(py311_env) $ pip freeze | grep loopy
-e git+https://github.com/inducer/loopy.git@1506d5476f08a5e8ff0375c60f0e2f463e627bb0#egg=loopy
(py311_env) $ cat hello.py
import loopy as lp
import numpy as np
knl = lp.make_kernel(
"{[i]: 0<=i<10}",
"""
y1[i] = abs(x32[i])
y2[i] = real(x32[i])
y3[i] = imag(x32[i])
y4[i] = conj(x32[i])
y5[i] = abs(x64[i])
y6[i] = real(x64[i])
y7[i] = imag(x64[i])
y8[i] = conj(x64[i])
""",
target=lp.CTarget(),
seq_dependencies=True)
knl = lp.add_dtypes(knl, {"x32": np.complex64, "x64": np.complex128})
print(lp.generate_code_v2(knl).device_code())
(py311_env) $ python hello.py
#include <complex.h>
#include <stdint.h>
#include <stdbool.h>
void loopy_kernel(float complex const *__restrict__ x32, double complex const *__restrict__ x64, float complex *__restrict__ y1, float complex *__restrict__ y2, float complex *__restrict__ y3, float complex *__restrict__ y4, double *__restrict__ y5, double *__restrict__ y6, double *__restrict__ y7, double complex *__restrict__ y8)
{
for (int32_t i = 0; i <= 9; ++i)
{
y1[i] = cabsf(x32[i]);
y2[i] = crealf(x32[i]);
y3[i] = cimagf(x32[i]);
y4[i] = cconjf(x32[i]);
y5[i] = cabs(x64[i]);
y6[i] = creal(x64[i]);
y7[i] = cimag(x64[i]);
y8[i] = conj(x64[i]);
}
}
OK so somehow in Firedrake that transformation isn't happening and we get linker fails because real
is an unknown system. Where in loopy does the c
now get added? (I will have to go through and try to work out why it's not happening for us).
The "c" gets prefixed in the code you pointed at:
loopy/loopy/target/c/__init__.py
Lines 550 to 552 in 1506d54
Actually, I can reproduce the issue. If you change the types line to:
knl = lp.add_dtypes(knl, {"x32": np.float32, "x64": np.float64})
then the c
is not added. This is a corner case where we're generating code which applies those functions to real values. That's why that special case was added.
Hi David,
Even in the case of real-valued arguments, we don't need "cabs", we just need call to fabs and cast to double complex
. But the generated code doesn't include "math.h" which causes a bunch of compiler warnings. (seeing these compiler warnings is definitely a loopy-bug and I can patch this so that loopy emits include <math.h>
for these cases)
A reproducer to show that except the includes, everything looks generally correct to me:
(py311_env) $ cat hello.py
import loopy as lp
import numpy as np
knl = lp.make_kernel(
"{[i]: 0<=i<5}",
"""
y1[i] = abs(x64[i])
""",
target=lp.ExecutableCTarget(),
lang_version=(2018, 2))
knl = lp.add_dtypes(knl, {"x64": np.float64,
"y1": np.complex128,})
print(lp.generate_code_v2(knl).device_code())
x_ary = np.random.rand(5)
_, (y1_ary,) = knl(x64=x_ary)
print(x_ary)
print(y1_ary)
(py311_env) $ python hello.py
#include <complex.h>
#include <stdint.h>
#include <stdbool.h>
void loopy_kernel(double const *__restrict__ x64, double complex *__restrict__ y1)
{
for (int32_t i = 0; i <= 4; ++i)
y1[i] = (double complex) (fabs(x64[i]));
}
/tmp/tmp_loopyx04yoznw/code.c: In function ‘loopy_kernel’:
/tmp/tmp_loopyx04yoznw/code.c:8:31: warning: implicit declaration of function ‘fabs’ [-Wimplicit-function-declaration]
8 | y1[i] = (double complex) (fabs(x64[i]));
| ^~~~
/tmp/tmp_loopyx04yoznw/code.c:4:1: note: include ‘<math.h>’ or provide a declaration of ‘fabs’
3 | #include <stdbool.h>
+++ |+#include <math.h>
4 |
/tmp/tmp_loopyx04yoznw/code.c:8:31: warning: incompatible implicit declaration of built-in function ‘fabs’ [-Wbuiltin-declaration-mismatch]
8 | y1[i] = (double complex) (fabs(x64[i]));
| ^~~~
/tmp/tmp_loopyx04yoznw/code.c:8:31: note: include ‘<math.h>’ or provide a declaration of ‘fabs’
[0.40470538 0.73333579 0.25243711 0.94475841 0.78098459]
[0.40470538+0.j 0.73333579+0.j 0.25243711+0.j 0.94475841+0.j
0.78098459+0.j]
abs might be a different case because fabs exists. The one that causes issues for us is real
, because there is no freal
. I accept that it's odd to call real on a real value, but we do it...
Aah yep! Just tried the earlier reproducer with real
(and imag
), it fails. Found the MWE! Thanks!
Thanks for this, both!