`--enable-generic-simd256` causes memory error on `fftw_plan_many_dft_r2c` and `fftw_plan_many_dft_c2r`
maxmarsc opened this issue · 0 comments
First of all I think this issue might be related to :
I compiled both `fftw` and `fftwf` 3.3.10 for x86_64, using GCC 9, with the following flags:
--enable-avx
--enable-avx2
--enable-avx512
--enable-avx-128-fma
--enable-generic-simd128
--enable-generic-simd256
The issue I identified only happens with `fftw` (not `fftwf`).
The code to reproduce the bug would be :
#include "fftw3.h"
#include <stdlib.h>
int main() {
  int fft_size = 256;
  int channels = 1;
  /* number of complex outputs of a real-to-complex transform */
  int transform_size = fft_size / 2 + 1;
  /* in-place buffer: room for transform_size complex values per channel */
  double* inplace_work_buffer = fftw_alloc_real(channels * transform_size * 2);
  int rank = 1; /* we are computing 1d transforms */
  int n[] = {fft_size}; /* 1d transforms of length fft_size */
  int howmany = channels; /* how many transforms to compute */
  int idist = transform_size * 2; /* input distance, in doubles */
  int odist = transform_size; /* output distance, in fftw_complex */
  int istride = 1;
  int ostride = 1;
  int* inembed = nullptr;
  int* onembed = nullptr;
  /* the memory error is reported during planning, inside this call */
  auto* plan = fftw_plan_many_dft_r2c(rank, n, howmany, inplace_work_buffer, inembed,
                                      istride, idist,
                                      reinterpret_cast<fftw_complex*>(inplace_work_buffer),
                                      onembed, ostride, odist, FFTW_MEASURE);
  fftw_destroy_plan(plan);
  fftw_free(inplace_work_buffer);
}
When running with ASan, here is the output it gives :
=================================================================
==1185224==ERROR: AddressSanitizer: unknown-crash on address 0x612000000430 at pc 0x5629290259c4 bp 0x7ffdc0967a50 sp 0x7ffdc0967a40
READ of size 32 at 0x612000000430 thread T0
#0 0x5629290259c3 in LDA /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60
#1 0x562929026df1 in n2fv_16 /foo/bar/build/source/stft/fftwf/src/fftwf/dft/simd/generic-simd256/../common/n2fv_16.c:284
#2 0x56292936dcd6 in apply_extra_iter /foo/bar/build/source/stft/fftwf/src/fftwf/dft/direct.c:111
#3 0x562927eef746 in fftw_dft_solve /foo/bar/build/source/stft/fftwf/src/fftwf/dft/solve.c:29
#4 0x562927edeb8c in measure /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:136
#5 0x562927eded07 in fftw_measure_execution_time /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:159
#6 0x562927ed9376 in evaluate_plan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:460
#7 0x562927ed9cd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#8 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#9 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#10 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#11 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
#12 0x562927eeb443 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/indirect.c:206
#13 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#14 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#15 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#16 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#17 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#18 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
#19 0x562929355373 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:199
#20 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#21 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#22 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#23 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#24 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#25 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
#26 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#27 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#28 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#29 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#30 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#31 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
#32 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
#33 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#34 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#35 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#36 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#37 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
#38 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
#39 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
#40 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
#41 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
#42 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
#43 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
#44 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
#45 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473
#46 0x56292687bfde in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:238
#47 0x562926828373 in execute src/catch2/catch_session.cpp:110
#48 0x5629268297b3 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
#49 0x5629268292cc in Catch::Session::run() src/catch2/catch_session.cpp:263
#50 0x5629268211e6 in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
#51 0x5629268210d4 in main src/catch2/internal/catch_main.cpp:36
#52 0x7fe9cf443082 in __libc_start_main ../csu/libc-start.c:308
#53 0x5629267f02bd in _start (/foo/bar/build/tests/libstft_tests+0x1db2bd)
0x612000000440 is located 0 bytes to the right of 256-byte region [0x612000000340,0x612000000440)
allocated by thread T0 here:
#0 0x7fe9cfa6b005 in __interceptor_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:169
#1 0x562927ed67ea in fftw_kernel_malloc /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/kalloc.c:91
#2 0x562927ed6548 in fftw_malloc_plain /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/alloc.c:28
#3 0x5629293550b9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:196
#4 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#5 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#6 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#7 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#8 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#9 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
#10 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#11 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#12 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#13 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#14 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
#15 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
#16 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
#17 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
#18 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
#19 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
#20 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
#21 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
#22 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
#23 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
#24 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
#25 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
#26 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
#27 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
#28 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
#29 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473
SUMMARY: AddressSanitizer: unknown-crash /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60 in LDA
Shadow bytes around the buggy address:
0x0c247fff8030: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c247fff8040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c247fff8050: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
0x0c247fff8060: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c247fff8070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c247fff8080: 00 00 00 00 00 00[00]00 fa fa fa fa fa fa fa fa
0x0c247fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c247fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c247fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c247fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c247fff80d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==1185224==ABORTING
And `valgrind --leak-check=full` gives me:
==1280516== Memcheck, a memory error detector
==1280516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1280516== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1280516== Command: ./build/tests/libstft_tests bug\ report
==1280516==
==1280516== Invalid read of size 8
==1280516== at 0x21B279A: LDA (simd-generic256.h:60)
==1280516== by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516== by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516== by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516== by 0x13B13B6: measure (timer.c:136)
==1280516== by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516== by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516== by 0x13AF4E3: search0 (planner.c:529)
==1280516== by 0x13AF695: search (planner.c:600)
==1280516== by 0x13AFAB3: mkplan (planner.c:711)
==1280516== by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516== by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516== Address 0x4fcc900 is 0 bytes after a block of size 256 alloc'd
==1280516== at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516== by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516== by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516== by 0x24858DC: mkplan (buffered.c:196)
==1280516== by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516== by 0x13AF45B: search0 (planner.c:529)
==1280516== by 0x13AF695: search (planner.c:600)
==1280516== by 0x13AFAB3: mkplan (planner.c:711)
==1280516== by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516== by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516== by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516== by 0x13AF45B: search0 (planner.c:529)
==1280516==
==1280516== Invalid read of size 8
==1280516== at 0x21B279E: LDA (simd-generic256.h:60)
==1280516== by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516== by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516== by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516== by 0x13B13B6: measure (timer.c:136)
==1280516== by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516== by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516== by 0x13AF4E3: search0 (planner.c:529)
==1280516== by 0x13AF695: search (planner.c:600)
==1280516== by 0x13AFAB3: mkplan (planner.c:711)
==1280516== by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516== by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516== Address 0x4fcc908 is 8 bytes after a block of size 256 alloc'd
==1280516== at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516== by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516== by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516== by 0x24858DC: mkplan (buffered.c:196)
==1280516== by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516== by 0x13AF45B: search0 (planner.c:529)
==1280516== by 0x13AF695: search (planner.c:600)
==1280516== by 0x13AFAB3: mkplan (planner.c:711)
==1280516== by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516== by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516== by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516== by 0x13AF45B: search0 (planner.c:529)
==1280516==
==1280516==
==1280516== HEAP SUMMARY:
==1280516== in use at exit: 226,376 bytes in 2,457 blocks
==1280516== total heap usage: 58,871 allocs, 56,414 frees, 34,196,978 bytes allocated
==1280516==
==1280516== LEAK SUMMARY:
==1280516== definitely lost: 0 bytes in 0 blocks
==1280516== indirectly lost: 0 bytes in 0 blocks
==1280516== possibly lost: 0 bytes in 0 blocks
==1280516== still reachable: 226,376 bytes in 2,457 blocks
==1280516== suppressed: 0 bytes in 0 blocks
==1280516== Reachable blocks (those to which a pointer was found) are not shown.
==1280516== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1280516==
==1280516== For lists of detected and suppressed errors, rerun with: -s
==1280516== ERROR SUMMARY: 16 errors from 2 contexts (suppressed: 0 from 0)
Note: you can see in the stack trace that I'm using Catch2 rather than having the code inside a `main` function, but using a `main` function reproduces the issue as well.
Some more details I gathered:
- The same happens with the inverse transform, `fftw_plan_many_dft_c2r`, with the same setup (simply swapping the `odist` and `idist` values).
- Rebuilding without the `--enable-generic-simd256` flag removes the issue.
- With `fft_size` values of 32, 64, or 128, the bug does not appear.