/genfft

A code parser/generator to generate Go from annotated lists created with the FFTW tool genfft.

Primary LanguageGo

genfft

A tool to generate short, hard-coded DFT's in Go from annotated lists created with the FFTW tool genfft.

AGPL License

Install

Download and install the tool:

go get github.com/bemasher/genfft

Build the FFTW genfft tool found in the FFTW source:

cd fftw3/genfft
ocamlbuild -classic-display -libs unix,nums gen_notw.native gen_notw_c.native

Usage

Two flavors of DFT's are supported. Both compute short fixed-length complex-to-complex no-twiddle DFT's. One takes real and imaginary input as separate arrays, one set each for input and output. The other takes complex arrays for input and output. The transform signature is determined from the annotated list generated by each tool.

func DFT(ri, ii, ro, io []float64) // gen_notw.native
func DFT(xi, xo []complex128)      // gen_notw_c.native

Generate an annotated transform:

N=3; gen_notw_c.native -n ${N} -with-istride 1 -with-ostride 1 -standalone -dump-asched cmplx_${N}.alst cmplx_${N}.cout

This will generate and write the transform's operations and constants to the file cmplx_3.alst:

(:= T1 xi[0])
(:= T2 xi[1])
(:= T3 xi[2])
(:= T4 (+ T2 T3))
(:= T6 (* I (* KP866025403 (+ T3 (- T2)))))
(:= xo[0] (+ T1 T4))
(:= T5 (+ T1 (- (* KP500000000 T4))))
(:= xo[2] (+ T5 (- T6)))
(:= xo[1] (+ T5 T6))

Constants required for the transform are found in cmplx_3.cout, which is parsed only for lines prefixed by DK and DVK.

DVK(KP500000000, +0.500000000000000000000000000000000000000000000);
DVK(KP866025403, +0.866025403784438646763723170752936183471402627);

The tool reads information about DFT's to transform from config.json. To transform the size 3 complex DFT, config.json should contain:

[{ "prefix": "cmplx_3", "func": "DftCmplx3" }]

Executing genfft with this config produces the output file cmplx_3.go containing a function named DftCmplx3:

package dft

func DftCmplx3(xi, xo []complex128) {
	const (
		I           = 1i
		KP500000000 = +0.500000000000000000000000000000000000000000000
		KP866025403 = +0.866025403784438646763723170752936183471402627
	)

	T1 := xi[0]
	T2 := xi[1]
	T3 := xi[2]
	T4 := T2 + T3
	T6 := I * (KP866025403 * (T3 - T2))
	xo[0] = T1 + T4
	T5 := T1 - KP500000000*T4
	xo[2] = T5 - T6
	xo[1] = T5 + T6
}

Transforms safely perform in-place and out-of-place transforms depending on the function arguments.

Tests compare output of a naive DFT to output of the generated DFT's. DFT's pass when they are error is within a specified tolerance. See dft/dft_test.go for more details.

A benchmark is also provided to compare performance of the various transforms.

11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz + 32GB DDR4:

goos: windows
goarch: amd64
pkg: github.com/bemasher/genfft/dft
cpu: 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz
BenchmarkFloatDFT/Naive_DFT_N=12-12                       815853    1474    ns/op      8.14 MB/s    192 B/op     1 allocs/op
BenchmarkFloatDFT/In_Place_Float_DFT_N=12-12            70460990      17.29 ns/op    694.13 MB/s      0 B/op     0 allocs/op
BenchmarkFloatDFT/Out_of_Place_Float_DFT_N=12-12        70753047      17.00 ns/op    705.75 MB/s      0 B/op     0 allocs/op
BenchmarkCmplxDFT/Naive_Cmplx_DFT_N=12-12                 800121    1470    ns/op      8.16 MB/s    192 B/op     1 allocs/op
BenchmarkCmplxDFT/In_Place_Cmplx_DFT_N=12-12            59896080      19.69 ns/op    609.57 MB/s      0 B/op     0 allocs/op
BenchmarkCmplxDFT/Out_of_Cmplx_Place_DFT_N=12-12        60725056      19.08 ns/op    628.92 MB/s      0 B/op     0 allocs/op
PASS
ok      github.com/bemasher/genfft/dft  7.316s

Raspberry Pi 2 Model B Rev 1.1:

goos: linux
goarch: arm
pkg: github.com/bemasher/genfft/dft
BenchmarkFloatDFT/Naive_DFT_N=12-4         	                   15980    74300   ns/op     0.16 MB/s    192 B/op    1 allocs/op
BenchmarkFloatDFT/In_Place_Float_DFT_N=12-4                  1972201      606.0 ns/op    19.80 MB/s      0 B/op    0 allocs/op
BenchmarkFloatDFT/Out_of_Place_Float_DFT_N=12-4              1959711      611.5 ns/op    19.62 MB/s      0 B/op    0 allocs/op
BenchmarkCmplxDFT/Naive_Cmplx_DFT_N=12-4                       16048    74256   ns/op     0.16 MB/s    192 B/op    1 allocs/op
BenchmarkCmplxDFT/In_Place_Cmplx_DFT_N=12-4                  1490419      803.4 ns/op    14.94 MB/s      0 B/op    0 allocs/op
BenchmarkCmplxDFT/Out_of_Cmplx_Place_DFT_N=12-4              1476600      806.1 ns/op    14.89 MB/s      0 B/op    0 allocs/op
PASS
ok      github.com/bemasher/genfft/dft	11.639s