the most common fast Fourier transform (FFT) algorithm
Cooley–Tukey re-expresses the discrete Fourier transform (DFT) of an arbitrary composite size $N = N_1N_2$ in terms of smaller DFTs of sizes $N_1$ and $N_2$, recursively, in order to reduce the computation time to $O(N log N)$ for highly composite N (smooth numbers).
####The radix-2 DIT case
A radix-2 decimation-in-time (DIT) FFT is the simplest and most common form of the Cooley–Tukey algorithm, although highly optimized Cooley–Tukey implementations typically use other forms of the algorithm as described below. Radix-2 DIT divides a DFT of size N into two interleaved DFTs (hence the name "radix-2") of size $N/2$ with each recursive stage.
The discrete Fourier transform (DFT) is defined by the formula:
Radix-2 DIT first computes the DFTs of the even-indexed inputs $(x_{2m}=x_0, x_2, \ldots, x_{N-2})$ and of the odd-indexed inputs $(x_{2m+1}=x_1, x_3, \ldots, x_{N-1})$, and then combines those two results to produce the DFT of the whole sequence. This idea can then be performed recursively to reduce the overall runtime to $O(N log N)$. This simplified form assumes that N is a power of two; since the number of sample points $N$ can usually be chosen freely by the application, this is often not an important restriction.
The Radix-2 DIT algorithm rearranges the DFT of the function $x_n$ into two parts: a sum over the even-numbered indices $n={2m}$ and a sum over the odd-numbered indices $n={2m+1}$:
One can factor a common multiplier $e^{-\frac{2\pi i}{N}k}$ out of the second sum, as shown in the equation below. It is then clear that the two sums are the DFT of the even-indexed part $x_{2m}$ and the DFT of odd-indexed part $x_{2m+1}$ of the function $x_n$. Denote the DFT of the Even-indexed inputs $x_{2m}$ by $E_k$ and the DFT of the Odd-indexed inputs $x_{2m + 1}$ by $O_k$ and we obtain:
This result, expressing the DFT of length $N$ recursively in terms of two DFTs of size $N/2$, is the core of the radix-2 DIT fast Fourier transform. The algorithm gains its speed by re-using the results of intermediate computations to compute multiple DFT outputs. Note that final outputs are obtained by a +/− combination of $E_k$ and $O_k \exp(-2\pi i k/N)$, which is simply a size-2 DFT (sometimes called a butterfly in this context); when this is generalized to larger radices below, the size-2 DFT is replaced by a larger DFT (which itself can be evaluated with an FFT).
This process is an example of the general technique of divide and conquer algorithms; in many traditional implementations, however, the explicit recursion is avoided, and instead one traverses the computational tree in breadth-first fashion.