Overview

Stage-based FFT APIs

The AIE API offers a stage-based interface for carrying out decimation-in-time FFTs. For example, assuming twiddle pointer visibility (see Twiddle Generation below), a 1024 point FFT can be computed as follows:

void fft_1024pt(const cint16 * __restrict x,  // Input pointer
                unsigned shift_tw,            // Indicates the decimal point of the twiddles
                                              // e.g. The twiddle 1.0+0.0i can be represented with cint16(32767, 0) and a shift_tw of 15
                unsigned shift,               // Shift applied to the DIT outputs
                bool inv,                     // Run inverse FFT
                cint16 * __restrict tmp,      // Scratch space for intermediate results
                cint16 * __restrict y         // Output pointer
               )
{
    aie::fft_dit_r2_stage<512>(x,   tw1,   1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<256>(tmp, tw2,   1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<128>(y,   tw4,   1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<64> (tmp, tw8,   1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<32> (y,   tw16,  1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<16> (tmp, tw32,  1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<8>  (y,   tw64,  1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<4>  (tmp, tw128, 1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<2>  (y,   tw256, 1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<1>  (tmp, tw512, 1024, shift_tw, shift, inv, y);
}

Similarly, a 512 point FFT can be implemented, using a mix of radix-2 and radix-4 stages, as follows:

void fft_512pt(const cint16 * __restrict x,  // Input pointer
               unsigned shift_tw,            // Indicates the decimal point of the twiddles
                                             // e.g. The twiddle 1.0+0.0i can be represented with cint16(32767, 0) and a shift_tw of 15
               unsigned shift,               // Shift applied to the DIT outputs
               bool inv,                     // Run inverse FFT
               cint16 * __restrict tmp,      // Scratch space for intermediate results
               cint16 * __restrict y         // Output pointer
               )
{
    aie::fft_dit_r2_stage<256>(x,   tw1,                     512, shift_tw, shift, inv, y);
    aie::fft_dit_r4_stage<64> (y,   tw2,   tw4,   tw2_4,     512, shift_tw, shift, inv, tmp);
    aie::fft_dit_r4_stage<16> (tmp, tw8,   tw16,  tw8_16,    512, shift_tw, shift, inv, y);
    aie::fft_dit_r4_stage<4>  (y,   tw32,  tw64,  tw32_64,   512, shift_tw, shift, inv, tmp);
    aie::fft_dit_r4_stage<1>  (tmp, tw128, tw256, tw128_256, 512, shift_tw, shift, inv, y);
}

Note: For an odd number of stages, the input buffer may be used in place of tmp, which can be beneficial for large FFTs.

Note: The order of the twiddle arguments is outlined in the description of each FFT stage function: aie::fft_dit_r2_stage, aie::fft_dit_r3_stage, aie::fft_dit_r4_stage, aie::fft_dit_r5_stage
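
The Vectorization arguments in the two examples above follow a simple pattern: each stage uses the point size divided by the product of the radices applied so far. The sketch below reproduces that pattern in standard C++. The helpers `stage_vectorizations` and `stage_writes_to_y` are hypothetical names introduced for illustration only and are not part of the AIE API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an AIE API function): for an n-point FFT split into
// the given sequence of radices, return the Vectorization value of each stage.
std::vector<unsigned> stage_vectorizations(unsigned n, const std::vector<unsigned>& radices) {
    std::vector<unsigned> vecs;
    unsigned done = 1;             // product of the radices consumed so far
    for (unsigned r : radices) {
        done *= r;
        vecs.push_back(n / done);  // e.g. 512 / 2 = 256 for the first stage
    }
    return vecs;
}

// Hypothetical helper: true if stage `index` (0-based) out of `count` stages
// writes to y when the stages alternate between tmp and y and the last stage
// must land in y.
bool stage_writes_to_y(std::size_t index, std::size_t count) {
    return (count - index) % 2 == 1;
}
```

For the 512-point split {2, 4, 4, 4, 4} this yields 256, 64, 16, 4, 1, and the first of the five stages writes to y, as in the example above. With ten radix-2 stages the first stage writes to tmp instead.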

Twiddle Generation

An R-Radix, N-point FFT requires R-1 twiddle tables per stage.

Each table has length (n_stage / R), where n_stage is the local number of samples of the current radix stage. The local number of samples is the total point size, N, divided by the Vectorization, which is the template parameter of the fft_dit_r*_stage function calls. This is because the earlier stages of an N-point FFT are smaller, batched FFTs.
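
As a worked instance of the length rule above, the following sketch evaluates it for the stages of the 512-point example (r2<256>, r4<64>, r4<16>, r4<4>, r4<1>). The function `twiddle_table_length` is a hypothetical helper written for this illustration, not an AIE API function.

```cpp
// Hypothetical helper (not part of the AIE API): length of each twiddle table
// for one stage of an n-point FFT.
constexpr unsigned twiddle_table_length(unsigned n, unsigned vectorization, unsigned radix) {
    const unsigned n_stage = n / vectorization;  // local number of samples in this stage
    return n_stage / radix;                      // entries per table
}

// Stages of the 512-point example: one table for radix 2, three for radix 4.
static_assert(twiddle_table_length(512, 256, 2) == 1);
static_assert(twiddle_table_length(512, 64, 4) == 2);
static_assert(twiddle_table_length(512, 16, 4) == 8);
static_assert(twiddle_table_length(512, 4, 4) == 32);
static_assert(twiddle_table_length(512, 1, 4) == 128);
```

The checks run at compile time, so the snippet documents the arithmetic without needing any device code.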

For each stage, the twiddle tables can be computed, in floating point, as:

int n_stage = N / Vectorization;  // Local number of samples in this stage
int n_tws   = n_stage / Radix;    // Number of entries in each twiddle table
for (unsigned r = 1; r < Radix; ++r) {
    for (unsigned i = 0; i < n_tws; ++i) {
        tw[r-1][i] = exp(-2j * pi * r * i / n_stage);
    }
}

The equivalent Python code is:

import numpy as np

def tw(n, radix, vec):
    n_stage = n // vec          # Local number of samples in this stage
    points = n_stage // radix   # Number of entries in each twiddle table
    return np.exp(-2j * np.pi * np.arange(1, radix).reshape(-1, 1) * np.arange(points) / n_stage)
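
For readers who prefer to generate the floating point tables on the host in C++, the same formula can be sketched with the standard library alone. The function name `make_stage_twiddles` and the use of `std::complex<double>` are assumptions made for this illustration; the tables are returned in order of increasing rotation multiple r = 1 .. Radix-1, which is not necessarily the argument order expected by every stage function.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Illustrative helper (not an AIE API function): returns Radix-1 floating point
// twiddle tables for one stage, with tables[r-1][i] = exp(-2*pi*j * r * i / n_stage).
std::vector<std::vector<std::complex<double>>>
make_stage_twiddles(unsigned n, unsigned radix, unsigned vectorization) {
    const double pi = std::acos(-1.0);
    const unsigned n_stage = n / vectorization;  // local number of samples
    const unsigned n_tws   = n_stage / radix;    // entries per table
    std::vector<std::vector<std::complex<double>>> tables(radix - 1);
    for (unsigned r = 1; r < radix; ++r) {
        for (unsigned i = 0; i < n_tws; ++i) {
            const double angle = -2.0 * pi * r * i / n_stage;
            tables[r - 1].push_back(std::polar(1.0, angle));  // unit-magnitude rotation
        }
    }
    return tables;
}
```

Calling `make_stage_twiddles(128, 4, 16)` gives three tables of two entries each; the second entry of the r = 2 table is approximately 0 - 1j.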

For fixed point implementations, the twiddle values should be multiplied by (1 << shift_tw) before converting to the output type. For example,

template <typename TR, typename T>
TR convert_twiddle_to_fixed_point(T val, unsigned shift_tw) {
    return aie::to_fixed<TR>(val * (1 << shift_tw));
}
 
// Required to prevent overflow on conversions
aie::set_rounding(aie::rounding_mode::positive_inf);
aie::set_saturation(aie::saturation_mode::saturate);
 
unsigned shift_tw = 15;
cfloat tw = cfloat(1.0f, 0.0f);
cint16 tw_fixed = convert_twiddle_to_fixed_point<cint16>(tw, shift_tw);
// tw_fixed = 32767 + 0j
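
The effect of the scaling, rounding, and saturation steps can be reproduced on a host machine without the AIE API. The sketch below is a stand-in for one real or imaginary component only. The name `to_fixed16` is hypothetical, and the rounding (round to nearest, ties toward positive infinity) is an assumption about what `aie::rounding_mode::positive_inf` does; tables generated with a different rounding mode can differ by one LSB.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical host-side stand-in (not an AIE API function) for converting one
// twiddle component to a 16-bit fixed point value with shift_tw fractional bits.
std::int16_t to_fixed16(double val, unsigned shift_tw) {
    // Scale by 2^shift_tw, then round to nearest with ties toward +infinity.
    const double scaled = std::floor(val * static_cast<double>(1u << shift_tw) + 0.5);
    // Saturate to the int16 range so that 1.0 maps to 32767 instead of wrapping.
    const double clamped = std::clamp(scaled, -32768.0, 32767.0);
    return static_cast<std::int16_t>(clamped);
}
```

With shift_tw = 15, an input of 1.0 scales to 32768 and saturates to 32767, while -1.0 maps exactly to -32768. This is why saturation needs to be enabled before converting twiddles with unit magnitude.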

Full FFT Example

Using the method of generating twiddles outlined in Twiddle Generation, a 128-point FFT can be computed as follows:

Note: The twiddle arguments of the radix 4 stages are not passed in order of increasing rotation rate. Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is w(tw1) < w(tw0) < w(tw2). This is not the case for the other radix stages, where the rotation rate increases monotonically with the argument index, i.e. w(tw0) < w(tw1) < ... < w(twN).

constexpr unsigned        n = 128;
constexpr unsigned shift_tw = 15;
constexpr unsigned    shift = 15;
constexpr bool          inv = false;
 
alignas(aie::vector_decl_align) static cint32 x[n];
alignas(aie::vector_decl_align) static cint32 tmp[n];
alignas(aie::vector_decl_align) static cint32 y[n];
 
alignas(aie::vector_decl_align) static cint16 tw1    [] = {{ 32767,      0}};
alignas(aie::vector_decl_align) static cint16 tw2    [] = {{ 32767,      0}, {     0, -32768}};
alignas(aie::vector_decl_align) static cint16 tw4    [] = {{ 32767,      0}, { 23170, -23170}};
alignas(aie::vector_decl_align) static cint16 tw2_4  [] = {{ 32767,      0}, {-23170, -23170}};
alignas(aie::vector_decl_align) static cint16 tw8    [] = {{ 32767,      0}, { 30273, -12539}, { 23170, -23170}, { 12539, -30273},
                                                           {     0, -32768}, {-12539, -30273}, {-23170, -23170}, {-30273, -12539}};
alignas(aie::vector_decl_align) static cint16 tw16   [] = {{ 32767,      0}, { 32138,  -6392}, { 30273, -12539}, { 27245, -18204},
                                                           { 23170, -23170}, { 18204, -27245}, { 12539, -30273}, {  6392, -32138}};
alignas(aie::vector_decl_align) static cint16 tw8_16 [] = {{ 32767,      0}, { 27245, -18204}, { 12539, -30273}, { -6392, -32138},
                                                           {-23170, -23170}, {-32138,  -6392}, {-30273,  12539}, {-18204,  27245}};
alignas(aie::vector_decl_align) static cint16 tw32   [] = {{ 32767,      0}, { 32610,  -3211}, { 32138,  -6392}, { 31357,  -9512},
                                                           { 30273, -12539}, { 28898, -15446}, { 27245, -18204}, { 25330, -20787},
                                                           { 23170, -23170}, { 20787, -25330}, { 18204, -27245}, { 15446, -28898},
                                                           { 12539, -30273}, {  9512, -31357}, {  6392, -32138}, {  3211, -32610},
                                                           {     0, -32768}, { -3211, -32610}, { -6392, -32138}, { -9512, -31357},
                                                           {-12539, -30273}, {-15446, -28898}, {-18204, -27245}, {-20787, -25330},
                                                           {-23170, -23170}, {-25330, -20787}, {-27245, -18204}, {-28898, -15446},
                                                           {-30273, -12539}, {-31357,  -9512}, {-32138,  -6392}, {-32610,  -3211}};
alignas(aie::vector_decl_align) static cint16 tw64   [] = {{ 32767,      0}, { 32728,  -1607}, { 32610,  -3211}, { 32413,  -4808},
                                                           { 32138,  -6392}, { 31785,  -7961}, { 31357,  -9512}, { 30852, -11039},
                                                           { 30273, -12539}, { 29621, -14010}, { 28898, -15446}, { 28106, -16846},
                                                           { 27245, -18204}, { 26319, -19519}, { 25330, -20787}, { 24279, -22005},
                                                           { 23170, -23170}, { 22005, -24279}, { 20787, -25330}, { 19519, -26319},
                                                           { 18204, -27245}, { 16846, -28106}, { 15446, -28898}, { 14010, -29621},
                                                           { 12539, -30273}, { 11039, -30852}, {  9512, -31357}, {  7961, -31785},
                                                           {  6392, -32138}, {  4808, -32413}, {  3211, -32610}, {  1607, -32728}};
alignas(aie::vector_decl_align) static cint16 tw32_64[] = {{ 32767,      0}, { 32413,  -4808}, { 31357,  -9512}, { 29621, -14010},
                                                           { 27245, -18204}, { 24279, -22005}, { 20787, -25330}, { 16846, -28106},
                                                           { 12539, -30273}, {  7961, -31785}, {  3211, -32610}, { -1607, -32728},
                                                           { -6392, -32138}, {-11039, -30852}, {-15446, -28898}, {-19519, -26319},
                                                           {-23170, -23170}, {-26319, -19519}, {-28898, -15446}, {-30852, -11039},
                                                           {-32138,  -6392}, {-32728,  -1607}, {-32610,   3211}, {-31785,   7961},
                                                           {-30273,  12539}, {-28106,  16846}, {-25330,  20787}, {-22005,  24279},
                                                           {-18204,  27245}, {-14010,  29621}, { -9512,  31357}, { -4808,  32413}};
 
// Constant value input
for (unsigned i = 0; i < n; ++i) {
    x[i] = cint32(1, 0);
}
 
aie::fft_dit_r2_stage<64>(x,   tw1,                   n, shift_tw, shift, inv, tmp);
aie::fft_dit_r4_stage<16>(tmp, tw2,   tw4,   tw2_4,   n, shift_tw, shift, inv, y);
aie::fft_dit_r4_stage<4> (y,   tw8,   tw16,  tw8_16,  n, shift_tw, shift, inv, tmp);
aie::fft_dit_r4_stage<1> (tmp, tw32,  tw64,  tw32_64, n, shift_tw, shift, inv, y);
 
for (unsigned i = 0; i < n; ++i) {
    printf("{%d, %d} ", y[i].real, y[i].imag);
}
// Will print:
// {128, 0} {0, 0} {0, 0} ... {0, 0}
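
The radix 4 argument order used in this example (tw2 before tw4, followed by tw2_4) corresponds to rotation multiples 2, 1, 3 of the generation formula. The sketch below builds the three floating point tables in that order using only the standard library. The names `make_table` and `radix4_argument_order` are hypothetical and introduced here for illustration; conversion to fixed point is left out.

```cpp
#include <array>
#include <cmath>
#include <complex>
#include <vector>

using table = std::vector<std::complex<double>>;

// Table for rotation multiple r of one stage: exp(-2*pi*j * r * i / n_stage).
table make_table(unsigned n_stage, unsigned radix, unsigned r) {
    const double pi = std::acos(-1.0);
    table t;
    for (unsigned i = 0; i < n_stage / radix; ++i) {
        t.push_back(std::polar(1.0, -2.0 * pi * r * i / n_stage));
    }
    return t;
}

// Hypothetical helper (not an AIE API function): the three radix 4 tables in
// the argument order {tw0, tw1, tw2}, i.e. rotation multiples {2, 1, 3}, so
// that w(tw1) < w(tw0) < w(tw2).
std::array<table, 3> radix4_argument_order(unsigned n, unsigned vectorization) {
    const unsigned n_stage = n / vectorization;
    return {{make_table(n_stage, 4, 2), make_table(n_stage, 4, 1), make_table(n_stage, 4, 3)}};
}
```

For n = 128 and a vectorization of 16, the second entries of the three tables are approximately 0 - 1j, 0.707 - 0.707j, and -0.707 - 0.707j, which match the second entries of tw2, tw4, and tw2_4 above once scaled by 2^15.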

Typedefs
template<unsigned Vectorization, unsigned Radix, typename Input , typename Output = Input, typename Twiddle = detail::default_twiddle_type_t<Input, Output>>
using	aie::fft_dit = detail::fft_dit< Vectorization, detail::fft_get_stage< Input, Output, Twiddle >(Radix, Vectorization), Radix, Input, Output, Twiddle >
	Type that encapsulates the functionality for decimation-in-time FFTs.

Functions
template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE, arch::AIE_ML) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r2_stage (const Input *__restrict x, const Twiddle *__restrict tw, unsigned n, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 2 FFT stage.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::Gen1, arch::Gen2))
void	aie::fft_dit_r2_stage (const Input *__restrict x, const Twiddle *__restrict tw, unsigned n, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 2 FFT stage.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r2_stage (const Input *__restrict x, const Twiddle *__restrict tw, unsigned n, unsigned vectorization, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 2 FFT stage with dynamic vectorization.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE))
void	aie::fft_dit_r2_stage (const Input *__restrict x, const Twiddle *__restrict tw, unsigned n, unsigned vectorization, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 2 FFT stage with dynamic vectorization.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r3_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, unsigned n, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 3 FFT stage.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE, arch::AIE_ML))
void	aie::fft_dit_r3_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, unsigned n, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 3 FFT stage.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r3_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, unsigned n, unsigned vectorization, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 3 FFT stage with dynamic vectorization.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE))
void	aie::fft_dit_r3_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, unsigned n, unsigned vectorization, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 3 FFT stage with dynamic vectorization.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::Gen1, arch::Gen2))
void	aie::fft_dit_r4_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, unsigned n, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 4 FFT stage.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE))
void	aie::fft_dit_r4_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, unsigned n, unsigned vectorization, unsigned shift_tw, unsigned shift, bool inv, Output *__restrict out)
	A function to perform a single radix 4 FFT stage with dynamic vectorization.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r5_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, const Twiddle *__restrict tw3, unsigned n, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 5 FFT stage.

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE, arch::AIE_ML))
void	aie::fft_dit_r5_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, const Twiddle *__restrict tw3, unsigned n, unsigned shift_tw, unsigned shift, bool inv, Output *out)
	A function to perform a single radix 5 FFT stage.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)
void	aie::fft_dit_r5_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, const Twiddle *__restrict tw3, unsigned n, unsigned vectorization, bool inv, Output *__restrict out)
	A function to perform a single floating point radix 5 FFT stage with dynamic vectorization.

template<typename Input , typename Output , typename Twiddle > requires (arch::is(arch::AIE))
void	aie::fft_dit_r5_stage (const Input *__restrict x, const Twiddle *__restrict tw0, const Twiddle *__restrict tw1, const Twiddle *__restrict tw2, const Twiddle *__restrict tw3, unsigned n, unsigned vectorization, unsigned shift_tw, unsigned shift, bool inv, Output *out)
	A function to perform a single radix 5 FFT stage with dynamic vectorization.

Supported Fast Fourier Transform Modes

Supported FFT/IFFT Modes
Input Type	Output Type	Twiddle Type	AIE Supported Radices	AIE-ML/XDNA 1 Supported Radices
c16b	c16b	c16b	2, 3, 4, 5	2, 3, 4, 5
c16b	c32b	c16b	2, 3, 4, 5	2, 3, 4, 5
c32b	c16b	c16b	2, 3, 4, 5	2, 3, 4, 5
c32b	c32b	c16b	2, 3, 4, 5	2, 3, 4, 5
c16b	c32b	c32b	2
c32b	c16b	c32b	2
c32b	c32b	c32b	2, 3, 4, 5
cbfloat16	cbfloat16	cbfloat16		2, 4
cfloat	cfloat	cfloat	2, 3, 5

Note

Odd-radix FFT stages are only available for vectorization values greater than or equal to the underlying output vector sizes.

Underlying output vector sizes
Input Type	Output Type	Twiddle Type	AIE Output Vector Size	AIE-ML/XDNA 1 Output Vector Size
c16b	c16b	c16b	4 (8 for radix 2)	8
c16b	c32b	c16b	4 (8 for radix 2)	8
c32b	c16b	c16b	4	8
c32b	c32b	c16b	4	8
c16b	c32b	c32b	4
c32b	c16b	c32b	4
c32b	c32b	c32b	2
cbfloat16	cbfloat16	cbfloat16		8
cfloat	cfloat	cfloat	4

The minimum point size supported by an FFT is given by the product of the radix with the underlying output vector size. This is due to the fact that a radix R FFT will use R output pointers, each writing an amount of data equal to the underlying output vector size. For example, a radix 4 FFT with cint16 input data, cint32 output data, and cint16 twiddles will have a minimum point size of 16 (4 * 4) on AIE while the minimum point size on AIE-ML/XDNA 1 will be 32 (4 * 8).
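
The rule above is a single multiplication, which the following compile-time sketch makes explicit. The function `min_point_size` is a hypothetical helper for illustration, not part of the AIE API, and the vector sizes passed to it are taken from the table above.

```cpp
// Hypothetical helper (not an AIE API function): minimum supported point size
// for a stage of the given radix and underlying output vector size.
constexpr unsigned min_point_size(unsigned radix, unsigned out_vector_size) {
    return radix * out_vector_size;  // R output pointers, one output vector each
}

// Radix 4 with cint16 input, cint32 output, and cint16 twiddles.
static_assert(min_point_size(4, 4) == 16, "AIE: output vector size 4");
static_assert(min_point_size(4, 8) == 32, "AIE-ML/XDNA 1: output vector size 8");
```

The same helper gives 16 for a radix 2 stage with c16b input and c16b twiddles on AIE, where the table lists an output vector size of 8.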

Typedef Documentation

◆ fft_dit

template<unsigned Vectorization, unsigned Radix, typename Input , typename Output = Input, typename Twiddle = detail::default_twiddle_type_t<Input, Output>>

using aie::fft_dit = detail::fft_dit<Vectorization, detail::fft_get_stage<Input, Output, Twiddle>(Radix, Vectorization), Radix, Input, Output, Twiddle>

Type that encapsulates the functionality for decimation-in-time FFTs.

Deprecated: The iterator interface is deprecated and the stage-based interface should be preferred. For example, where a user would previously need to define an FFT stage as follows:

template <unsigned Vectorization>
void radix2_dit(const cint32 * __restrict x,
                const cint16 * __restrict tw,
                unsigned n,  unsigned shift_tw, unsigned shift, bool inv,
                cint32 * __restrict y)
{
    using FFT = aie::fft_dit<Vectorization, 2, cint32>;
 
    FFT fft;
 
    auto it_stage  = fft.begin_stage(x, tw);
    auto it_out0 = aie::begin_vector<FFT::out_vector_size>(y);
    auto it_out1 = aie::begin_vector<FFT::out_vector_size>(y + n / 2);
 
    for (int j = 0; j < n / (2 * FFT::out_vector_size); ++j)
        chess_prepare_for_pipelining
        chess_loop_range(1,)
    {
        const auto out = fft.dit(*it_stage++, shift_tw, shift, inv);
        *it_out0++ = out[0];
        *it_out1++ = out[1];
    }
}

the user may now replace all calls to the user-defined function with the equivalent defined by the API:

aie::fft_dit_r2_stage<Vectorization>(x, tw, n, shift_tw, shift, inv, y);

Template Parameters

Vectorization	Vectorization of the FFT stage
Radix	Number which selects the FFT radix.
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

See also: fft_dit_r2_stage, fft_dit_r3_stage, fft_dit_r4_stage, fft_dit_r5_stage

Function Documentation

◆ fft_dit_r2_stage() [1/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE, arch::AIE_ML) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r2_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw,
		unsigned	n,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 2 FFT stage.

Parameters

x	Input data pointer
tw	Twiddle group pointer
n	Number of samples
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r2_stage() [2/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::Gen1, arch::Gen2))

void aie::fft_dit_r2_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw,
		unsigned	n,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 2 FFT stage.

Parameters

x	Input data pointer
tw	Twiddle group pointer
n	Number of samples
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r2_stage() [3/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r2_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw,
		unsigned	n,
		unsigned	vectorization,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 2 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Parameters

x	Input data pointer
tw	Twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r2_stage() [4/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE))

void aie::fft_dit_r2_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw,
		unsigned	n,
		unsigned	vectorization,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 2 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Parameters

x	Input data pointer
tw	Twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r3_stage() [1/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r3_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		unsigned	n,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 3 FFT stage.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw0) < w(tw1)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
n	Number of samples
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r3_stage() [2/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE, arch::AIE_ML))

void aie::fft_dit_r3_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		unsigned	n,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 3 FFT stage.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw0) < w(tw1)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
n	Number of samples
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r3_stage() [3/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r3_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		unsigned	n,
		unsigned	vectorization,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 3 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw0) < w(tw1)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r3_stage() [4/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE))

void aie::fft_dit_r3_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		unsigned	n,
		unsigned	vectorization,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 3 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw0) < w(tw1)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r4_stage() [1/2]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::Gen1, arch::Gen2))

void aie::fft_dit_r4_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		unsigned	n,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 4 FFT stage.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw1) < w(tw0) < w(tw2)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
n	Number of samples
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r4_stage() [2/2]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE))

void aie::fft_dit_r4_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		unsigned	n,
		unsigned	vectorization,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single radix 4 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw1) < w(tw0) < w(tw2)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r5_stage() [1/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r5_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		const Twiddle *__restrict	tw3,
		unsigned	n,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 5 FFT stage.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is

w(tw0) < w(tw1) < w(tw2) < w(tw3)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
tw3	Fourth twiddle group pointer
n	Number of samples
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.

◆ fft_dit_r5_stage() [2/4]

template<unsigned Vectorization, typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE, arch::AIE_ML))

void aie::fft_dit_r5_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		const Twiddle *__restrict	tw3,
		unsigned	n,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *	out
	)

A function to perform a single radix 5 FFT stage.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is:

w(tw0) < w(tw1) < w(tw2) < w(tw3)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
tw3	Fourth twiddle group pointer
n	Number of samples
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Vectorization	Vectorization of the FFT stage
Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.
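
To make the shift_tw parameter concrete, the sketch below quantizes a unit-circle twiddle to 16-bit fixed point and multiplies a sample by it. It is a host-side illustration in plain C++, assuming conventional fixed-point arithmetic with round-half-up and saturation; the rounding and saturation behaviour of the actual hardware may differ. The names cint16_ref, saturate16, make_twiddle, round_shift and mul_twiddle are hypothetical and are not part of the AIE API.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a complex 16-bit integer sample.
struct cint16_ref {
    std::int16_t real;
    std::int16_t imag;
};

// Clamp a wide value to the int16 range.
inline std::int16_t saturate16(std::int64_t v)
{
    if (v > 32767)  return 32767;
    if (v < -32768) return -32768;
    return static_cast<std::int16_t>(v);
}

// Quantize exp(-2*pi*i*k/n) with shift_tw fractional bits.
// With shift_tw = 15, the value 1.0 saturates to 32767.
inline cint16_ref make_twiddle(unsigned k, unsigned n, unsigned shift_tw)
{
    const double pi    = 3.14159265358979323846;
    const double angle = -2.0 * pi * double(k) / double(n);
    const double scale = double(std::int64_t(1) << shift_tw);
    return { saturate16(std::llround(std::cos(angle) * scale)),
             saturate16(std::llround(std::sin(angle) * scale)) };
}

// Right shift with round-half-up (assumed rounding mode). Relies on an
// arithmetic right shift for negative values, which C++20 guarantees.
inline std::int64_t round_shift(std::int64_t v, unsigned s)
{
    if (s == 0) return v;
    return (v + (std::int64_t(1) << (s - 1))) >> s;
}

// Complex multiply in a wide accumulator, then remove the twiddle scaling.
inline cint16_ref mul_twiddle(cint16_ref a, cint16_ref tw, unsigned shift_tw)
{
    const std::int64_t re = std::int64_t(a.real) * tw.real - std::int64_t(a.imag) * tw.imag;
    const std::int64_t im = std::int64_t(a.real) * tw.imag + std::int64_t(a.imag) * tw.real;
    return { saturate16(round_shift(re, shift_tw)),
             saturate16(round_shift(im, shift_tw)) };
}
```

For example, make_twiddle(0, 1024, 15) yields the pair (32767, 0), matching the representation of 1.0+0.0i mentioned in the overview, and multiplying a sample by it returns the sample unchanged after rounding.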

◆ fft_dit_r5_stage() [3/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE) && detail::is_floating_point_v<Input>)

void aie::fft_dit_r5_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		const Twiddle *__restrict	tw3,
		unsigned	n,
		unsigned	vectorization,
		bool	inv,
		Output *__restrict	out
	)

A function to perform a single floating point radix 5 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is:

w(tw0) < w(tw1) < w(tw2) < w(tw3)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
tw3	Fourth twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.
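
When the vectorization is only known at run time but comes from a small set of values, one way to avoid the dynamic overhead is to dispatch to static instantiations and fall back to the dynamic overload otherwise. The sketch below shows that pattern in plain C++ with stand-in functions. run_static_stage, run_dynamic_stage and run_stage are hypothetical names, the stand-ins only report which path was taken, and the values 25, 5 and 1 are assumed by analogy with the radix-2 example in the overview (as would suit a 125 point transform).

```cpp
// Hypothetical stand-in for a call with compile-time vectorization, such as
// aie::fft_dit_r5_stage<Vectorization>(...). Returns the value it was built for.
template <unsigned Vectorization>
unsigned run_static_stage()
{
    return Vectorization;
}

// Hypothetical stand-in for the dynamic-vectorization overload.
// Returns 0 to mark that the generic path was taken.
inline unsigned run_dynamic_stage(unsigned /*vectorization*/)
{
    return 0;
}

// Map a runtime vectorization onto a static instantiation where one exists.
inline unsigned run_stage(unsigned vectorization)
{
    switch (vectorization) {
        case 25: return run_static_stage<25>();
        case 5:  return run_static_stage<5>();
        case 1:  return run_static_stage<1>();
        default: return run_dynamic_stage(vectorization);
    }
}
```

Whether the dispatch pays off depends on code size and on how often the fallback is taken, so it is worth measuring on the target before adopting it.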

◆ fft_dit_r5_stage() [4/4]

template<typename Input , typename Output , typename Twiddle >
requires (arch::is(arch::AIE))

void aie::fft_dit_r5_stage	(	const Input *__restrict	x,
		const Twiddle *__restrict	tw0,
		const Twiddle *__restrict	tw1,
		const Twiddle *__restrict	tw2,
		const Twiddle *__restrict	tw3,
		unsigned	n,
		unsigned	vectorization,
		unsigned	shift_tw,
		unsigned	shift,
		bool	inv,
		Output *	out
	)

A function to perform a single radix 5 FFT stage with dynamic vectorization.

This will incur additional overhead compared to the static vectorization implementation.

Defining the rotation rate of a given twiddle to be w(tw), the relationship between the twiddle groups is:

w(tw0) < w(tw1) < w(tw2) < w(tw3)

Parameters

x	Input data pointer
tw0	First twiddle group pointer
tw1	Second twiddle group pointer
tw2	Third twiddle group pointer
tw3	Fourth twiddle group pointer
n	Number of samples
vectorization	Vectorization of the FFT stage
shift_tw	Indicates the decimal point of the twiddles (unused for float types)
shift	Shift applied to the DIT outputs (unused for float types)
inv	Run inverse FFT stage
out	Output data pointer

Template Parameters

Input	Type of the input elements.
Output	Type of the output elements, defaults to input type.
Twiddle	Type of the twiddle elements, defaults to cint16 for integral types and cfloat for floating point.
