AI Engine API User Guide (AIE) 2022.1
AIE provides hardware support for special multiplication patterns that accelerate specific application use cases such as (but not limited to) signal processing.
Typedefs | |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_sym_x_ops = sliding_mul_sym_ops< Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag > |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepXY, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_sym_xy_ops = sliding_mul_sym_ops< Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag > |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepY, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_sym_y_ops = sliding_mul_sym_ops< Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag > |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_x_ops = sliding_mul_ops< Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag > |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepXY, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_xy_ops = sliding_mul_ops< Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag > |
template<unsigned Lanes, unsigned Points, int CoeffStep, int DataStepY, ElemBaseType CoeffType, ElemBaseType DataType, AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>> | |
using | aie::sliding_mul_y_ops = sliding_mul_ops< Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag > |
Functions | |
template<unsigned Lanes, AccumOrOp Acc, Vector VecCoeff = void, Vector VecData = void, Vector... NextVecData> | |
auto | aie::accumulate (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> operand_base_type_t< Acc > |
template<unsigned Lanes, AccumElemBaseType AccumTag = accauto, Vector VecCoeff = void, Vector VecData = void, Vector... NextVecData> | |
auto | aie::accumulate (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< typename VecCoeff::value_type, typename VecData::value_type >, AccumTag >, Lanes > |
struct aie::sliding_mul_ops |
This type provides a parametrized multiplication that implements the following compute pattern:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
16b x 16b | 8 16 | 1,2,3,4 | 1 | 1 | Unsigned smaller than 16 | Signed |
16b x 32b | 8 16 | 1,2,3,4 | 1,2,3,4 | 1,2 1 | Unsigned smaller than 16 | Signed |
32b x 16b | 8 16 | 1,2,3,4 | 1,2,3,4 | 1,2 1 | Unsigned smaller than 16 | Signed |
16b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x 16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x 32b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
32b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c32b x 16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
16b x c32b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
32b x 16b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 16 | Signed |
16b x 32b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 16 | Signed |
32b x 32b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x 16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x 32b | 2 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
32b x c32b | 2 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x c32b | 2 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
float x float | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 16 | Signed |
float x cfloat | 4 | 1,2,3 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 16 | Signed |
cfloat x float | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 16 | Signed |
cfloat x cfloat | 4 | 1,2,3 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 16 | Signed |
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
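The parameter descriptions above can be read as an indexing rule: within one lane, each successive point advances the coefficient index by CoeffStep and the data index by DataStepX, and from one lane to the next the data index advances by DataStepY. The following is a minimal scalar sketch of that reading in plain C++. It is not the AIE API: `sliding_mul_ref` is a hypothetical helper, it takes plain `std::array` inputs instead of AIE vectors, it does not model vector-size limits or index wrap-around, and it assumes every index stays inside the input arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar reference model (hypothetical helper, not part of the AIE API).
// Assumed pattern, inferred from the parameter descriptions:
//   out[l] = sum_{p=0}^{Points-1} coeff[coeff_start + p*CoeffStep]
//                               * data[data_start + l*DataStepY + p*DataStepX]
// Assumption: all indices stay inside the arrays; no wrap-around is modelled.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY,
          typename T, std::size_t NC, std::size_t ND>
std::array<T, Lanes> sliding_mul_ref(const std::array<T, NC>& coeff, int coeff_start,
                                     const std::array<T, ND>& data, int data_start)
{
    std::array<T, Lanes> out{};  // one accumulated value per lane
    for (unsigned l = 0; l < Lanes; ++l) {
        for (unsigned p = 0; p < Points; ++p) {
            // Coefficient index moves by CoeffStep within a lane.
            const int c = coeff_start + static_cast<int>(p) * CoeffStep;
            // Data index moves by DataStepX within a lane and by DataStepY across lanes.
            const int d = data_start + static_cast<int>(l) * DataStepY
                                     + static_cast<int>(p) * DataStepX;
            assert(c >= 0 && c < static_cast<int>(NC));
            assert(d >= 0 && d < static_cast<int>(ND));
            out[l] += coeff[static_cast<std::size_t>(c)] * data[static_cast<std::size_t>(d)];
        }
    }
    return out;
}
```

For example, with Lanes = 4, Points = 3, all steps equal to 1, coeff = {1, 2, 3, 4} and data = {1, ..., 8}, lane 0 evaluates 1*1 + 2*2 + 3*3 = 14 and lane 1 evaluates 1*2 + 2*3 + 3*4 = 20 under this model.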
Public Types | |
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul< Lanes, Points, CoeffStep, DataStepX, DataStepY, accum_bits, CoeffType, DataType > |
enum class | MulType { Mul, Acc_Mul, NegMul } |
Static Public Member Functions | |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<MulType Mul, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | negmul (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
Static Public Attributes | |
static constexpr unsigned | columns_per_mul = impl_type::columns_per_mul |
static constexpr unsigned | lanes = impl_type::lanes |
static constexpr unsigned | lanes_per_mul = impl_type::lanes_per_mul |
static constexpr unsigned | num_mul = impl_type::num_mul |
static constexpr unsigned | points = impl_type::points |
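static constexpr accum_type mac (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |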
Performs a multiply-add with the pattern defined by the class parameters using the input coefficient and data arguments.
acc | Accumulator that is added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
data_start | Index of the first data element to be used in the multiplication. |
static constexpr accum_type mul (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
Performs the multiplication pattern defined by the class parameters using the input coefficient and data arguments.
static constexpr accum_type negmul (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
Performs a negation of the multiplication pattern defined by the class parameters using the input coefficient and data arguments.
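The three member functions described above differ only in how the product is combined: mul returns it, mac adds it to an existing accumulator, and negmul negates it. A minimal scalar sketch of that relationship follows. The helpers `mul_ref`, `mac_ref` and `negmul_ref` are hypothetical stand-ins, not AIE API calls; they fix all steps to 1, take lanes and points as run-time arguments, and use `std::vector` instead of AIE vectors and accumulators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for mul() with all steps fixed to 1:
//   out[l] = sum_p coeff[coeff_start + p] * data[data_start + l + p]
std::vector<long> mul_ref(const std::vector<int>& coeff, unsigned coeff_start,
                          const std::vector<int>& data, unsigned data_start,
                          unsigned lanes, unsigned points)
{
    assert(coeff_start + points <= coeff.size());
    assert(data_start + lanes + points - 1 <= data.size());
    std::vector<long> out(lanes, 0);
    for (unsigned l = 0; l < lanes; ++l)
        for (unsigned p = 0; p < points; ++p)
            out[l] += static_cast<long>(coeff[coeff_start + p]) * data[data_start + l + p];
    return out;
}

// Stand-in for mac(): the accumulator is added lane by lane to the product.
std::vector<long> mac_ref(const std::vector<long>& acc,
                          const std::vector<int>& coeff, unsigned coeff_start,
                          const std::vector<int>& data, unsigned data_start,
                          unsigned lanes, unsigned points)
{
    assert(acc.size() == lanes);
    std::vector<long> out = mul_ref(coeff, coeff_start, data, data_start, lanes, points);
    for (std::size_t l = 0; l < out.size(); ++l)
        out[l] += acc[l];
    return out;
}

// Stand-in for negmul(): every lane of the product is negated.
std::vector<long> negmul_ref(const std::vector<int>& coeff, unsigned coeff_start,
                             const std::vector<int>& data, unsigned data_start,
                             unsigned lanes, unsigned points)
{
    std::vector<long> out = mul_ref(coeff, coeff_start, data, data_start, lanes, points);
    for (long& v : out)
        v = -v;
    return out;
}
```

Under this model, a filter with more taps than Points can be evaluated as one `mul_ref` call followed by `mac_ref` calls whose coeff_start and data_start are both advanced by Points, for example `mac_ref(mul_ref(c, 0, d, 0, 4, 2), c, 2, d, 2, 4, 2)` for four taps.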
struct aie::sliding_mul_sym_ops |
This type provides a parametrized multiplication that implements the following compute pattern:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
16b x 16b | 8 16 | 1,2,3,4 | 1 | 1 | Unsigned smaller than 16 | Signed |
16b x 32b | 8 16 | 1,2,3,4 | 1,2,3,4 | 1,2 1 | Unsigned smaller than 16 | Signed |
32b x 16b | 8 16 | 1,2,3,4 | 1,2,3,4 | 1,2 1 | Unsigned smaller than 16 | Signed |
16b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x 16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c16b x 32b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
32b x c16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c32b x 16b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
16b x c32b | 4 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
32b x 16b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 16 | Signed |
16b x 32b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 16 | Signed |
32b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 1,2 | Unsigned smaller than 16 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x 16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x 32b | 2 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
32b x c32b | 2 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x c32b | 2 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
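The symmetric variants exploit the fact that a symmetric filter uses each coefficient twice, so two data samples can be combined before a single multiplication. The sketch below is a scalar model of that idea in plain C++. It rests on several assumptions that the text here does not confirm: the left sample walks forward from ldata_start, the right sample walks backward from rdata_start, the single-start form places the right start at the last point of lane 0, and only an even number of Points is modelled. `sliding_sym_ref` and `sliding_sym_ref_single` are hypothetical names, not AIE API functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar reference model (hypothetical, not the AIE API) for an even number of Points.
// Assumed pattern:
//   out[l] = sum_{p=0}^{Points/2-1} coeff[coeff_start + p*CoeffStep]
//            * (data[ldata_start + l*DataStepY + p*DataStepX]
//               +/- data[rdata_start + l*DataStepY - p*DataStepX])
// '+' models the symmetric case, '-' the antisymmetric case.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY,
          typename T, std::size_t NC, std::size_t ND>
std::array<T, Lanes> sliding_sym_ref(const std::array<T, NC>& coeff, int coeff_start,
                                     const std::array<T, ND>& data,
                                     int ldata_start, int rdata_start, bool antisym)
{
    static_assert(Points % 2 == 0, "this sketch only models an even number of points");
    std::array<T, Lanes> out{};
    for (unsigned l = 0; l < Lanes; ++l) {
        for (unsigned p = 0; p < Points / 2; ++p) {
            const int c  = coeff_start + static_cast<int>(p) * CoeffStep;
            const int li = ldata_start + static_cast<int>(l) * DataStepY
                                       + static_cast<int>(p) * DataStepX;
            const int ri = rdata_start + static_cast<int>(l) * DataStepY
                                       - static_cast<int>(p) * DataStepX;
            assert(c >= 0 && c < static_cast<int>(NC));
            assert(li >= 0 && li < static_cast<int>(ND));
            assert(ri >= 0 && ri < static_cast<int>(ND));
            const T left  = data[static_cast<std::size_t>(li)];
            const T right = data[static_cast<std::size_t>(ri)];
            // One multiplication serves two taps of the folded filter.
            out[l] += coeff[static_cast<std::size_t>(c)] * (antisym ? left - right : left + right);
        }
    }
    return out;
}

// Single-start form: the right start is assumed to be the last point of lane 0.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY,
          typename T, std::size_t NC, std::size_t ND>
std::array<T, Lanes> sliding_sym_ref_single(const std::array<T, NC>& coeff, int coeff_start,
                                            const std::array<T, ND>& data,
                                            int data_start, bool antisym)
{
    const int rdata_start = data_start + (static_cast<int>(Points) - 1) * DataStepX;
    return sliding_sym_ref<Lanes, Points, CoeffStep, DataStepX, DataStepY>(
        coeff, coeff_start, data, data_start, rdata_start, antisym);
}
```

With coeff = {1, 2}, data = {1, 2, 3, 4, 5}, Points = 4 and all steps equal to 1, lane 0 of the symmetric case evaluates 1*(1+4) + 2*(2+3) = 15, which matches a direct four-tap filter with coefficients {1, 2, 2, 1}.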
Public Types | |
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul_sym< Lanes, Points, CoeffStep, DataStepX, DataStepY, accum_bits, CoeffType, DataType > |
enum class | SymMulType { Sym, Antisym, Acc_Sym, Acc_Antisym } |
Static Public Member Functions | |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
Static Public Attributes | |
static constexpr unsigned | columns_per_mul = impl_type::columns_per_mul |
static constexpr unsigned | lanes = impl_type::lanes |
static constexpr unsigned | lanes_per_mul = impl_type::lanes_per_mul |
static constexpr unsigned | num_mul = impl_type::num_mul |
static constexpr unsigned | points = impl_type::points |
static constexpr accum_type mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
data_start | Index of the first data element to be used in the multiplication. |
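static constexpr accum_type mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |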
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
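static constexpr accum_type mac_antisym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |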
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
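static constexpr accum_type mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |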
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
data_start | Index of the first data element to be used in the multiplication. |
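static constexpr accum_type mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |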
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
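static constexpr accum_type mac_sym (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |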
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
static constexpr accum_type mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments.
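static constexpr accum_type mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |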
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
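static constexpr accum_type mul_antisym (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |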
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
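static constexpr accum_type mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |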
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments.
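static constexpr accum_type mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |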
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
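static constexpr accum_type mul_sym (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |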
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
struct aie::sliding_mul_sym_uct_ops |
This type provides a parametrized multiplication across the lower half of its lanes (equivalent to sliding_mul_sym_ops), and upshifts one selected set of data in the upper half of the lanes.
It implements the following compute pattern:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
c16b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 16 | Signed |
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane in the first half of the output Lanes. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStep | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
Public Types | |
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul_sym_uct< Lanes, Points, CoeffStep, DataStep, accum_bits, CoeffType, DataType > |
enum class | SymMulType { Sym, Antisym, Acc_Sym, Acc_Antisym } |
Static Public Member Functions | |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym_uct (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym_uct (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym_uct (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym_uct (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym_uct (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym_uct (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym_uct (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym_uct (const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
Static Public Attributes | |
static constexpr unsigned | columns_per_mul = impl_type::columns_per_mul |
static constexpr unsigned | lanes = impl_type::lanes |
static constexpr unsigned | lanes_per_mul = impl_type::lanes_per_mul |
static constexpr unsigned | num_mul = impl_type::num_mul |
static constexpr unsigned | points = impl_type::points |
using aie::sliding_mul_sym_x_ops = sliding_mul_sym_ops<Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_sym_ops, but DataStepY is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_sym_xy_ops = sliding_mul_sym_ops<Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_sym_ops, but DataStepX is equal to DataStepY.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepXY | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_sym_y_ops = sliding_mul_sym_ops<Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_sym_ops, but DataStepX is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_x_ops = sliding_mul_ops<Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_ops, but DataStepY is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_xy_ops = sliding_mul_ops<Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_ops, but DataStepX is equal to DataStepY.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepXY | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_y_ops = sliding_mul_ops<Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag> |
Similar to sliding_mul_ops, but DataStepX is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
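The x, y and xy aliases described above add no behaviour of their own: each one forwards to the general class template with one data step fixed to 1 or with both data steps tied together. The sketch below shows the same forwarding on a stand-in template. `ops_model` and the `*_ops_model` aliases are hypothetical names used only for illustration, and the CoeffType, DataType and AccumTag parameters are left out to keep the example short.

```cpp
#include <type_traits>

// Hypothetical stand-in for the general ops class; it only records its data steps.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY>
struct ops_model {
    static constexpr int data_step_x = DataStepX;
    static constexpr int data_step_y = DataStepY;
};

// Same forwarding as the *_x_ops aliases: DataStepY is always 1.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX>
using x_ops_model = ops_model<Lanes, Points, CoeffStep, DataStepX, 1>;

// Same forwarding as the *_y_ops aliases: DataStepX is always 1.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepY>
using y_ops_model = ops_model<Lanes, Points, CoeffStep, 1, DataStepY>;

// Same forwarding as the *_xy_ops aliases: one value is used for both data steps.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepXY>
using xy_ops_model = ops_model<Lanes, Points, CoeffStep, DataStepXY, DataStepXY>;

// An alias and the explicit instantiation name the same type.
static_assert(std::is_same<x_ops_model<8, 4, 1, 2>, ops_model<8, 4, 1, 2, 1>>::value,
              "x alias fixes DataStepY to 1");
static_assert(std::is_same<xy_ops_model<8, 4, 1, 2>, ops_model<8, 4, 1, 2, 2>>::value,
              "xy alias uses one step for both directions");
```

Because an alias template names the same type as the explicit instantiation, code written against an alias and code written against the general template can be mixed freely.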
auto aie::accumulate (const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> operand_base_type_t<Acc> |
This function provides a parametrized multiplication that implements the following compute pattern:
Lanes | Number of output elements. |
auto aie::accumulate (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> accum<std::conditional_t<std::is_same_v<AccumTag, accauto>, detail::default_accum_tag_t<typename VecCoeff::value_type, typename VecData::value_type>, AccumTag>, Lanes> |
This function provides a parametrized multiplication that implements the following compute pattern:
Lanes | Number of output elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
coeff | Vector with the coefficients. |
coeff_start | First element from the coeff vector to be used. |
data | First vector of data. |
next_data | Rest of the data vectors. |
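The parameter list above suggests that each data vector is scaled by one coefficient, starting at coeff_start, and that the scaled vectors are summed lane by lane. The sketch below models that reading in plain C++. The pattern itself is an assumption, since it is not spelled out in the text here, and `accumulate_ref` is a hypothetical helper that takes the data vectors as a `std::vector` of `std::vector` rather than as a variadic pack of AIE vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model, assuming
//   out[i] = coeff[coeff_start]     * data_0[i]
//          + coeff[coeff_start + 1] * data_1[i] + ...
// where data_0 is `data` and data_1, data_2, ... are the `next_data` vectors.
std::vector<long> accumulate_ref(const std::vector<int>& coeff, unsigned coeff_start,
                                 const std::vector<std::vector<int>>& data_vectors,
                                 unsigned lanes)
{
    assert(coeff_start + data_vectors.size() <= coeff.size());
    std::vector<long> out(lanes, 0);
    for (std::size_t k = 0; k < data_vectors.size(); ++k) {
        assert(data_vectors[k].size() >= lanes);
        // One coefficient scales a whole data vector.
        for (unsigned i = 0; i < lanes; ++i)
            out[i] += static_cast<long>(coeff[coeff_start + k]) * data_vectors[k][i];
    }
    return out;
}
```

With coeff = {2, 3, 5}, coeff_start = 1 and the data vectors {1, 2, 3, 4} and {10, 20, 30, 40}, lane 0 evaluates 3*1 + 5*10 = 53 under this model. The overload that takes acc would, by the same reading, add these sums to the incoming accumulator.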