AI Engine API User Guide (AIE) 2022.1
aie::lut< ParallelAccesses, OffsetType, SlopeType > Struct Template Reference

Detailed Description

template<unsigned ParallelAccesses, typename OffsetType, typename SlopeType = OffsetType>
struct aie::lut< ParallelAccesses, OffsetType, SlopeType >

Abstraction to represent a LUT that is stored in memory, instantiated with pointer(s) to the already populated memory and the number of elements. To achieve a given level of access parallelism, the LUT values need to have a specific layout in memory:

  • For 2 loads in parallel, the LUT must hold 2 copies of the desired values, with the repetition occurring every 128b. For example, with 32b values, memory holds the first 4 values (128b), then the same 4 values again, then the next 4 values followed by their copy, and so on.
  • For 4 loads in parallel, the same layout is required, but as two distinct copies, each placed in a different memory bank. In total, the LUT values are present 4 times in memory.
  • For a single load without parallelism, the values are just stored linearly in memory.
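
The 128b interleaving described above can be illustrated with a small host-side sketch. The helper name `interleave_128b` is hypothetical and is not part of the AI Engine API; it only shows one way to produce the duplicated layout for a single buffer, assuming the element size divides 16 bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper (not part of the AIE API).
// Returns a copy of `values` in which every 128-bit chunk is immediately
// followed by a duplicate of itself, as required for 2 parallel loads.
// A trailing partial chunk is duplicated as-is; whether the hardware needs
// padding in that case is not covered here.
template <typename T>
std::vector<T> interleave_128b(const std::vector<T>& values) {
    static_assert(16 % sizeof(T) == 0, "element size must divide 128 bits");
    constexpr std::size_t chunk = 16 / sizeof(T);  // elements per 128 bits

    std::vector<T> out;
    out.reserve(values.size() * 2);
    for (std::size_t i = 0; i < values.size(); i += chunk) {
        const std::size_t n = std::min(chunk, values.size() - i);
        const auto first = values.begin() + static_cast<std::ptrdiff_t>(i);
        const auto last = first + static_cast<std::ptrdiff_t>(n);
        // Emit the chunk twice, back to back.
        for (int copy = 0; copy < 2; ++copy) {
            out.insert(out.end(), first, last);
        }
    }
    return out;
}
```

With 32b values 0 to 7 as input, the result is 0 1 2 3, 0 1 2 3, 4 5 6 7, 4 5 6 7. For 4 parallel loads, two such buffers would be needed, each in a different bank.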

Currently the only supported implementation is 4 parallel accesses on this architecture.

Template Parameters
ParallelAccesses  Defines how many parallel accesses are performed in a single LUT access. The possible values depend on the hardware available for the given architecture.
OffsetType        Type of the values stored within the lookup table.
SlopeType         Optional template parameter, only needed in certain cases of linear approximation where the offset/slope value pair uses two different types.

#include <aie.hpp>

Inheritance diagram for aie::lut< ParallelAccesses, OffsetType, SlopeType >:
aie::detail::lut< ParallelAccesses, OffsetType, OffsetType >

Public Types

using lut_impl = detail::lut< ParallelAccesses, OffsetType, SlopeType >
 
using offset_type = OffsetType
 
using slope_type = SlopeType
 

Public Member Functions

 lut (unsigned LUT_elems, const void *LUT_a)
 
 lut (unsigned LUT_elems, const void *LUT_ab)
 
 lut (unsigned LUT_elems, const void *LUT_ab, const void *LUT_cd)
 

Constructor & Destructor Documentation

◆ lut() [1/3]

template<unsigned ParallelAccesses, typename OffsetType, typename SlopeType = OffsetType>
inline aie::lut< ParallelAccesses, OffsetType, SlopeType >::lut(unsigned LUT_elems, const void *LUT_ab, const void *LUT_cd)

Constructor for 4 parallel accesses. Each pointer points to an equivalent, already populated LUT in which the values are repeated twice, interleaved at a 128b granularity. In total, the same values need to be present 4 times in memory to allow for the 4 parallel accesses.

Parameters
LUT_elems  Number of elements in the LUT (not accounting for repetition).
LUT_ab     First two copies of the data, with the values repeated and interleaved at a 128b granularity.
LUT_cd     Next two copies of the data, with the values repeated and interleaved at a 128b granularity.
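
As a rough sketch of how the arguments of this constructor relate to each other, the example below prepares two identically populated, 128b-interleaved buffers on the host and shows, inside a guarded section, how they might be passed to the constructor. The names `kElems`, `make_interleaved`, `construct_lut` and the guard macro `EXAMPLE_HAS_AIE_API` are hypothetical. The use of `std::int32_t` as `OffsetType` is an assumption, and placing the two buffers in different memory banks is left to the toolchain and is not shown.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kElems = 8;                          // unique LUT entries (hypothetical size)
constexpr std::size_t kChunk = 16 / sizeof(std::int32_t);  // 4 values per 128b
static_assert(kElems % kChunk == 0, "sketch assumes whole 128b chunks");

// Hypothetical helper: writes each 128b chunk of `src` twice in a row.
inline std::array<std::int32_t, 2 * kElems>
make_interleaved(const std::array<std::int32_t, kElems>& src) {
    std::array<std::int32_t, 2 * kElems> dst{};
    for (std::size_t i = 0; i < kElems; ++i) {
        const std::size_t chunk = i / kChunk;
        const std::size_t lane = i % kChunk;
        dst[(2 * chunk) * kChunk + lane] = src[i];      // first copy of the chunk
        dst[(2 * chunk + 1) * kChunk + lane] = src[i];  // repeated copy
    }
    return dst;
}

#ifdef EXAMPLE_HAS_AIE_API  // hypothetical guard: only defined when building for the AI Engine
#include <aie.hpp>          // include as listed on this page; the path may differ per installation

// lut_ab and lut_cd must each hold 2 * kElems interleaved values and are
// assumed to reside in different memory banks.
inline void construct_lut(const std::int32_t* lut_ab, const std::int32_t* lut_cd) {
    // LUT_elems counts the unique entries, not the repeated ones.
    aie::lut<4, std::int32_t, std::int32_t> my_lut(static_cast<unsigned>(kElems), lut_ab, lut_cd);
    (void)my_lut;
}
#endif
```

Both buffers hold the same contents, so 8 unique values occupy 32 entries in total across the two buffers, while `LUT_elems` remains 8.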

◆ lut() [2/3]

template<unsigned ParallelAccesses, typename OffsetType, typename SlopeType = OffsetType>
inline aie::lut< ParallelAccesses, OffsetType, SlopeType >::lut(unsigned LUT_elems, const void *LUT_ab)

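Constructor for 2 parallel accesses. The pointer points to an already populated LUT in which the values are repeated twice, interleaved at a 128b granularity.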

Parameters
LUT_elems  Number of elements in the LUT (not accounting for repetition).
LUT_ab     Two copies of the data, with the values interleaved at a 128b granularity.

◆ lut() [3/3]

template<unsigned ParallelAccesses, typename OffsetType, typename SlopeType = OffsetType>
inline aie::lut< ParallelAccesses, OffsetType, SlopeType >::lut(unsigned LUT_elems, const void *LUT_a)

Constructor for a single access (no parallelism). The values are stored linearly in memory.

Parameters
LUT_elems  Number of elements in the LUT.
LUT_a      Pointer to the LUT values.
