AI Engine Intrinsics User Guide (AIE) v2024.2
Load/Store Operations

Overview

The compiler supports pointer dereferencing and pointer arithmetic. No special intrinsics are needed to load or store vectors. For example:

v4cint16 *input, *output;
// ... initialize input and output data pointers
v4cint16 mydata = *input++;
...
*output = mydata;

The data pointers must be aligned to a 128-bit boundary for both 128-bit and 256-bit vector loads/stores. Load/store behaviour is undefined when stack-allocated vector variables are unaligned.
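The alignment rule above can be checked in ordinary C++ before a pointer is dereferenced. The following is a minimal portable sketch, not AIE code: `Vec128`, `is_aligned_128`, and `checked_load` are hypothetical names introduced for illustration, and `Vec128` is only a stand-in for a 128-bit vector type such as v4cint16.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector type such as v4cint16
// (4 complex lanes x 2 x 16 bits = 128 bits). This is NOT the real AIE type.
struct alignas(16) Vec128 {
    std::int16_t lane[8];
};

// Returns true when ptr lies on a 128-bit (16-byte) boundary.
inline bool is_aligned_128(const void* ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr) % 16 == 0;
}

// Hypothetical guarded load: asserts the alignment requirement in debug
// builds, then loads through a plain pointer dereference.
inline Vec128 checked_load(const Vec128* ptr) {
    assert(is_aligned_128(ptr) && "vector pointer must be 128-bit aligned");
    return *ptr;
}
```

Declaring the stand-in type with `alignas(16)` means that variables of that type, including stack-allocated ones, are placed on a 16-byte boundary by a standard C++ compiler; the explicit check is mainly useful for pointers derived from raw byte buffers.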

AIE cores can perform several vector load/store operations per instruction. However, to execute in parallel, the operations must target different memory banks. aiecompiler tries to distribute buffers from communication primitives evenly across banks, and users can manually place buffers on specific banks by specifying the address range in the linker script file. In general, the compiler tries to schedule multiple memory accesses in the same instruction when possible, but there are a few exceptions. Memory accesses made through the same pointer are scheduled in different instructions. In addition, the compiler provides type annotations that associate memory accesses with virtual resources. Accesses using types associated with the same virtual resource are not scheduled in the same instruction:

void func(v4cint16* in1, v4cint16* in2)
{
    v4cint16 mydata  = *(v4cint16 chess_storage(DM_bankA) *) in1++;
    v4cint16 mydata1 = *(v4cint16 chess_storage(DM_bankA) *) in2++;
...
}
// The type annotations tell the compiler that both pointers refer to the
// same virtual resource, so the two loads cannot be scheduled in parallel.
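The two scheduling exceptions described above can be written down as a small predicate. This is a toy model in portable C++ for reasoning about the rules only; it is not how the compiler schedules code. `Bank`, `Access`, and `may_share_instruction` are hypothetical names, and treating an unannotated access (`Bank::None`) as unconstrained is an assumption made for this sketch.

```cpp
#include <cassert>

// Hypothetical labels for the A, B, C, D, and stack virtual resources
// named in this section; None means the access carries no annotation.
enum class Bank { None, A, B, C, D, Stack };

// Hypothetical description of a single memory access.
struct Access {
    const void* pointer;  // pointer the access is made through
    Bank resource;        // virtual resource from the type annotation
};

// Returns false when the rules in this section forbid placing both
// accesses in the same instruction, and true otherwise.
inline bool may_share_instruction(const Access& x, const Access& y) {
    // Rule 1: accesses through the same pointer go in different instructions.
    if (x.pointer == y.pointer) {
        return false;
    }
    // Rule 2: accesses annotated with the same virtual resource are not
    // scheduled together. Assumption: unannotated accesses never conflict
    // under this rule.
    if (x.resource != Bank::None && x.resource == y.resource) {
        return false;
    }
    return true;
}
```

In this model, the two loads in func above conflict because both carry the bank A annotation. A true result only means that neither rule forbids pairing the accesses; the accesses must still target different physical memory banks to run in parallel.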

There are five logical/virtual resources: A, B, C, D, and stack.

#define __aie_dm_resource_a chess_storage(DM_bankA)
#define __aie_dm_resource_b chess_storage(DM_bankB)
#define __aie_dm_resource_c chess_storage(DM_bankC)
#define __aie_dm_resource_d chess_storage(DM_bankD)
#define __aie_dm_resource_stack chess_storage(DM_stack)
// Either name (the macro or the chess_storage annotation it expands to)
// can be used in source code to add type annotations on pointers.

For user convenience, an enum is available to select the bank name as a template parameter.

enum class aie_dm_resource {
none,
a,
b,
c,
d,
stack
};
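To illustrate what selecting a bank through a template parameter can look like, here is a minimal portable C++17 sketch. It repeats the enum so that it compiles on its own. `storage_annotation` and `load_from` are hypothetical helpers written for this example and are not part of the intrinsics API; in standard C++ the resource parameter acts only as a compile-time tag and has no effect on scheduling.

```cpp
#include <cassert>
#include <string_view>

// Same enumerators as the enum shown in this section.
enum class aie_dm_resource { none, a, b, c, d, stack };

// Hypothetical helper: maps the template parameter to the name used in
// the corresponding chess_storage(...) annotation.
template <aie_dm_resource Resource>
constexpr std::string_view storage_annotation() {
    if constexpr (Resource == aie_dm_resource::a) return "DM_bankA";
    else if constexpr (Resource == aie_dm_resource::b) return "DM_bankB";
    else if constexpr (Resource == aie_dm_resource::c) return "DM_bankC";
    else if constexpr (Resource == aie_dm_resource::d) return "DM_bankD";
    else if constexpr (Resource == aie_dm_resource::stack) return "DM_stack";
    else return "";  // none: no annotation
}

// Hypothetical load wrapper: the bank is chosen at compile time and
// defaults to none. Here it is a plain dereference; the Resource tag is
// carried only to show the call-site syntax.
template <aie_dm_resource Resource = aie_dm_resource::none, typename T>
T load_from(const T* ptr) {
    return *ptr;
}
```

A call such as `load_from<aie_dm_resource::b>(ptr)` names the bank at the call site, while `load_from(ptr)` falls back to `none`. Because the selection is a template parameter, it is resolved at compile time and adds no run-time cost.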
Note
256-bit loads from the same memory bank stall the processor for one cycle because the memory datapath is 128 bits wide.
See also
'upd' Intrinsics (Updates) to substitute newly loaded data into a specified lane of larger sized buffers
'srs' Intrinsics (Shift-Round-Saturate) to adjust and move accumulator data into regular variables before storing them in memory

Modules

 Streams