AI Engine Intrinsics User Guide
(AIE) v(2024.1)
delay1: Intrinsic that returns its input after 6 clock cycles. Used for scheduling optimization.
delay2: Intrinsic that returns its input after 8 clock cycles. Used for scheduling optimization.
This group of intrinsics lets you issue a move between two scalar registers that is delayed by a fixed, hardcoded number of clock cycles.
Usage example:
Inside a loop, a value is computed and then used to load another value from memory. The computed value and the newly loaded value must then be processed together. For that to work, the computed value has to stay live for at least the latency of the load (7 cycles). This can create long resource dependencies and lead to a poor schedule.
The delay intrinsics can hold the first value inside the pipeline registers instead. This makes the compiler's job easier: while the value is in the delay pipeline, the compiler can reuse the original register for a different operation. The result can be a much better scheduled loop.
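The pattern described above might look roughly like the following sketch. It is illustrative only: the function accumulate, its parameters, and the table/keys data layout are hypothetical and not part of the intrinsics API. The delay1 definition here is a host-side stand-in (an assumption made so the sketch compiles without the AI Engine toolchain); it returns its input unchanged and should be removed when building for the device, where the real intrinsic provides the same functional result with the documented delay.

```cpp
#include <cstddef>

// Host-side stand-in for the delay1 intrinsic (assumption for illustration).
// Functionally it returns its input; the real intrinsic additionally holds
// the value in the delay pipeline for a fixed number of cycles.
// Remove this definition when compiling with the AI Engine toolchain.
static inline int delay1(int x) { return x; }

// Hypothetical loop: compute an index, load table[index], then combine the
// index with the loaded value.
int accumulate(const int* table, const int* keys, std::size_t n, int mask) {
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        int idx = keys[i] & mask;   // first value, computed inside the loop
        int held = delay1(idx);     // copy routed through the delay pipeline
        int loaded = table[idx];    // load whose address depends on idx
        acc += held * loaded;       // both values are consumed together
    }
    return acc;
}
```

Because the delayed copy is what the final operation consumes, the register that holds idx is only needed until the load is issued. Whether delay1 or delay2 fits better depends on the actual schedule, so it is worth checking the compiler's scheduling report after the change.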
Functions
int delay1 (int)
int delay2 (int)