AI Engine Intrinsics User Guide  (AIE) r2p22

Overview

Advanced Floating-point Vector Lane Selection

Select: Selects, lane by lane, between a first set of lanes and a second set according to the bits of 'select'. If the bit corresponding to a lane is 0, the value from the first set of lanes is returned; if it is 1, the value from the second set of lanes is returned.

Shuffle: Selects lanes from a single input according to the start/offset computation.

Note
fpsel behaves as a "Shuffle" intrinsic.

For more information on lane selection, refer to the lane selection documentation in this guide.
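The difference between the two operation families can be sketched with a scalar model in plain C++. This is only an illustration, not the intrinsic implementation: the names select_lanes and shuffle_lanes are hypothetical, and the shuffle takes an explicit index table rather than the start/offset encoding used by the real intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of "select": bit i of `select` picks lane i from `second`
// when it is 1, and from `first` when it is 0.
template <std::size_t N>
std::array<float, N> select_lanes(std::uint32_t select,
                                  const std::array<float, N>& first,
                                  const std::array<float, N>& second) {
    static_assert(N <= 32, "one select bit per lane");
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        const bool take_second = ((select >> i) & 1u) != 0;
        out[i] = take_second ? second[i] : first[i];
    }
    return out;
}

// Scalar model of "shuffle": every output lane reads one lane of a single
// input. The index table is a simplification of the start/offset scheme.
template <std::size_t N, std::size_t M>
std::array<float, N> shuffle_lanes(const std::array<float, M>& in,
                                   const std::array<std::size_t, N>& index) {
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(index[i] < M);  // each index must address a valid input lane
        out[i] = in[index[i]];
    }
    return out;
}
```

In this model a select needs two candidate values per lane, while a shuffle needs only one source lane per output lane, which is why the shuffle intrinsics below take no select, ystart or yoffsets arguments.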

Functions

v16float fpselect16 (unsigned int select, v32float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a floating point selection between lanes of xbuff.

v16float fpselect16 (unsigned int select, v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a floating point selection between lanes of xbuff.

v16float fpselect16 (unsigned int select, v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16float ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a floating point selection between lanes of xbuff and ybuff.

v8cfloat fpselect8 (unsigned int select, v16cfloat xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Performs a floating point selection between lanes of xbuff.

v8cfloat fpselect8 (unsigned int select, v8cfloat xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Performs a floating point selection between lanes of xbuff.

v8cfloat fpselect8 (unsigned int select, v8cfloat xbuff, int xstart, unsigned int xoffsets, v8cfloat ybuff, int ystart, unsigned int yoffsets)
 Performs a floating point selection between lanes of xbuff and ybuff.

v16float fpshuffle16 (v32float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi)
 Performs a floating point shuffle between lanes of xbuff.

v16float fpshuffle16 (v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi)
 Performs a floating point shuffle between lanes of xbuff.

v8cfloat fpshuffle8 (v16cfloat xbuff, int xstart, unsigned int xoffsets)
 Performs a floating point shuffle between lanes of xbuff.

v8cfloat fpshuffle8 (v8cfloat xbuff, int xstart, unsigned int xoffsets)
 Performs a floating point shuffle between lanes of xbuff.


Function Documentation

v16float fpselect16 ( unsigned int  select,
v32float  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}
xoffsets, xoffsets_hi, yoffsets and yoffsets_hi hold 8 offset values each, with 4 bits per offset.
For example, for a v16int32 output type, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v16float      Each lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v32float      Input buffer of 32 single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane.
xoffsets_hi   unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to output lane 8.
ystart        int           Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets      unsigned int  4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane.
yoffsets_hi   unsigned int  4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to output lane 8.
Note
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
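The pseudocode above can be turned into a scalar reference model, which is useful for checking offset words on a host before running on hardware. The sketch below is not the intrinsic itself: lane_offset, f_model and fpselect16_model are hypothetical names, and the body of f_model (start plus offset, wrapped to the buffer size) is an assumption made for illustration, since f() is defined in the lane selection documentation rather than on this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Offset for output lane `lane` (0..15). Lanes 0..7 read `offsets`,
// lanes 8..15 read `offsets_hi`; each offset is 4 bits, lowest nibble first.
inline unsigned lane_offset(std::uint32_t offsets, std::uint32_t offsets_hi,
                            unsigned lane) {
    assert(lane < 16);
    const std::uint32_t word = (lane < 8) ? offsets : offsets_hi;
    return (word >> (4u * (lane % 8u))) & 0xFu;
}

// ASSUMPTION (not from this page): f(start, offset) = (start + offset) mod N.
template <std::size_t N>
std::size_t f_model(int start, unsigned offset) {
    assert(start >= 0);
    return (static_cast<std::size_t>(start) + offset) % N;
}

// Hypothetical scalar model of the single-buffer fpselect16 pseudocode.
template <std::size_t N>
std::array<float, 16> fpselect16_model(std::uint32_t select,
                                       const std::array<float, N>& x,
                                       int xstart, std::uint32_t xoffsets,
                                       std::uint32_t xoffsets_hi,
                                       int ystart, std::uint32_t yoffsets,
                                       std::uint32_t yoffsets_hi) {
    std::array<float, 16> o{};
    for (unsigned i = 0; i < 16; ++i) {
        const std::size_t idx =
            f_model<N>(xstart, lane_offset(xoffsets, xoffsets_hi, i));
        const std::size_t idy =
            f_model<N>(ystart, lane_offset(yoffsets, yoffsets_hi, i));
        const bool take_second = ((select >> i) & 1u) != 0;
        o[i] = take_second ? x[idy] : x[idx];  // both inputs come from x
    }
    return o;
}
```

Under these assumptions, calling the model with xoffsets = 0x76543210 and xoffsets_hi = 0xFEDCBA98 gives lane i an offset of i, so select = 0xFF00 with xstart = 0 and ystart = 16 would take output lanes 0 to 7 from the lower half of a 32-element buffer and lanes 8 to 15 from its upper half.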
v16float fpselect16 ( unsigned int  select,
v16float  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}
xoffsets, xoffsets_hi, yoffsets and yoffsets_hi hold 8 offset values each, with 4 bits per offset.
For example, for a v16int32 output type, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v16float      Each lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v16float      Input buffer of 16 single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane.
xoffsets_hi   unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to output lane 8.
ystart        int           Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets      unsigned int  4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane.
yoffsets_hi   unsigned int  4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to output lane 8.
Note
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
v16float fpselect16 ( unsigned int  select,
v16float  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16float  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a floating point selection between lanes of xbuff and ybuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], y[idy], select[i]);
}
xoffsets, xoffsets_hi, yoffsets and yoffsets_hi hold 8 offset values each, with 4 bits per offset.
For example, for a v16int32 output type, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v16float      Each lane holds the result of a floating-point selection between lanes of xbuff and ybuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v16float      Input buffer of 16 single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane.
xoffsets_hi   unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to output lane 8.
ybuff         v16float      Input buffer of 16 single-precision elements.
ystart        int           Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets      unsigned int  4-bit offset for each lane, applied to ybuff. The LSBs apply to the first lane.
yoffsets_hi   unsigned int  4-bit offset for each lane, applied to ybuff. The LSBs apply to output lane 8.
Note
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
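For the two-buffer form, a scalar sketch makes it easier to see how select interleaves data from xbuff and ybuff. As before, this is a hypothetical host-side model: fpselect16_xy_model is not part of the intrinsics API, and the index computation (start plus offset, wrapped to 16) is an assumed stand-in for f().

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16 = std::array<float, 16>;

// Hypothetical scalar model of the two-buffer fpselect16 pseudocode.
// ASSUMPTION: f(start, offset) = (start + offset) mod 16.
inline V16 fpselect16_xy_model(std::uint32_t select,
                               const V16& x, int xstart,
                               std::uint32_t xoffsets, std::uint32_t xoffsets_hi,
                               const V16& y, int ystart,
                               std::uint32_t yoffsets, std::uint32_t yoffsets_hi) {
    assert(xstart >= 0 && ystart >= 0);
    V16 o{};
    for (unsigned i = 0; i < 16; ++i) {
        // Lanes 0..7 use the low offset word, lanes 8..15 the _hi word.
        const unsigned shift = 4u * (i % 8u);
        const unsigned xoff = ((i < 8 ? xoffsets : xoffsets_hi) >> shift) & 0xFu;
        const unsigned yoff = ((i < 8 ? yoffsets : yoffsets_hi) >> shift) & 0xFu;
        const std::size_t idx = (static_cast<std::size_t>(xstart) + xoff) % 16u;
        const std::size_t idy = (static_cast<std::size_t>(ystart) + yoff) % 16u;
        const bool take_y = ((select >> i) & 1u) != 0;
        o[i] = take_y ? y[idy] : x[idx];
    }
    return o;
}
```

With identity offsets (0x76543210 and 0xFEDCBA98) for both inputs and select = 0xAAAA, this model keeps the even lanes of x and replaces the odd lanes with the corresponding lanes of y.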
v8cfloat fpselect8 ( unsigned int  select,
v16cfloat  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}
xoffsets and yoffsets hold 8 offset values each, with 4 bits per offset.
For example, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v8cfloat      Each lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v16cfloat     Input buffer of 16 complex single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane, applied to xbuff. The LSBs apply to the first lane.
ystart        int           Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane of the second input, applied to xbuff. The LSBs apply to the first lane.
Note
  • When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real lane and the imaginary lane (real+1) must be accounted for in the offsets. The same goes for the select parameter.
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
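The conversion mentioned in the first note, from complex-lane parameters to the parameters of a non-complex fpselect16, could look roughly like the sketch below. It assumes an interleaved layout in which complex element c occupies real lanes 2c (real part) and 2c+1 (imaginary part); the struct and function names are hypothetical and not part of the intrinsics API.

```cpp
#include <cassert>
#include <cstdint>

// Offsets for the 16 real lanes of a non-complex select or shuffle.
struct RealLaneOffsets {
    std::uint32_t offsets;     // real lanes 0..7
    std::uint32_t offsets_hi;  // real lanes 8..15
};

// Expands 8 complex-lane offsets (3 significant bits per nibble) into
// 16 real-lane offsets. ASSUMPTION: complex element c maps to real lanes
// 2c (real part) and 2c+1 (imaginary part).
inline RealLaneOffsets expand_complex_offsets(std::uint32_t coffsets) {
    std::uint64_t packed = 0;
    for (unsigned c = 0; c < 8; ++c) {
        const std::uint64_t off = (coffsets >> (4u * c)) & 0x7u;
        const std::uint64_t re = 2u * off;       // real lane offset
        const std::uint64_t im = 2u * off + 1u;  // imaginary lane offset
        assert(im <= 0xFu);                      // must still fit in 4 bits
        packed |= re << (8u * c);
        packed |= im << (8u * c + 4u);
    }
    return {static_cast<std::uint32_t>(packed & 0xFFFFFFFFu),
            static_cast<std::uint32_t>(packed >> 32)};
}

// Duplicates each complex select bit so that the real and the imaginary
// lane of one element are always taken from the same input.
inline std::uint32_t expand_complex_select(std::uint32_t cselect) {
    std::uint32_t out = 0;
    for (unsigned c = 0; c < 8; ++c) {
        if ((cselect >> c) & 1u) {
            out |= 0x3u << (2u * c);
        }
    }
    return out;
}
```

Under the same layout assumption, the start positions passed to the non-complex intrinsic would presumably also need to be expressed in real lanes, that is, doubled. Whether this route is actually faster than fpselect8 depends on how the offsets are produced, so it is worth measuring.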
v8cfloat fpselect8 ( unsigned int  select,
v8cfloat  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}
xoffsets and yoffsets hold 8 offset values each, with 4 bits per offset.
For example, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v8cfloat      Each lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v8cfloat      Input buffer of 8 complex single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane, applied to xbuff. The LSBs apply to the first lane.
ystart        int           Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane of the second input, applied to xbuff. The LSBs apply to the first lane.
Note
  • When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real lane and the imaginary lane (real+1) must be accounted for in the offsets. The same goes for the select parameter.
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
v8cfloat fpselect8 ( unsigned int  select,
v8cfloat  xbuff,
int  xstart,
unsigned int  xoffsets,
v8cfloat  ybuff,
int  ystart,
unsigned int  yoffsets 
)

Performs a floating point selection between lanes of xbuff and ybuff.

fpselect(a, b, s)
{
    if (s)
        return b;
    else
        return a;
}
for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], y[idy], select[i]);
}
xoffsets and yoffsets hold 8 offset values each, with 4 bits per offset.
For example, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v8cfloat      Each lane holds the result of a floating-point selection between lanes of xbuff and ybuff; the result for lane 0 goes to lane 0 of the output.
select        unsigned int  Each bit selects which of the two candidate values is placed in the corresponding output lane.
xbuff         v8cfloat      Input buffer of 8 complex single-precision elements.
xstart        int           Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane, applied to xbuff. The LSBs apply to the first lane.
ybuff         v8cfloat      Input buffer of 8 complex single-precision elements.
ystart        int           Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane of the second input, applied to ybuff. The LSBs apply to the first lane.
Note
  • When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real lane and the imaginary lane (real+1) must be accounted for in the offsets. The same goes for the select parameter.
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
v16float fpshuffle16 ( v32float  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi 
)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}
xoffsets and xoffsets_hi hold 8 offset values each, with 4 bits per offset.
For example, for a v16int32 output type, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v16float      Each lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff         v32float      Input buffer of 32 single-precision elements.
xstart        int           Starting position offset applied to all lanes of the input from xbuff.
xoffsets      unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane.
xoffsets_hi   unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to output lane 8.
Note
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
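Because each offset word packs eight 4-bit values with the first lane in the least significant bits, writing the words by hand is error-prone. The sketch below shows one way to build them from a per-lane list and to check the resulting permutation on a host. Both pack_offsets and fpshuffle16_model are hypothetical helpers, and the index computation (start plus offset, wrapped to the buffer size) is an assumed stand-in for f().

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packs eight 4-bit offsets into one 32-bit word; element 0 lands in the
// least significant nibble, matching "LSBs apply to the first lane".
inline std::uint32_t pack_offsets(const std::array<unsigned, 8>& nibbles) {
    std::uint32_t word = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        assert(nibbles[i] < 16u);  // each offset must fit in 4 bits
        word |= static_cast<std::uint32_t>(nibbles[i]) << (4u * i);
    }
    return word;
}

// Hypothetical scalar model of the fpshuffle16 pseudocode.
// ASSUMPTION: f(start, offset) = (start + offset) mod N.
template <std::size_t N>
std::array<float, 16> fpshuffle16_model(const std::array<float, N>& x,
                                        int xstart, std::uint32_t xoffsets,
                                        std::uint32_t xoffsets_hi) {
    assert(xstart >= 0);
    std::array<float, 16> o{};
    for (unsigned i = 0; i < 16; ++i) {
        const std::uint32_t word = (i < 8) ? xoffsets : xoffsets_hi;
        const unsigned off = (word >> (4u * (i % 8u))) & 0xFu;
        o[i] = x[(static_cast<std::size_t>(xstart) + off) % N];
    }
    return o;
}
```

For instance, packing the offsets 15 down to 8 into xoffsets and 7 down to 0 into xoffsets_hi produces the words 0x89ABCDEF and 0x01234567, which in this model reverse a 16-element buffer when xstart is 0.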
v16float fpshuffle16 ( v16float  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi 
)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}
xoffsets and xoffsets_hi hold 8 offset values each, with 4 bits per offset.
For example, for a v16int32 output type, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v16float      Each lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff         v16float      Input buffer of 16 single-precision elements.
xstart        int           Starting position offset applied to all lanes of the input from xbuff.
xoffsets      unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane.
xoffsets_hi   unsigned int  4-bit offset for each lane, applied to xbuff. The LSBs apply to output lane 8.
Note
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
v8cfloat fpshuffle8 ( v16cfloat  xbuff,
int  xstart,
unsigned int  xoffsets 
)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}
xoffsets holds 8 offset values, with 4 bits per offset.
For example, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v8cfloat      Each lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff         v16cfloat     Input buffer of 16 complex single-precision elements.
xstart        int           Starting position offset applied to all lanes of the input from xbuff.
xoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane, applied to xbuff. The LSBs apply to the first lane.
Note
  • When xoffsets is a runtime parameter, it might be more efficient to use a non-complex fpshuffle instruction and calculate the offsets accordingly. In that case, both the real lane and the imaginary lane (real+1) must be accounted for in the offsets.
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
v8cfloat fpshuffle8 ( v8cfloat  xbuff,
int  xstart,
unsigned int  xoffsets 
)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}
xoffsets holds 8 offset values, with 4 bits per offset.
For example, idx for output lane 0 = f(xstart, xoffsets[0]) and idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, one offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output  Type          Comments
return        v8cfloat      Each lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff         v8cfloat      Input buffer of 8 complex single-precision elements.
xstart        int           Starting position offset applied to all lanes of the input from xbuff.
xoffsets      unsigned int  3-bit offset (aligned to 4 bits) for each lane, applied to xbuff. The LSBs apply to the first lane.
Note
  • When xoffsets is a runtime parameter, it might be more efficient to use a non-complex fpshuffle instruction and calculate the offsets accordingly. In that case, both the real lane and the imaginary lane (real+1) must be accounted for in the offsets.
  • For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.