AI Engine Intrinsics User Guide  (AIE) v(2024.1)
 All Data Structures Namespaces Functions Variables Typedefs Groups Pages

Overview

Integer vector max/min.

Performs the comparison between lanes (selected using the scheme described in the Advanced Compare page) and returns the result of the comparison (either max or min) as a lane in the output vector.

Functions

v16int32 max16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a max comparison between lanes of xbuff.
 
v16int32 max16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a max comparison between lanes of xbuff.
 
v16int32 max16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a max comparison between lanes of xbuff and ybuff.
 
v32int16 max32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a max comparison between lanes of xbuff.
 
v32int16 max32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a max comparison between lanes of xbuff.
 
v32int16 max32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a max comparison between lanes of xbuff and ybuff.
 
v16int32 maxcmp16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff.
 
v16int32 maxcmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff.
 
v16int32 maxcmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff and ybuff.
 
v32int16 maxcmp32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff.
 
v32int16 maxcmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff.
 
v32int16 maxcmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a max comparison between lanes of xbuff and ybuff.
 
v16int32 min16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a min comparison between lanes of xbuff.
 
v16int32 min16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a min comparison between lanes of xbuff.
 
v16int32 min16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a min comparison between lanes of xbuff and ybuff.
 
v32int16 min32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a min comparison between lanes of xbuff.
 
v32int16 min32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a min comparison between lanes of xbuff.
 
v32int16 min32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a min comparison between lanes of xbuff and ybuff.
 
v16int32 mincmp16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff.
 
v16int32 mincmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff.
 
v16int32 mincmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff and ybuff.
 
v32int16 mincmp32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff.
 
v32int16 mincmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff.
 
v32int16 mincmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a min comparison between lanes of xbuff and ybuff.
 

Function Documentation

v16int32 max16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 max16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 max16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a max comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ybuffInput buffer of 16 elements with 32-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v32int16 max32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 64 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 max32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 max32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a max comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ybuffInput buffer of 32 elements with 16-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v16int32 maxcmp16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 maxcmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 maxcmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = max(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > y[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ybuffInput buffer of 16 elements with 32-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v32int16 maxcmp32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 64 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 maxcmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 maxcmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a max comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = max(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] > y[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a max comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ybuffInput buffer of 32 elements with 16-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v16int32 min16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 min16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 min16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a min comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ybuffInput buffer of 16 elements with 32-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 8th lane
Note
  • For more information on how the function f() selects data from the buffers go here.
v32int16 min32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 64 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 min32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 min32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a min comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ybuffInput buffer of 32 elements with 16-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v16int32 mincmp16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 mincmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v16int32 mincmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 16; i++)
idx = f( xstart, xoffsets[i]);
idy = f( ystart, yoffsets[i]);
o[i] = min(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= y[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 16 elements with 32-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 8th lane
ybuffInput buffer of 16 elements with 32-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 8th lane
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • For more information on how the function f() selects data from the buffers go here.
v32int16 mincmp32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 64 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 mincmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], x[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= x[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ystartStarting position offset applied to all lanes of input from xbuffer for the second input
yoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.
v32int16 mincmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a min comparison between lanes of xbuff and ybuff.

for (int i = 0; i < 32; i++)
idx = f( xstart, xoffsets[i],xsquare);
idy = f( ystart, yoffsets[i],ysquare);
o[i] = min(x[idx], y[idy])
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
cmp[i-th bit] = ( x[idx] <= y[idy] ? 1 : 0 );
Returns
Value of each lane is the result of a min comparison between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuffInput buffer of 32 elements with 16-bit precision
xstartStarting position offset applied to all lanes of input from X buffer
xoffsets4b offset for each lane, applied to the xbuffer. LSB apply to first lane
xoffsets_hi4b offset for each lane, applied to the xbuffer. LSB apply to 16th lane
xsquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs to be less than 4. max value for this field is (0x3333)
ybuffInput buffer of 32 elements with 16-bit precision
ystartStarting position offset applied to all lanes of input from ybuffer for the second input
yoffsets4b offset for each lane, applied to the ybuffer. LSB apply to first lane
yoffsets_hi4b offset for each lane, applied to the ybuffer. LSB apply to 16th lane
ysquareSelect order of the mini-permute square (default=0x3210). LSB apply to first element. Value per lane needs be less than 4. max value for this field is (0x3333)
cmp32bit value where each bit is the results of the comparison lane by lane, referred to the output lanes
Note
  • This intrinsic uses the 'square' parameter, to have more information on how to use this please go here
  • For more information on how the function f() selects data from the buffers go here.