AI Engine Intrinsics User Guide  (v2023.2)
Vector MaxDiff

Overview

Vector maxdiff.

Subtracts lanes of Y from lanes of X using integer arithmetic (the lanes are selected using the scheme described in the Advanced Compare page) and writes the maximum of zero and the subtraction result to the corresponding lane of the output vector.
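The per-lane operation described above can be sketched as a scalar reference model. This is only an illustration: the name maxdiff_ref is hypothetical and not part of the intrinsics API, and the widening to 64 bits is a choice made so that the sketch itself cannot overflow. It says nothing about how the hardware handles overflow.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar model of a single output lane: max(a - b, 0).
// The operands are widened to 64 bits before subtracting so that this
// sketch cannot overflow; hardware overflow behavior is not modeled.
inline std::int64_t maxdiff_ref(std::int32_t a, std::int32_t b) {
    const std::int64_t d = static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);
    return std::max<std::int64_t>(d, 0);
}
```

For instance, maxdiff_ref(7, 3) yields 4, while maxdiff_ref(3, 7) yields 0 because the negative difference is clamped to zero.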

Functions

v16int32 maxdiff16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a maximum difference computation between lanes of xbuff.
 
v16int32 maxdiff16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a maximum difference computation between lanes of xbuff.
 
v16int32 maxdiff16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
 Performs a maximum difference computation between lanes of xbuff and ybuff.
 
v32int16 maxdiff32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a maximum difference computation between lanes of xbuff.
 
v32int16 maxdiff32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a maximum difference computation between lanes of xbuff.
 
v32int16 maxdiff32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare)
 Performs a maximum difference computation between lanes of xbuff and ybuff.
 
v16int32 maxdiffcmp16 (v32int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff.
 
v16int32 maxdiffcmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff.
 
v16int32 maxdiffcmp16 (v16int32 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16int32 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff and ybuff.
 
v32int16 maxdiffcmp32 (v64int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff.
 
v32int16 maxdiffcmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff.
 
v32int16 maxdiffcmp32 (v32int16 xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, unsigned int xsquare, v32int16 ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi, unsigned int ysquare, unsigned int &cmp)
 Performs a maximum difference computation between lanes of xbuff and ybuff.
 

Function Documentation

v16int32 maxdiff16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a maximum difference computation between lanes of xbuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], x[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 8.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data selected from xbuff with 'xstart' and 'xoffsets(_hi)' is the left operand; data selected from xbuff with 'ystart' and 'yoffsets(_hi)' is the right operand.
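The offset packing described above (eight 4-bit offsets per word, least significant nibble first) can be sketched in plain C++. The names offset_for_lane and maxdiff16_model are hypothetical. The selection function used here, f(start, offset) = (start + offset) mod N, is an assumption made for illustration only; the authoritative definition of f() is the lane selection scheme referenced in the note above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: 4-bit offset for output lane 0..15.
// Lanes 0-7 read `offsets`, lanes 8-15 read `offsets_hi`; the least
// significant nibble of each word belongs to its lowest lane.
inline unsigned offset_for_lane(std::uint32_t offsets, std::uint32_t offsets_hi, int lane) {
    const std::uint32_t word = (lane < 8) ? offsets : offsets_hi;
    return (word >> (4 * (lane % 8))) & 0xFu;
}

// Hypothetical scalar model of the single-buffer, 16-lane form.
// ASSUMPTION: f(start, offset) = (start + offset) mod N, with start >= 0.
// Results are assumed to fit in 32 bits; overflow is not modeled.
template <std::size_t N>
std::array<std::int32_t, 16> maxdiff16_model(const std::array<std::int32_t, N>& x,
                                             int xstart, std::uint32_t xoffsets, std::uint32_t xoffsets_hi,
                                             int ystart, std::uint32_t yoffsets, std::uint32_t yoffsets_hi) {
    std::array<std::int32_t, 16> o{};
    for (int i = 0; i < 16; ++i) {
        const std::size_t idx =
            (static_cast<std::size_t>(xstart) + offset_for_lane(xoffsets, xoffsets_hi, i)) % N;
        const std::size_t idy =
            (static_cast<std::size_t>(ystart) + offset_for_lane(yoffsets, yoffsets_hi, i)) % N;
        const std::int64_t d = static_cast<std::int64_t>(x[idx]) - static_cast<std::int64_t>(x[idy]);
        o[i] = static_cast<std::int32_t>(d > 0 ? d : 0);  // left operand minus right, clamped at 0
    }
    return o;
}
```

Under that assumption, xoffsets = 0x76543210 with xoffsets_hi = 0xFEDCBA98 selects elements 0 to 15 in order, and setting both y offset words to 0 compares every lane against the single element at ystart.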
v16int32 maxdiff16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a maximum difference computation between lanes of xbuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], x[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 16 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 8.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data selected from xbuff with 'xstart' and 'xoffsets(_hi)' is the left operand; data selected from xbuff with 'ystart' and 'yoffsets(_hi)' is the right operand.
v16int32 maxdiff16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi 
)

Performs a maximum difference computation between lanes of xbuff and ybuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], y[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff and ybuff, where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 16 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of input from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ybuff: Input buffer of 16 elements with 32-bit precision.
ystart: Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets: 4-bit offset for each lane, applied to ybuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane, applied to ybuff. The LSBs apply to lane 8.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data from xbuff is the left operand and data from ybuff is the right operand.
v32int16 maxdiff32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a maximum difference computation between lanes of xbuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], x[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 64 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data selected from xbuff with 'xstart' and 'xoffsets(_hi)' is the left operand; data selected from xbuff with 'ystart' and 'yoffsets(_hi)' is the right operand.
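The constraints on the 'square' parameters stated above (four 4-bit entries, least significant first, each less than 4, maximum field value 0x3333) can be checked with a small sketch. The helper names is_valid_square and decode_square are hypothetical; the sketch only validates and unpacks the field and does not model how the mini-permute square reorders data.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: true when every 4-bit entry is less than 4 and no
// bits above the low 16 are set, i.e. the value never exceeds 0x3333
// in any nibble.
inline bool is_valid_square(std::uint32_t square) {
    return (square & ~0x3333u) == 0u;
}

// Hypothetical helper: split the field into its four entries, with the
// least significant nibble first (it applies to the first element).
inline std::array<unsigned, 4> decode_square(std::uint32_t square) {
    std::array<unsigned, 4> entries{};
    for (int k = 0; k < 4; ++k) {
        entries[k] = (square >> (4 * k)) & 0xFu;
    }
    return entries;
}
```

With the default value 0x3210, decode_square returns the entries 0, 1, 2, 3 in that order; a value such as 0x4210 is rejected by is_valid_square because one entry is not less than 4.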
v32int16 maxdiff32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a maximum difference computation between lanes of xbuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], x[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data selected from xbuff with 'xstart' and 'xoffsets(_hi)' is the left operand; data selected from xbuff with 'ystart' and 'yoffsets(_hi)' is the right operand.
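For the 32-lane v32int16 forms, the text above states that one offset serves two adjacent lanes and that the low bits of xoffsets_hi apply to lane 16. One reading of that layout is sketched below. The helper name offset_for_lane32 is hypothetical, and the sketch only shows which 4-bit field a lane reads; how f() combines that field with the start value and the square is not modeled here.

```cpp
#include <cstdint>

// Hypothetical helper: 4-bit offset field read by output lane 0..31 of a
// v32int16 result. ASSUMPTION (one reading of the parameter text): lanes
// 0-15 use `offsets`, lanes 16-31 use `offsets_hi`, and each nibble is
// shared by two adjacent lanes, least significant nibble first.
inline unsigned offset_for_lane32(std::uint32_t offsets, std::uint32_t offsets_hi, int lane) {
    const std::uint32_t word = (lane < 16) ? offsets : offsets_hi;
    const int nibble = (lane % 16) / 2;  // two adjacent lanes per offset
    return (word >> (4 * nibble)) & 0xFu;
}
```

Under this reading, lanes 0 and 1 both read the lowest nibble of xoffsets, and lanes 30 and 31 both read the highest nibble of xoffsets_hi.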
v32int16 maxdiff32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare 
)

Performs a maximum difference computation between lanes of xbuff and ybuff.

maxdiff(a, b)
{
    d = a - b;
    return max(d, 0);
}
for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], y[idy]);
}
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff and ybuff, where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of input from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ybuff: Input buffer of 32 elements with 16-bit precision.
ystart: Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets: 4-bit offset for each lane, applied to ybuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane, applied to ybuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
  • Data from xbuff is the left operand and data from ybuff is the right operand.
v16int32 maxdiffcmp16 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], x[idy]);
    cmp[i-th bit] = (x[idx] > x[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 8.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
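The relationship between the output lanes and the cmp bit mask can be sketched with a scalar model that takes the two operands after lane selection has already been applied. The names MaxDiffCmp16 and maxdiffcmp16_model are hypothetical and not part of the intrinsics API; lane selection through f() is deliberately left out, and results are assumed to fit in 32 bits.

```cpp
#include <array>
#include <cstdint>

// Hypothetical result type: the 16 output lanes plus the comparison mask.
struct MaxDiffCmp16 {
    std::array<std::int32_t, 16> out;
    std::uint32_t cmp;
};

// Hypothetical scalar model operating on already-selected operands.
// Bit i of cmp is 1 when lhs[i] > rhs[i], matching the pseudocode above.
inline MaxDiffCmp16 maxdiffcmp16_model(const std::array<std::int32_t, 16>& lhs,
                                       const std::array<std::int32_t, 16>& rhs) {
    MaxDiffCmp16 r{};
    for (int i = 0; i < 16; ++i) {
        const std::int64_t d = static_cast<std::int64_t>(lhs[i]) - static_cast<std::int64_t>(rhs[i]);
        r.out[i] = static_cast<std::int32_t>(d > 0 ? d : 0);  // overflow not modeled
        if (lhs[i] > rhs[i]) {
            r.cmp |= (1u << i);
        }
    }
    return r;
}
```

A set bit in cmp therefore marks exactly the lanes whose output is a strictly positive difference; lanes where the operands are equal produce 0 in both the output and the mask.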
v16int32 maxdiffcmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], x[idy]);
    cmp[i-th bit] = (x[idx] > x[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 16 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 8.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
v16int32 maxdiffcmp16 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff and ybuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = maxdiff(x[idx], y[idy]);
    cmp[i-th bit] = (x[idx] > y[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0])
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7])
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff and ybuff, where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 16 elements with 32-bit precision.
xstart: Starting position offset applied to all lanes of input from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 8.
ybuff: Input buffer of 16 elements with 32-bit precision.
ystart: Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets: 4-bit offset for each lane, applied to ybuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane, applied to ybuff. The LSBs apply to lane 8.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
v32int16 maxdiffcmp32 ( v64int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff.

for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], x[idy]);
    cmp[i-th bit] = (x[idx] > x[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 64 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
v32int16 maxdiffcmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff.

for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], x[idy]);
    cmp[i-th bit] = (x[idx] > x[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of the first input, read from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ystart: Starting position offset applied to all lanes of the second input, which is also read from xbuff.
yoffsets: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane of the second input, applied to xbuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.
v32int16 maxdiffcmp32 ( v32int16  xbuff,
int  xstart,
unsigned int  xoffsets,
unsigned int  xoffsets_hi,
unsigned int  xsquare,
v32int16  ybuff,
int  ystart,
unsigned int  yoffsets,
unsigned int  yoffsets_hi,
unsigned int  ysquare,
unsigned int &  cmp 
)

Performs a maximum difference computation between lanes of xbuff and ybuff.

for (int i = 0; i < 32; i++) {
    idx = f(xstart, xoffsets[i], xsquare);
    idy = f(ystart, yoffsets[i], ysquare);
    o[i] = maxdiff(x[idx], y[idy]);
    cmp[i-th bit] = (x[idx] > y[idy] ? 1 : 0);
}
Here maxdiff(a, b) returns max(a - b, 0).
xoffsets, xoffsets_hi, yoffsets, yoffsets_hi have 8 offset values each. 4 bits per offset.
For Example: for v16int32 output type, idx for output_lane_0 = f(xstart,xoffsets[0],xsquare)
For Example: for v16int32 output type, idx for output_lane_15 = f(xstart,xoffsets_hi[7],xsquare)
In case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers refer to Lane selection note below.
Returns
Value of each lane is the result of a maximum difference computation between lanes of xbuff and ybuff, where the result of lane 0 goes to lane 0 of the output.
Parameters
xbuff: Input buffer of 32 elements with 16-bit precision.
xstart: Starting position offset applied to all lanes of input from xbuff.
xoffsets: 4-bit offset for each lane, applied to xbuff. The LSBs apply to the first lane (lane 0).
xoffsets_hi: 4-bit offset for each lane, applied to xbuff. The LSBs apply to lane 16.
xsquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
ybuff: Input buffer of 32 elements with 16-bit precision.
ystart: Starting position offset applied to all lanes of the second input, read from ybuff.
yoffsets: 4-bit offset for each lane, applied to ybuff. The LSBs apply to the first lane (lane 0).
yoffsets_hi: 4-bit offset for each lane, applied to ybuff. The LSBs apply to lane 16.
ysquare: Selects the order of the mini-permute square (default = 0x3210). The LSBs apply to the first element. Each 4-bit value must be less than 4, so the maximum value for this field is 0x3333.
cmp: 32-bit value in which each bit holds the result of the comparison for the corresponding output lane.
Note
  • This intrinsic uses the 'square' parameters (xsquare and ysquare); see the lane selection documentation for how to use them.
  • For more information on how the function f() selects data from the buffers, see the lane selection scheme described in the Advanced Compare page.