AI Engine Intrinsics User Guide  (v2023.2)
32-bit Real x 32-bit Real

Overview

32-bit Real self multiplication intrinsics.

Functions

v4acc80 lmac4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function.
 
v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v8acc80 lmac8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function.
 
v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v4acc80 lmsc4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function.
 
v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function using small X input buffer.
 
v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function using small X input buffer.
 
v8acc80 lmsc8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function.
 
v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function using small X input buffer.
 
v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function using small X input buffer.
 
v4acc80 lmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function.
 
v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function using small X input buffer.
 
v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function using small X input buffer.
 
v8acc80 lmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply intrinsic function.
 
v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply intrinsic function using small X input buffer.
 
v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply intrinsic function using small X input buffer.
 
v4acc80 lnegmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function.
 
v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function using small X input buffer.
 
v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function using small X input buffer.
 
v8acc80 lnegmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function.
 
v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function using small X input buffer.
 
v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function using small X input buffer.
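
The four operation families listed above differ only in how each lane's sum of products is combined with the accumulator. A minimal single-lane sketch of that relationship follows. `Op` and `update_lane` are hypothetical names used for illustration and are not part of the intrinsics API, and `int64_t` stands in for an 80-bit accumulator lane, which is adequate only for small values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not part of the intrinsics API.
enum class Op { Mul, NegMul, Mac, Msc };  // lmul*, lnegmul*, lmac*, lmsc*

// Combine one lane's sum of products with its incoming accumulator value.
// int64_t stands in for an 80-bit accumulator lane (small values only).
std::int64_t update_lane(Op op, std::int64_t acc, std::int64_t products) {
    switch (op) {
        case Op::Mul:    return products;        // acc  =  sum of products
        case Op::NegMul: return -products;       // acc  = -sum of products
        case Op::Mac:    return acc + products;  // acc +=  sum of products
        case Op::Msc:    return acc - products;  // acc -=  sum of products
    }
    assert(false && "unknown Op");
    return acc;
}
```

The multiply and multiply-negate forms ignore the incoming value, which matches their signatures: they take no `acc` argument.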
 

Function Documentation

v4acc80 lmac4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-accumulate intrinsic function.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.
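
The lane equations above can be read as a plain scalar loop. The following is a minimal reference sketch, not the intrinsic itself: `lmac4_model` is a hypothetical name, the operands are assumed to be already selected from the buffer (`x[lane][col]`, `y[lane][col]`), and `int64_t` stands in for the 80-bit accumulator lanes, which is adequate only for small illustrative values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the lmac4 lane equations.
// int64_t stands in for an 80-bit accumulator lane (assumption for illustration).
using Acc4 = std::array<std::int64_t, 4>;
using Ops4x2 = std::array<std::array<std::int32_t, 2>, 4>;  // [lane][column]

Acc4 lmac4_model(Acc4 acc, const Ops4x2& x, const Ops4x2& y) {
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t col = 0; col < 2; ++col) {
            // Widen before multiplying so the product is not truncated to 32 bits.
            acc[lane] += static_cast<std::int64_t>(x[lane][col]) * y[lane][col];
        }
    }
    return acc;
}
```

Each lane sums two products, one per column, before adding the result to the incoming accumulator value.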

v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.
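
The `xoffsets` and `yoffsets` words pack one 4-bit offset per lane, with the least significant bits belonging to the first lane. A minimal sketch of unpacking such a word is shown below. `lane_offset` is a hypothetical helper name and is not part of the intrinsics API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract the 4-bit offset of one lane from a packed
// offsets word. Lane 0 uses bits [3:0], lane 1 uses bits [7:4], and so on.
unsigned lane_offset(std::uint32_t offsets, unsigned lane) {
    assert(lane < 8);  // a 32-bit word holds at most eight 4-bit fields
    return (offsets >> (4u * lane)) & 0xFu;
}
```

With this packing, `0x3210` gives lanes 0 to 3 the offsets 0, 1, 2 and 3.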

v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection in the Y buffer
Note
  • For more information on how data selection works from the buffers go here.
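
To make the roles of the start, offsets and step parameters concrete, here is a hedged scalar sketch of the two-buffer form. It rests on an assumption that this page does not state: that the element used by a lane in a given column is `start + lane offset + column * step`, wrapped to the buffer size. The data selection page referenced in the note is authoritative. `select_index` and `lmac4_xy_model` are hypothetical names, and `int64_t` stands in for the 80-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Buf16 = std::array<std::int32_t, 16>;  // stand-in for v16int32

// ASSUMPTION (not stated on this page): index = start + 4-bit lane offset
// + column * step, wrapped to the buffer size.
std::size_t select_index(int start, std::uint32_t offsets, int step,
                         unsigned lane, unsigned col, std::size_t size) {
    assert(lane < 8 && size > 0);
    const int nibble = static_cast<int>((offsets >> (4u * lane)) & 0xFu);
    const int raw = start + nibble + static_cast<int>(col) * step;
    const int n = static_cast<int>(size);
    return static_cast<std::size_t>(((raw % n) + n) % n);  // non-negative wrap
}

// Hypothetical scalar model of lmac4 with separate X and Y buffers.
std::array<std::int64_t, 4> lmac4_xy_model(
        std::array<std::int64_t, 4> acc,
        const Buf16& xbuff, int xstart, std::uint32_t xoffsets, int xstep,
        const Buf16& ybuff, int ystart, std::uint32_t yoffsets, int ystep) {
    for (unsigned lane = 0; lane < 4; ++lane) {
        for (unsigned col = 0; col < 2; ++col) {
            const std::int64_t x =
                xbuff[select_index(xstart, xoffsets, xstep, lane, col, xbuff.size())];
            const std::int64_t y =
                ybuff[select_index(ystart, yoffsets, ystep, lane, col, ybuff.size())];
            acc[lane] += x * y;
        }
    }
    return acc;
}
```

Under that assumption, `xoffsets = 0x3210` with `xstep = 4` makes lane 0 read X elements 0 and 4, lane 1 read elements 1 and 5, and so on.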

v8acc80 lmac8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-accumulate intrinsic function.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.
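
The 8-lane form uses a single product per lane, and its result is normally fed back in as the `acc` argument of the next call. A minimal scalar sketch of that pattern follows, assuming the operands are already selected per lane. `lmac8_model` is a hypothetical name, and `int64_t` stands in for the 80-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Acc8 = std::array<std::int64_t, 8>;  // stand-in for v8acc80
using Ops8 = std::array<std::int32_t, 8>;  // one pre-selected operand per lane

// Hypothetical scalar model of the lmac8 lane equations (one product per lane).
Acc8 lmac8_model(Acc8 acc, const Ops8& x, const Ops8& y) {
    for (std::size_t lane = 0; lane < acc.size(); ++lane) {
        // Widen before multiplying so the product keeps all of its bits.
        acc[lane] += static_cast<std::int64_t>(x[lane]) * y[lane];
    }
    return acc;
}
```

Chaining works by passing the returned vector into the next call, for example `acc = lmac8_model(lmac8_model(Acc8{}, x0, y0), x1, y1);`.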

v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lmsc4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-subtract intrinsic function.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.
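
Multiply-subtract removes the same sum of products that multiply-accumulate adds, so applying one after the other with identical operands returns the original accumulator. The sketch below illustrates this with hypothetical scalar models (`accumulate4`, `lmac4_model`, `lmsc4_model`), pre-selected operands, and `int64_t` standing in for the 80-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Acc4 = std::array<std::int64_t, 4>;                   // stand-in for v4acc80
using Ops4x2 = std::array<std::array<std::int32_t, 2>, 4>;  // [lane][column]

// Hypothetical shared kernel: sign = +1 models lmac4, sign = -1 models lmsc4.
Acc4 accumulate4(Acc4 acc, const Ops4x2& x, const Ops4x2& y, int sign) {
    assert(sign == 1 || sign == -1);
    for (std::size_t lane = 0; lane < 4; ++lane) {
        std::int64_t sum = 0;
        for (std::size_t col = 0; col < 2; ++col) {
            sum += static_cast<std::int64_t>(x[lane][col]) * y[lane][col];
        }
        acc[lane] += sign * sum;
    }
    return acc;
}

Acc4 lmac4_model(Acc4 acc, const Ops4x2& x, const Ops4x2& y) {
    return accumulate4(acc, x, y, +1);
}

Acc4 lmsc4_model(Acc4 acc, const Ops4x2& x, const Ops4x2& y) {
    return accumulate4(acc, x, y, -1);
}
```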

v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (4 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection in the Y buffer
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmsc8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-subtract intrinsic function.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    acc       Incoming accumulation vector (8 x int80 lanes)
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply intrinsic function.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection in the Y buffer
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply intrinsic function.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.
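
The result lanes are 80 bits wide, while a single int32 x int32 product has a magnitude of at most 2^62. The short calculation below sketches how much headroom that leaves for later accumulation, assuming a signed two's complement 80-bit lane and ignoring any saturation or rounding behavior, which this page does not describe. All constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest magnitude of an int32 x int32 product: (-2^31) * (-2^31) = 2^62.
constexpr std::int64_t kMaxProduct =
    static_cast<std::int64_t>(std::numeric_limits<std::int32_t>::min()) *
    static_cast<std::int64_t>(std::numeric_limits<std::int32_t>::min());

constexpr int kAccBits = 80;      // lane width implied by "int80" (assumed signed)
constexpr int kProductLog2 = 62;  // kMaxProduct == 2^62

// Number of worst-case products that fit in a signed 80-bit lane:
// floor((2^79 - 1) / 2^62) = 2^17 - 1.
constexpr std::int64_t kSafeWorstCaseTerms =
    (std::int64_t{1} << (kAccBits - 1 - kProductLog2)) - 1;

static_assert(kMaxProduct == (std::int64_t{1} << kProductLog2),
              "worst-case product is 2^62");
```

Two worst-case products already sum to 2^63, which does not fit in a signed 64-bit integer, so a 64-bit stand-in for the accumulator is only suitable for small test values.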

v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lnegmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-negate intrinsic function.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.
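
Per the equations above, multiply-negate produces the lane-wise negation of what the plain multiply would return for the same operands. A minimal scalar sketch of that relationship follows, using hypothetical names (`lmul4_model`, `lnegmul4_model`), pre-selected operands, and `int64_t` in place of the 80-bit lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Acc4 = std::array<std::int64_t, 4>;                   // stand-in for v4acc80
using Ops4x2 = std::array<std::array<std::int32_t, 2>, 4>;  // [lane][column]

// Hypothetical scalar model of lmul4: each lane is the sum of two products.
Acc4 lmul4_model(const Ops4x2& x, const Ops4x2& y) {
    Acc4 acc{};
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t col = 0; col < 2; ++col) {
            acc[lane] += static_cast<std::int64_t>(x[lane][col]) * y[lane][col];
        }
    }
    return acc;
}

// Hypothetical scalar model of lnegmul4: the negated sum of products per lane.
Acc4 lnegmul4_model(const Ops4x2& x, const Ops4x2& y) {
    Acc4 acc = lmul4_model(x, y);
    for (std::int64_t& lane : acc) {
        lane = -lane;
    }
    return acc;
}
```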

v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data selection works from the buffers go here.

v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )
Returns
    Returned accumulation vector (4 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    xstep     Step between each column for selection in the X buffer
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
    ystep     Step between each column for selection in the Y buffer
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lnegmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-negate intrinsic function.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 32 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ystart    Starting position offset applied to all lanes of the second input, which is also read from the X buffer
    yoffsets  4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.

v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )
Returns
    Returned accumulation vector (8 x int80 lanes)
Parameters
    xbuff     Input buffer of 16 elements of type int32
    xstart    Starting position offset applied to all lanes of input from the X buffer
    xoffsets  4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
    ybuff     Right input buffer of 16 elements of type int32
    ystart    Starting position offset applied to all lanes of the second input, read from the Y buffer
    yoffsets  4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data selection works from the buffers go here.