AI Engine Intrinsics User Guide  (AIE) r2p22
32-bit Real x 32-bit Real

Overview

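Multiplication intrinsics for 32-bit real operands. The overloads without a ybuff argument select both operands from xbuff (self-multiplication); the overloads with a ybuff argument take the second operand from that separate buffer.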

Functions

v4acc80 lmac4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function.
 
v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v4acc80 lmac4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v8acc80 lmac8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function.
 
v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v8acc80 lmac8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-accumulate intrinsic function using small X input buffer.
 
v4acc80 lmsc4 (v4acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function.
 
v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function using small X input buffer.
 
v4acc80 lmsc4 (v4acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-subtract intrinsic function using small X input buffer.
 
v8acc80 lmsc8 (v8acc80 acc, v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function.
 
v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function using small X input buffer.
 
v8acc80 lmsc8 (v8acc80 acc, v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-subtract intrinsic function using small X input buffer.
 
v4acc80 lmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function.
 
v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function using small X input buffer.
 
v4acc80 lmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply intrinsic function using small X input buffer.
 
v8acc80 lmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply intrinsic function.
 
v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply intrinsic function using small X input buffer.
 
v8acc80 lmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply intrinsic function using small X input buffer.
 
v4acc80 lnegmul4 (v32int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function.
 
v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function using small X input buffer.
 
v4acc80 lnegmul4 (v16int32 xbuff, int xstart, unsigned int xoffsets, int xstep, v16int32 ybuff, int ystart, unsigned int yoffsets, int ystep)
 Multiply-negate intrinsic function using small X input buffer.
 
v8acc80 lnegmul8 (v32int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function.
 
v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function using small X input buffer.
 
v8acc80 lnegmul8 (v16int32 xbuff, int xstart, unsigned int xoffsets, v16int32 ybuff, int ystart, unsigned int yoffsets)
 Multiply-negate intrinsic function using small X input buffer.
 

Function Documentation

v4acc80 lmac4 ( v4acc80  acc,
v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-accumulate intrinsic function.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
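The lane arithmetic listed for lmac4 can be checked against a scalar model. The following is a minimal sketch in plain C++, not AIE code: `ref_lmac4`, `Acc4`, and `Operands4x2` are hypothetical names introduced here, the operands are assumed to be already selected for each lane and column (the xstart/xoffsets/xstep selection is not modeled), and `int64_t` stands in for the 80-bit accumulator lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the lmac4 lane arithmetic (not an AIE API).
// x[i][j] and y[i][j] are the operands already selected for lane i, column j.
// ASSUMPTION: int64_t replaces the 80-bit accumulator lane, so this model can
// overflow for extreme inputs where an 80-bit accumulator would not.
using Acc4 = std::array<std::int64_t, 4>;
using Operands4x2 = std::array<std::array<std::int32_t, 2>, 4>;

Acc4 ref_lmac4(Acc4 acc, const Operands4x2& x, const Operands4x2& y) {
    for (int lane = 0; lane < 4; ++lane) {
        for (int col = 0; col < 2; ++col) {
            // Widen before multiplying so the 32x32 product is not truncated.
            acc[lane] += static_cast<std::int64_t>(x[lane][col]) * y[lane][col];
        }
    }
    return acc;
}
```

A model like this is mainly useful as a golden reference in a host-side test bench, where its output can be compared with the values read back from the accumulator.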
v4acc80 lmac4 ( v4acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmac4 ( v4acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00 + x01*y01
acc1 += x10*y10 + x11*y11
acc2 += x20*y20 + x21*y21
acc3 += x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
ystep int Step between each column for selection in the Y buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmac8 ( v8acc80  acc,
v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-accumulate intrinsic function.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
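The xoffsets and yoffsets words pack one 4-bit offset per lane, with the least significant bits belonging to the first lane. The sketch below shows one way to unpack such a word on the host. `lane_offset` is a hypothetical helper, not part of the intrinsics API; it assumes that lane i uses bits 4i to 4i+3, which is how the parameter description reads, and that the word is 32 bits wide (enough for eight lanes). How the unpacked offset is combined with xstart is not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an AIE API): extract the 4-bit offset for one lane
// from a packed offsets word. Lane 0 uses the least significant nibble.
// ASSUMPTION: lane must be in the range 0..7 for a 32-bit offsets word.
inline unsigned lane_offset(std::uint32_t offsets, unsigned lane) {
    return (offsets >> (4u * lane)) & 0xFu;
}
```

With this convention, a word such as 0x76543210 gives lane 0 the offset 0, lane 1 the offset 1, and so on up to lane 7 with the offset 7.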
v8acc80 lmac8 ( v8acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmac8 ( v8acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets 
)

Multiply-accumulate intrinsic function using small X input buffer.

acc0 += x00*y00
acc1 += x10*y10
acc2 += x20*y20
acc3 += x30*y30
acc4 += x40*y40
acc5 += x50*y50
acc6 += x60*y60
acc7 += x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmsc4 ( v4acc80  acc,
v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-subtract intrinsic function.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
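The multiply-subtract arithmetic differs from multiply-accumulate only in the sign applied to the sum of products. A minimal scalar sketch, again in plain C++ rather than AIE code, is given below. `ref_lmsc4`, `Acc4`, and `Operands4x2` are hypothetical names; operands are assumed to be already selected per lane and column, and `int64_t` stands in for the 80-bit accumulator lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the lmsc4 lane arithmetic (not an AIE API).
// ASSUMPTION: int64_t replaces the 80-bit accumulator lane; data selection
// from the buffers is not modeled.
using Acc4 = std::array<std::int64_t, 4>;
using Operands4x2 = std::array<std::array<std::int32_t, 2>, 4>;

Acc4 ref_lmsc4(Acc4 acc, const Operands4x2& x, const Operands4x2& y) {
    for (int lane = 0; lane < 4; ++lane) {
        std::int64_t sum = 0;
        for (int col = 0; col < 2; ++col) {
            // Widen before multiplying so the 32x32 product is not truncated.
            sum += static_cast<std::int64_t>(x[lane][col]) * y[lane][col];
        }
        acc[lane] -= sum;  // acc_i -= x_i0*y_i0 + x_i1*y_i1
    }
    return acc;
}
```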
v4acc80 lmsc4 ( v4acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmsc4 ( v4acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00 + x01*y01
acc1 -= x10*y10 + x11*y11
acc2 -= x20*y20 + x21*y21
acc3 -= x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
acc v4acc80 Incoming accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
ystep int Step between each column for selection in the Y buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmsc8 ( v8acc80  acc,
v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-subtract intrinsic function.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmsc8 ( v8acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmsc8 ( v8acc80  acc,
v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets 
)

Multiply-subtract intrinsic function using small X input buffer.

acc0 -= x00*y00
acc1 -= x10*y10
acc2 -= x20*y20
acc3 -= x30*y30
acc4 -= x40*y40
acc5 -= x50*y50
acc6 -= x60*y60
acc7 -= x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
acc v8acc80 Incoming accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmul4 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply intrinsic function.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmul4 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lmul4 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00 + x01*y01
acc1 = x10*y10 + x11*y11
acc2 = x20*y20 + x21*y21
acc3 = x30*y30 + x31*y31

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
ystep int Step between each column for selection in the Y buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmul8 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply intrinsic function.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
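The eight-lane multiply produces a single product per lane, so its arithmetic is straightforward to model on the host. The sketch below is plain C++, not AIE code. `ref_lmul8`, `Acc8`, and `Lanes8` are hypothetical names; x[i] and y[i] are assumed to be the operands already selected for lane i, and `int64_t` stands in for the 80-bit accumulator lane. A single product of two int32 values always fits in 64 bits, so this particular model does not overflow.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the lmul8 lane arithmetic (not an AIE API).
// ASSUMPTION: int64_t replaces the 80-bit accumulator lane; data selection
// from the buffers is not modeled.
using Acc8 = std::array<std::int64_t, 8>;
using Lanes8 = std::array<std::int32_t, 8>;

Acc8 ref_lmul8(const Lanes8& x, const Lanes8& y) {
    Acc8 acc{};  // every lane starts at zero
    for (int lane = 0; lane < 8; ++lane) {
        // Widen before multiplying so the 32x32 product is not truncated.
        acc[lane] = static_cast<std::int64_t>(x[lane]) * y[lane];
    }
    return acc;
}
```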
v8acc80 lmul8 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lmul8 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets 
)

Multiply intrinsic function using small X input buffer.

acc0 = x00*y00
acc1 = x10*y10
acc2 = x20*y20
acc3 = x30*y30
acc4 = x40*y40
acc5 = x50*y50
acc6 = x60*y60
acc7 = x70*y70

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lnegmul4 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-negate intrinsic function.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
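Going by the formulas on this page, multiply-negate yields the same lane values as multiply-subtract applied to a zero accumulator. The sketch below illustrates that relationship with two scalar models in plain C++. `ref_lnegmul4` and `ref_lmsc4_cmp` are hypothetical names, not AIE APIs; operands are assumed to be already selected per lane and column, and `int64_t` stands in for the 80-bit accumulator lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// ASSUMPTION: int64_t replaces the 80-bit accumulator lane; data selection
// from the buffers is not modeled.
using Acc4 = std::array<std::int64_t, 4>;
using Operands4x2 = std::array<std::array<std::int32_t, 2>, 4>;

// Sum of the two column products for one lane, widened to 64 bits.
static std::int64_t lane_sum(const Operands4x2& x, const Operands4x2& y, int lane) {
    return static_cast<std::int64_t>(x[lane][0]) * y[lane][0] +
           static_cast<std::int64_t>(x[lane][1]) * y[lane][1];
}

// Hypothetical model of multiply-negate: acc_i = -(x_i0*y_i0 + x_i1*y_i1).
Acc4 ref_lnegmul4(const Operands4x2& x, const Operands4x2& y) {
    Acc4 acc{};
    for (int lane = 0; lane < 4; ++lane) {
        acc[lane] = -lane_sum(x, y, lane);
    }
    return acc;
}

// Hypothetical model of multiply-subtract, included only for comparison.
Acc4 ref_lmsc4_cmp(Acc4 acc, const Operands4x2& x, const Operands4x2& y) {
    for (int lane = 0; lane < 4; ++lane) {
        acc[lane] -= lane_sum(x, y, lane);
    }
    return acc;
}
```

Calling `ref_lmsc4_cmp` with a zero-initialized accumulator returns the same lanes as `ref_lnegmul4` for the same operands.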
v4acc80 lnegmul4 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
ystep int Step between each column for selection of the second input in the X buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v4acc80 lnegmul4 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  xstep,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets,
int  ystep 
)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 + x01*y01 )
acc1 = -( x10*y10 + x11*y11 )
acc2 = -( x20*y20 + x21*y21 )
acc3 = -( x30*y30 + x31*y31 )

Parameters

Input/Output Type Comments
return v4acc80 Returned accumulation vector (4 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
xstep int Step between each column for selection in the X buffer
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
ystep int Step between each column for selection in the Y buffer
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lnegmul8 ( v32int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-negate intrinsic function.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v32int32 Input buffer of 32 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lnegmul8 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
int  ystart,
unsigned int  yoffsets 
)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ystart int Starting position offset applied to all lanes of the second input, also taken from the X buffer
yoffsets unsigned int 4-bit offset for each lane of the second input in the X buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.
v8acc80 lnegmul8 ( v16int32  xbuff,
int  xstart,
unsigned int  xoffsets,
v16int32  ybuff,
int  ystart,
unsigned int  yoffsets 
)

Multiply-negate intrinsic function using small X input buffer.

acc0 = -( x00*y00 )
acc1 = -( x10*y10 )
acc2 = -( x20*y20 )
acc3 = -( x30*y30 )
acc4 = -( x40*y40 )
acc5 = -( x50*y50 )
acc6 = -( x60*y60 )
acc7 = -( x70*y70 )

Parameters

Input/Output Type Comments
return v8acc80 Returned accumulation vector (8 x int80 lanes)
xbuff v16int32 Input buffer of 16 elements of type int32
xstart int Starting position offset applied to all lanes of the input from the X buffer
xoffsets unsigned int 4-bit offset for each lane in the X buffer. The LSBs apply to the first lane
ybuff v16int32 Right input buffer of 16 elements of type int32
ystart int Starting position offset applied to all lanes of the second input from the Y buffer
yoffsets unsigned int 4-bit offset for each lane in the Y buffer. The LSBs apply to the first lane
Note
  • For more information on how data is selected from the buffers, see the data selection description in this guide.