These intrinsics are designed to facilitate iteration over data structures in a matrix-like way, with the possibility of iterating over a data structure several times.
- 1D is when you iterate over a contiguous region of memory (a 1D array, for example), forming a sequence (for example, you iterate over X elements of an array).
- 2D is when you repeat this sequence a certain number of times in a regular way (in the same array, each time you have iterated over X elements you jump 4 elements further and iterate over X elements again, repeating this sequence Y times).
- 3D is when you repeat a 2D sequence in a regular way (every time you finish the 2D sequence, you jump 42 elements further and repeat the sequence, Z times in total).
These intrinsics allow you to iterate over memory following customized 2D or 3D sequences with a single call, without spending time writing loops and making your program more complex.
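To make the three patterns concrete, the sketch below spells out the plain loop nest that a 3D sequence corresponds to and records the element offsets it visits. The helper name visit_order_3d and its parameters are illustrative only and are not part of the intrinsics; it assumes that each jump is measured from the last element of the sequence that just finished.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the intrinsics): lists the element
// offsets visited by z 2D sequences, each made of y 1D sequences of
// x contiguous elements.
// Assumption: jump2 is added after the last element of each 1D sequence,
// jump3 after the last element of each 2D sequence.
std::vector<int> visit_order_3d(int x, int y, int z, int jump2, int jump3) {
    assert(x > 0 && y > 0 && z > 0);
    std::vector<int> visited;
    int pos = 0;
    for (int k = 0; k < z; ++k) {          // 3D: repeat the 2D sequence
        for (int j = 0; j < y; ++j) {      // 2D: repeat the 1D sequence
            for (int i = 0; i < x; ++i) {  // 1D: contiguous elements
                visited.push_back(pos);
                if (i + 1 < x) pos += 1;
            }
            if (j + 1 < y) pos += jump2;
        }
        pos += jump3;
    }
    return visited;
}
```

With x = 2, y = 2, z = 2 and the jumps of 4 and 42 elements used in the description above, the helper visits two elements, skips ahead by 4, visits two more, then skips ahead by 42 and starts again.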
add_2d / add_3d example
The following pseudo-code shows the inner working of the intrinsics:
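// off, inc1 and inc2 are in bytes for the _byte variants and in elements for the _ptr variants.
void* add_2d(void* a, int off, int size1, int &count1, int inc1) {
    if (count1 >= size1) {
        // End of the 1D sequence: reset the counter and apply the offset.
        count1 = 0;
        return a + off;
    }
    else {
        // Inside the 1D sequence: move to the next element.
        count1 = count1 + 1;
        return a + inc1;
    }
}
void* add_3d(void* a, int off, int size1, int &count1, int inc1, int size2, int &count2, int inc2) {
    if (count1 >= size1) {
        if (count2 >= size2) {
            // End of the 2D sequence: reset both counters and apply the offset.
            count1 = 0;
            count2 = 0;
            return a + off;
        }
        else {
            // End of a 1D sequence: start the next 1D sequence.
            count1 = 0;
            count2 = count2 + 1;
            return a + inc2;
        }
    }
    else {
        // Inside a 1D sequence: move to the next element; count2 is unchanged.
        count1 = count1 + 1;
        return a + inc1;
    }
}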
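The pseudo-code above can be exercised on a host machine with a small software stand-in. The sketch below is only a model that works on an element index instead of a pointer; the names add_3d_model and trace_3d are hypothetical and this is not the hardware intrinsic.

```cpp
#include <cassert>
#include <vector>

// Software stand-in for the add_3d pseudo-code, operating on an element
// index rather than a pointer. Hypothetical name, not the intrinsic.
int add_3d_model(int pos, int off, int size1, int &count1, int inc1,
                 int size2, int &count2, int inc2) {
    if (count1 >= size1) {
        count1 = 0;
        if (count2 >= size2) {   // 2D sequence finished
            count2 = 0;
            return pos + off;
        }
        count2 += 1;             // 1D sequence finished
        return pos + inc2;
    }
    count1 += 1;                 // still inside the 1D sequence
    return pos + inc1;
}

// Records the index held before each of `calls` successive calls,
// starting from index 0 with both counters at 0.
std::vector<int> trace_3d(int calls, int off, int size1, int inc1,
                          int size2, int inc2) {
    assert(calls >= 0);
    std::vector<int> visited;
    int pos = 0, count1 = 0, count2 = 0;
    for (int i = 0; i < calls; ++i) {
        visited.push_back(pos);
        pos = add_3d_model(pos, off, size1, count1, inc1, size2, count2, inc2);
    }
    return visited;
}
```

For instance, trace_3d(7, 10, 2, 1, 1, 2) visits three consecutive elements, jumps by 2 to a second run of three, then jumps by 10. Note that size1 = 2 yields three elements per 1D sequence, because the counter starts at 0 and the comparison is count1 >= size1.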
add_2d example
add_2d is designed to iterate once over your data structure. You can do the same with add_3d, but add_2d uses fewer registers. Here is an example of iterating over a 2x2 matrix A:
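int A[2*2];
int* iterateA = A;
int off = 1;    // offset applied once the 1D sequence is finished
int inc = 1;    // increment inside the 1D sequence
int size = 1;   // number of jumps left at the start of a 1D sequence
int count = 0;
for (int i = 0; i < 2*2; i++) {
    // Arguments follow the signature add_2d(a, off, size1, count1, inc1).
    iterateA = add_2d(iterateA, off, size, count, inc);
}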
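In the example above every step is 1, so the pattern is the same as a plain linear walk. A case where off differs from inc1 shows the 2D behaviour more clearly. The sketch below uses a software stand-in for add_2d (add_2d_model, a hypothetical name, in element units) to visit only the two leftmost columns of a row-major matrix with 4 columns; it is an illustration under these assumptions, not the intrinsic itself.

```cpp
#include <cassert>

// Software stand-in for the add_2d pseudo-code (element units).
// Hypothetical name, not the hardware intrinsic.
int* add_2d_model(int* a, int off, int size1, int& count1, int inc1) {
    if (count1 >= size1) {  // 1D sequence finished: apply the offset
        count1 = 0;
        return a + off;
    }
    count1 += 1;            // still inside the 1D sequence
    return a + inc1;
}

// Sums the two leftmost columns of a row-major matrix with 4 columns.
int sum_left_two_columns(int* matrix, int rows) {
    assert(matrix != nullptr && rows > 0);
    const int inc = 1;   // next element in the row
    const int size = 1;  // one jump left after the first element
    const int off = 3;   // from column 1 to column 0 of the next row
    int count = 0;
    int sum = 0;
    int* it = matrix;
    for (int i = 0; i < rows * 2; ++i) {
        sum += *it;
        it = add_2d_model(it, off, size, count, inc);
    }
    return sum;
}
```

For a 3x4 matrix the iterator ends one element past the end of the array after the last call, which is never dereferenced.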
add_3d example
add_3d can be very useful for large and complex operations using matrix multiplication, where you have to iterate several times over your datastructure. This example shows a simple 2x4 * 4x2 matrix multiplication:
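B matrix (4 rows by 2 columns, stored row by row): B0 B1 / B2 B3 / B4 B5 / B6 B7
A matrix (2 rows by 4 columns, stored row by row): A0 A1 A2 A3 / A4 A5 A6 A7
C matrix (result, 2 rows by 2 columns): C0 C1 / C2 C3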
int B[4*2];
int A[2*4];
int C[2*2] = {0};   // accumulated with += below, so it must start at zero
int* iterateA = A;
int* iterateB = B;
int* iterateC = C;
The B matrix is stored in memory as a contiguous array, row by row. Because we want to iterate over B column by column rather than row by row, the first increment (incB1) is 2, which takes us from B0 to B2, B4, and so on.
Similarly, the second increment (incB2) is negative: once we have finished the first column (B0->B2->B4->B6), we want to start the second column at B1, and going from B6 to B1 is an increment of -5.
sizeB1 is 3 because, from the first element of a column, there are still 3 elements to jump to.
sizeB2 is 1 because, from the first column, there is 1 other column to jump to.
offB is -7 because, once both columns are done, we want to go back to B0, and the offset from B7 to B0 is -7.
int incB1 = 2;
int incB2 = -5;
int sizeB1 = 3;
int countB1 = 0;
int sizeB2 = 1;
int countB2 = 0;
int offB = -7;
Here we want to iterate over A in a very specific way:
we want to iterate over a row 2 times (so that, at the same time, we iterate over the 2 columns of the B matrix) and then go to the next row.
To do that:
incA1 is 1 because we want to go from one element in a row to the next;
incA2 is -3 because, once we have finished iterating over a row, we want to go back to the beginning of that row;
sizeA1 is 3 because, from the first element of a row, there are still 3 elements to jump to;
sizeA2 is 1 because, once we have iterated over a row, we still want to iterate over it 1 more time;
offA is 1 because, once we have iterated 2 times over the same row, we want to jump to the next row, which starts 1 slot further in the array.
int incA1 = 1;
int incA2 = -3;
int sizeA1 = 3;
int countA1 = 0;
int sizeA2 = 1;
int countA2 = 0;
int offA = 1;
int incC = 1;
int sizeC = 1;
int countC = 0;
int offC = 1;
for (int i = 0; i < 4; i++) {
    for (int j = 0; j < 4; j++) {
        *iterateC += *iterateA * *iterateB;
        // Arguments follow the signature add_3d(a, off, size1, count1, inc1, size2, count2, inc2).
        iterateA = add_3d(iterateA, offA, sizeA1, countA1, incA1, sizeA2, countA2, incA2);
        iterateB = add_3d(iterateB, offB, sizeB1, countB1, incB1, sizeB2, countB2, incB2);
    }
    iterateC = add_2d(iterateC, offC, sizeC, countC, incC);
}
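The multiplication above can be checked on a host machine by replacing the intrinsics with software stand-ins that follow the pseudo-code. In the sketch below, add_2d_model, add_3d_model and matmul_2x4_4x2 are hypothetical names, the increments are in elements, and the parameter values are the ones derived in the text; it is a model for checking the access pattern, not the hardware behaviour.

```cpp
#include <array>
#include <cassert>

// Software stand-ins for the pseudo-code (element units, hypothetical names).
template <class T>
T* add_2d_model(T* a, int off, int size1, int& count1, int inc1) {
    if (count1 >= size1) { count1 = 0; return a + off; }
    count1 += 1;
    return a + inc1;
}

template <class T>
T* add_3d_model(T* a, int off, int size1, int& count1, int inc1,
                int size2, int& count2, int inc2) {
    if (count1 >= size1) {
        count1 = 0;
        if (count2 >= size2) { count2 = 0; return a + off; }
        count2 += 1;
        return a + inc2;
    }
    count1 += 1;
    return a + inc1;
}

// C (2x2) = A (2x4) * B (4x2), all stored row by row.
std::array<int, 4> matmul_2x4_4x2(const std::array<int, 8>& A,
                                  const std::array<int, 8>& B) {
    std::array<int, 4> C{};  // zero-initialized accumulator
    const int* itA = A.data();
    const int* itB = B.data();
    int* itC = C.data();
    int countA1 = 0, countA2 = 0, countB1 = 0, countB2 = 0, countC = 0;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            *itC += *itA * *itB;
            // A: offA = 1, sizeA1 = 3, incA1 = 1, sizeA2 = 1, incA2 = -3
            itA = add_3d_model(itA, 1, 3, countA1, 1, 1, countA2, -3);
            // B: offB = -7, sizeB1 = 3, incB1 = 2, sizeB2 = 1, incB2 = -5
            itB = add_3d_model(itB, -7, 3, countB1, 2, 1, countB2, -5);
        }
        itC = add_2d_model(itC, 1, 1, countC, 1);
    }
    assert(itC == C.data() + 4);  // C was walked exactly once
    return C;
}
```

After the last inner iteration itA points one element past the end of A and itB is back on B0; neither is dereferenced again.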
dims_2d_t dims_2d_from_steps (unsigned int size1, int step1, int step2)
    Generate a dims_2d_t struct from the step values, instead of the increments.
dims_2d_t dims_2d_from_steps (unsigned int size1, int step1, int step2, addr_t count1)
    Generate a dims_2d_t struct from the step values, instead of the increments.
dims_3d_t dims_3d_from_steps (unsigned int size1, int step1, unsigned int size2, int step2, int step3)
    Generate a dims_3d_t struct from the step values, instead of the increments.
dims_3d_t dims_3d_from_steps (unsigned int size1, int step1, unsigned int size2, int step2, int step3, addr_t count1, addr_t count2)
    Generate a dims_3d_t struct from the step values, instead of the increments.
void * add_2d_ptr (void *a, int off, int size1, int &count1, int inc1)
void * add_2d_byte (void *a, int off, int size1, int &count1, int inc1)
void * add_3d_ptr (void *a, int off, int size1, int &count1, int inc1, int size2, int &count2, int inc2)
void * add_3d_byte (void *a, int off, int size1, int &count1, int inc1, int size2, int &count2, int inc2)
template<class T> T * byte_incr (T *p, int i)
    Increments input pointer by a number of bytes.
◆ dims_2d_t
◆ dims_2d_t() [1/2]
dims_2d_t::dims_2d_t (unsigned int size1, int inc1, int inc2)
◆ dims_2d_t() [2/2]
dims_2d_t::dims_2d_t (unsigned int size1, int inc1, int inc2, addr_t count1)
◆ count1
Represents the same quantity as num1, but acts as a counter for the intrinsic. This is useful when iterating.
◆ inc1
The offset between the element you point to and the next element you want to point to. It can be expressed in bytes or as an index offset.
◆ inc2
The offset applied once you have completed the 1D sequence. It can be expressed in bytes or as an index offset.
◆ num1
unsigned int dims_2d_t::num1
The number of elements you still need to jump to when you are at the beginning of a 1D sequence.
◆ dims_3d_t
Public Member Functions
dims_3d_t (unsigned int size1, int inc1, unsigned int size2, int inc2, int inc3)
dims_3d_t (unsigned int size1, int inc1, unsigned int size2, int inc2, int inc3, addr_t count1, addr_t count2)
Data Fields
addr_t count1
addr_t count2
int inc1
int inc2
int inc3
unsigned int num1
unsigned int num2
◆ dims_3d_t() [1/2]
dims_3d_t::dims_3d_t (unsigned int size1, int inc1, unsigned int size2, int inc2, int inc3)
◆ dims_3d_t() [2/2]
dims_3d_t::dims_3d_t (unsigned int size1, int inc1, unsigned int size2, int inc2, int inc3, addr_t count1, addr_t count2)
◆ count1
Counter for the 1D sequence. Once you have iterated count1_in times over your structure, the next call points to the element inc2 further. It is decremented on each call; once it reaches 0, the 1D sequence is complete. It lets the intrinsic keep track of where you are in your structure.
◆ count2
Counter for the 2D sequence. Once you have made count2_in 1D sequences on your array, the next call points to the element inc3 further. It is decremented each time you complete a 1D sequence. Once count1_out and count2_out reach 0, the next call returns a pointer to the element inc3 further in your structure. It lets the intrinsic keep track of where you are in your structure.
◆ inc1
The offset between the element you point to and the next element you want to point to, forming the 1D sequence. It can be expressed in bytes or as an index increment.
◆ inc2
The offset between the last element of the 1D sequence and the first element of the next sequence, forming the 2D sequence. It can be expressed in bytes or as an index increment.
◆ inc3
The offset between the last element of your 2D sequence and the first element of the next, forming the 3D sequence. It can be expressed in bytes or as an index increment.
◆ num1
unsigned int dims_3d_t::num1
The number of elements in your 1D sequence.
◆ num2
unsigned int dims_3d_t::num2
The number of 2D sequences you want to make in your structure.
◆ add_2d_byte()
void* add_2d_byte (void *a, int off, int size1, int &count1, int inc1)
Parameters
a | The starting point from which you want to iterate.
off | The offset, in bytes, applied once you have completed the 1D sequence.
size1 | The number of elements you still need to jump to when you are at the beginning of a 1D sequence. If you iterate over a row of a matrix, it is row_length - 1.
count1 | Same as size1, but used by the intrinsic as a counter.
inc1 | The offset, in bytes, between the element you point to and the next element you want to point to, forming the 1D sequence.
Returns the input pointer a, incremented according to the input parameters.
◆ add_2d_ptr()
void* add_2d_ptr (void *a, int off, int size1, int &count1, int inc1)
Parameters
a | The starting point from which you want to iterate.
off | The offset, in index, applied once you have completed the 1D sequence.
size1 | The number of elements you still need to jump to when you are at the beginning of a 1D sequence. If you iterate over a row of a matrix, it is row_length - 1.
count1 | Same as size1, but used by the intrinsic as a counter.
inc1 | The offset, in index, between the element you point to and the next element you want to point to, forming the 1D sequence.
Returns the input pointer a, incremented according to the input parameters.
◆ add_3d_byte()
void* add_3d_byte (void *a, int off, int size1, int &count1, int inc1, int size2, int &count2, int inc2)
Parameters
a | The starting point from which you want to iterate.
off | The offset, in bytes, between the last element of your 2D sequence and the first element of the next, forming the 3D sequence.
size1 | The number of elements in your 1D sequence.
count1 | Counter associated with size1. Once you have iterated count1_in times over your structure, the next call points to the element inc2 bytes further. It is decremented on each call; once it reaches 0, the 1D sequence is complete. It is passed by reference so that the intrinsic can keep track of where you are in your structure.
inc1 | The offset, in bytes, between the element you point to and the next element you want to point to, forming the 1D sequence.
size2 | The number of 2D sequences you want to make in your structure.
count2 | Counter associated with size2. Once you have made count2_in 1D sequences on your array, the next call points to the element off bytes further. It is decremented each time you complete a 1D sequence. Once count1_out and count2_out reach 0, the next call returns a pointer to the element off bytes further in your structure. It is passed by reference so that the intrinsic can keep track of where you are in your structure.
inc2 | The offset, in bytes, between the last element of the 1D sequence and the first element of the next sequence, forming the 2D sequence.
Returns the input pointer a, incremented according to the input parameters.
◆ add_3d_ptr()
void* add_3d_ptr (void *a, int off, int size1, int &count1, int inc1, int size2, int &count2, int inc2)
Parameters
a | The starting point from which you want to iterate.
off | The offset, in index, between the last element of your 2D sequence and the first element of the next, forming the 3D sequence.
size1 | The number of elements in your 1D sequence.
count1 | Counter associated with size1. Once you have iterated count1 times over your structure, the next call points to the element inc2 further. It is decremented on each call; once it reaches 0, the 1D sequence is complete. It is passed by reference so that the intrinsic can keep track of where you are in your structure.
inc1 | The offset, in index, between the element you point to and the next element you want to point to, forming the 1D sequence.
size2 | The number of 2D sequences you want to make in your structure.
count2 | Counter associated with size2. Once you have made count2 1D sequences on your array, the next call points to the element off further. It is decremented each time you complete a 1D sequence. Once count1_out and count2_out reach 0, the next call returns a pointer to the element off further in your structure. It is passed by reference so that the intrinsic can keep track of where you are in your structure.
inc2 | The offset, in index, between the last element of the 1D sequence and the first element of the next sequence, forming the 2D sequence.
Returns the input pointer a, incremented according to the input parameters.
◆ byte_incr()
template<class T >
T* byte_incr (T *p, int i)
Increments input pointer by a number of bytes.
Parameters
p | Input pointer.
i | The number of bytes to increment by.
Returns the input pointer p incremented by i bytes.
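The difference between a byte increment and ordinary pointer arithmetic can be illustrated on a host machine. The sketch below is one plausible way to write such a helper; the names byte_incr_sketch and element_at are hypothetical, and the library's actual definition of byte_incr is not shown on this page. As written it only accepts pointers to non-const types.

```cpp
#include <cassert>

// Sketch of a byte-wise pointer increment (hypothetical, not the
// library's definition): the pointer moves by i bytes, whereas p + i
// would move by i * sizeof(T) bytes.
template <class T>
T* byte_incr_sketch(T* p, int i) {
    assert(p != nullptr);
    return reinterpret_cast<T*>(reinterpret_cast<char*>(p) + i);
}

// Usage: reads the int located `index` elements after `base` by
// converting the index into a byte count first.
int element_at(int* base, int index) {
    return *byte_incr_sketch(base, index * static_cast<int>(sizeof(int)));
}
```

The caller is responsible for passing a byte count that keeps the result correctly aligned for T.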
◆ dims_2d_from_steps() [1/2]
dims_2d_t dims_2d_from_steps (unsigned int size1, int step1, int step2)
Generate a dims_2d_t struct from the step values, instead of the increments.
◆ dims_2d_from_steps() [2/2]
dims_2d_t dims_2d_from_steps (unsigned int size1, int step1, int step2, addr_t count1)
Generate a dims_2d_t struct from the step values, instead of the increments.
◆ dims_3d_from_steps() [1/2]
dims_3d_t dims_3d_from_steps (unsigned int size1, int step1, unsigned int size2, int step2, int step3)
Generate a dims_3d_t struct from the step values, instead of the increments.
◆ dims_3d_from_steps() [2/2]
dims_3d_t dims_3d_from_steps (unsigned int size1, int step1, unsigned int size2, int step2, int step3, addr_t count1, addr_t count2)
Generate a dims_3d_t struct from the step values, instead of the increments.
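This page does not define how steps relate to increments, so the following is only a sketch under stated assumptions. It assumes that a step is the distance between the first elements of two consecutive sequences at a given level, that an increment is the distance from the last element reached to the first element of the next sequence, and that size1 and size2 count jumps, as in the matrix multiplication example. The struct incs_3d_sketch and the function incs_from_steps_sketch are hypothetical and may not match what dims_3d_from_steps actually computes.

```cpp
#include <cassert>

// Hypothetical result type: the three increments of a 3D pattern.
struct incs_3d_sketch {
    int inc1;
    int inc2;
    int inc3;
};

// Assumed relationship between steps and increments (not confirmed by
// the library): after size1 jumps of step1 the pointer sits
// size1 * step1 past the start of the 1D sequence, so reaching the next
// start (step2 away) takes step2 - size1 * step1. The same reasoning is
// applied one level up for inc3.
incs_3d_sketch incs_from_steps_sketch(int size1, int step1,
                                      int size2, int step2, int step3) {
    assert(size1 >= 0 && size2 >= 0);
    incs_3d_sketch r;
    r.inc1 = step1;
    r.inc2 = step2 - size1 * step1;
    r.inc3 = step3 - size2 * step2 - size1 * step1;
    return r;
}
```

Under these assumptions, the B matrix walk (step 2 down a column, step 1 between columns, step 0 to restart) gives increments 2, -5 and -7, and the A matrix walk (step 1 along a row, step 0 to repeat it, step 4 to the next row) gives 1, -3 and 1, which match the values used in the add_3d example.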