-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathsummary.txt
234 lines (161 loc) · 4.52 KB
/
summary.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
=====================
matmul-variations
=====================
N=100
Similar to matmul (A = A + B * C), except B and C becomes
(B + D) or (B * D) or their commuted version where D is
either another array or a constant.
Example of possibly generated code
A = A + (B + D) * (1.5 + C)
(Uppercase = matrix)
=====================
two-statements
=====================
N=100
Doubly-nested loop with two statements inside them.
Each statement looks like
A = _ * _ * _
Where A is always an array access, but _ can be either
an array access, a scalar, or a constant.
Also * can be either + or *
Example of possibly generated code
A = B * c + 1.5
C = c + c + D
=====================
two-statements-common-access
=====================
N=100
Similar to two-statements.
Except that the two statments are guaranteed to include at least one common array.
Example of possibly generated code
A = B * c + 1.5
C = c + c + B
Note that B appears in both statements.
Example of code that cannot be generated
A = B * c + 1.5
C = c + c + D
Note that the first statement has arrays A and B, while the second has C and D.
No array in common.
=====================
two-statements-mixed
=====================
N=250 two-statements
N=50 two-statements-common-access
=====================
vector
=====================
N=300
Vector (1D) operations
A = B @ C @ D
@ can be either + or *.
Left hand size is always memory access.
If left hand side is scalar, it is in the form S[0].
(Otherwise scalar variables are passed by value and can't be output.)
Right hand side includes
* array read, A[i], B[i], C[i], S[0]
* scalar read a, b, c
* literal 2.0
Note that this space includes
* dot product
S[0] = S[0] + A[i] * B[i]
* axpy
A[i] = a * B[i] + C[i]
* scale
A[i] = (a + b) * B[i]
=====================
2-2-ops
=====================
N=10
(Random seed 0)
Doubly nested loops with 1 statement.
Right hand side of each statement has 3 ops.
2 operands are guaranteed to be (2D) arrays.
Index is always [i][j].
The rest can be either scalars or literals.
Example:
A = B * C + D
B = a + B + A
=====================
3-ops
=====================
N=10
(Random seed 0)
Doubly nested loops with 2 statement.
Right hand side of each statement has 2 ops.
2 operands are guaranteed to be (2D) arrays.
Index is always [i][j].
The rest can be either scalars or literals.
Example:
A = B * C + 1.5 + b
=====================
nrm261
=====================
N=15
seed=0
In summary, the original nrm261 loop:
for [(i, j, k, m)] {
add = rhs[i][j][k][m];
rms[m] = rms[m] + add * add;
}
is modified a bit.
Loops i, j, k can be modified to have 1 iterations
Loops m has between 2 and 16 iterations.
Example naming:
The name "nrm261_j_k_m_13" means loops j and k have more than 1
iterations, and loop m has 13 iterations.
=====================
s242
=====================
N=15
seed=0
Original program:
for [(i, >=1, <=31999)] {
a[i] = a[i-1] + s1 + s2 + b[i] + c[i] + d[i];
}
New program:
for [(i, >=1, <=31999)] {
a[i] = a[i-1] + s1 + s2 + b[i] + c[i] + d[i] + e[i] + f[i] + g[i];
}
for [(i, >=1, <=31999)] {
a[i] = a[i-1] + s1 + s2 + b[i] + c[i] + d[i] + e[i] + f[i] + g[i];
}
Where one loop may be removed, and s1, s2, b[i], c[i], d[i], e[i], f[i], g[i] may be replaced by 0.
Example naming:
s242_aas1bce_aabc means
* two loops (loops are separated by underscores)
* the rhs of the first loop contains s1, b, c, e. (The first two a's are part of the recurrence.)
* the rhs of the second loop contains b, c
s242_aas1g_noop means
* one loop (the other is a no-op)
* the loop contains s1 and g
=====================
cnv243
=====================
Original program:
for [(i, >=0, <=nmor-1)] {
tmort[i] = tmort[i] / mormult[i];
}
New program:
for [(i, >=0, <=nmor-1)] {
tmort[i] = (tmort[i] * _ * _ * _) / (mormult[i] * _ * _ * _);
tmort[i] = (tmort[i] * _ * _ * _) / (mormult[i] * _ * _ * _);
}
Where one of the assignments can be replaced by a no-op.
And each _ can be replaced by 1, A[i], B[i], or C[i].
Example naming
cnv243_noop_tmort_mormultCBC means
* one loop (the first is a no-op)
* the assignment is
tmort[i] = tmort[i] / (mormult[i] * C[i] * B[i] * C[i])
cnv243_tmortB_mormultA_tmortBA_mormultB means
* two loops
* the assignment in the first loop
tmort[i] = tmort[i] * B[i]) / (mormult[i] * A[i])
* the assignment in the second loop
tmort[i] = tmort[i] * B[i] * A[i]) / (mormult[i] * B[i])
Note that 1s are omitted from the name, which means
1 * 1 * B[i]
and
B[i] * 1 * 1
will have the same name and in the code generator's point of view,
they are the same program.