Which optimization strategies would you suggest for the piece of code below? The code is contained in the OPT/ subfolder, in F90 and C versions. You can build it with:
$ make OPT_F.exe # Fortran build
$ make OPT_C.exe # C build
double mat[N][N], s[N][N], val;
int i, j, v[N];
//
// ... v[] and s[][] may be assumed to contain valid data
//
for(i=0; i<N ; ++i) {
  for(j=0; j<N; ++j) {
    val = (double)(v[i] % 256);
    mat[j][i] = s[j][i]*(sin(val)*sin(val)-cos(val)*cos(val));
  }
}
No assumptions about the size of N may be made. You may, however, assume that the code is part of a function that gets called very frequently. s[][] and v[] may change between calls, and all entries of v[] are positive.
Can you develop a performance model for your optimized code? What is the code balance for large N? What is the latency for evaluating the cos() function?
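As a starting point for the code balance question, here is a rough estimate under stated assumptions. It assumes an optimized inner loop of the form mat[j][i] = s[j][i]*f(v[i]) with stride-1 access, where f is a table lookup; double-precision data; a write-allocate cache; and v[] plus the lookup table staying in cache between outer iterations. The symbols B_c (code balance), b_S (attainable memory bandwidth), and P_max (bandwidth-limited performance) are notation introduced here for illustration.

```latex
% Assumptions: one multiply per inner iteration, write-allocate on the store
% to mat, and v[] and the 2 KiB lookup table served from cache.
\begin{align*}
B_c &= \frac{\underbrace{8\,\mathrm{B}}_{\text{load } s}
           + \underbrace{8\,\mathrm{B}}_{\text{write-allocate } mat}
           + \underbrace{8\,\mathrm{B}}_{\text{store } mat}}{1\,\mathrm{flop}}
     = 24\,\mathrm{B/flop} \\
P_{\max} &= \frac{b_S}{B_c}
\end{align*}
```

If the stores bypass the cache (for example with nontemporal stores), the write-allocate term drops out and the estimate becomes 16 B/flop. If N is so large that v[] no longer stays in cache, each iteration adds another 4 B for v[i].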