Survey Question on the For loop in source file cpu/algorithms/memory_simd_avx.c line 150
Hello Sir/Madam,
We are from a research group at Iowa State University, USA. We are surveying GitHub developers about the methods they use to parallelize their code. As part of the survey, we would like to ask three questions about this for loop:
- Can you briefly explain the purpose of using the pragma in this case? If the pragma contains reduction or private clauses, can you briefly describe the purpose of the variables in those clauses?
- How confident are you in the correctness of this implementation? Please choose a score from 1 to 5, with 1 as the lowest confidence and 5 as the highest.
- (Optional) Have you actually run the code (compiled it, passed input, and checked the output) and observed the speedup from parallelization? Yes/No
  - If yes, can you tell us the input and expected output of the program (the input that causes the program to run through this for loop)?
The for loop is at line 150 of the file https://github.com/XiangRongLin/grayscale-conversion/blob/master/cpu/algorithms/memory_simd_avx.c
Here is a part of the code:
#pragma omp parallel for
for (int thread = 0; thread < threads; thread++)
{
    int end;
    if ((thread + 1) == threads)
    {
        end = (((int) size) / pixel_per_iteration) * pixel_per_iteration;
    }
    else
    {
        end = pixel_per_thread_aligned * (thread + 1);
    }
    int rF_rE_rD_rC_rB_rA_r9_r8_r7_r6_r5_r4_r3_r2_r1_r0;
    int gF_gE_gD_gC_gB_gA_g9_g8_g7_g6_g5_g4_g3_g2_g1_g0;
    int bF_bE_bD_bC_bB_bA_b9_b8_b7_b6_b5_b4_b3_b2_b1_b0;
    for (int i = pixel_per_thread_aligned * thread; i < end; i += pixel_per_iteration)
    {
        cons
Sincere thanks
- The purpose is to parallelize the for loop: each iteration of the outer loop handles one thread's chunk of pixels, and the chunks do not overlap, so no iteration touches another's data.
- 5. Correctness was verified manually by inspecting the resulting grayscale image. A byte-for-byte comparison is not possible because approximations were introduced to achieve a faster execution time.
- Yes, here are the benchmarks: https://github.com/XiangRongLin/grayscale-conversion#benchmarks
- See the updated README: https://github.com/XiangRongLin/grayscale-conversion/tree/master/cpu#how-to-run For your use case, use `1 12 1 5`; this will produce a file named `grayscale.jpg`. The second number (number of threads) is up to you.
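
To illustrate why the outer loop is safe to parallelize, here is a minimal sketch of the same chunking scheme with a scalar stand-in for the AVX body. It is not the repository's code: the function names `gray_pixel` and `grayscale_chunked`, the way `pixel_per_thread_aligned` is computed, the scalar tail loop, and the fixed-point weights (77, 150, 29) are all assumptions made for illustration. Only the loop bounds follow the snippet quoted in the question.

```c
#include <stdint.h>

/* Assumed fixed-point luma approximation (not taken from the repository):
   (77*R + 150*G + 29*B) / 256; the weights sum to 256. */
static inline uint8_t gray_pixel(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
}

/* Converts `size` interleaved RGB pixels to grayscale.
   Each value of `thread` owns a disjoint range [start, end) of pixels. */
void grayscale_chunked(const uint8_t *rgb, uint8_t *gray, int size,
                       int threads, int pixel_per_iteration)
{
    /* Chunk length per thread, rounded down to a multiple of the step
       (assumed; the original computation is not shown in the issue). */
    int pixel_per_thread_aligned =
        ((size / threads) / pixel_per_iteration) * pixel_per_iteration;
    /* End of the region that can be processed in whole blocks. */
    int aligned_end = (size / pixel_per_iteration) * pixel_per_iteration;

    #pragma omp parallel for
    for (int thread = 0; thread < threads; thread++)
    {
        int start = pixel_per_thread_aligned * thread;
        /* The last thread also takes the blocks left over by rounding. */
        int end = (thread + 1 == threads)
                      ? aligned_end
                      : pixel_per_thread_aligned * (thread + 1);

        for (int i = start; i < end; i += pixel_per_iteration)
        {
            /* Scalar stand-in for the SIMD body: one block per step. */
            for (int p = i; p < i + pixel_per_iteration; p++)
                gray[p] = gray_pixel(rgb[3 * p], rgb[3 * p + 1], rgb[3 * p + 2]);
        }
    }

    /* Scalar tail for pixels that do not fill a whole block. */
    for (int p = aligned_end; p < size; p++)
        gray[p] = gray_pixel(rgb[3 * p], rgb[3 * p + 1], rgb[3 * p + 2]);
}
```

Because every write to `gray[p]` falls inside exactly one thread's range, the loop needs no `reduction` or `private` clauses; variables declared inside the loop body are already private to each iteration. Compiled without OpenMP support, the pragma is ignored and the function runs serially with the same result.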