g++ -g -Wall -o hw4 hw4.cpp -lpthread -lrt
There is no difference between non_pthread_output and pthread_output.
1. Separate the rows of bmp_img among the threads:
assign the remainder to the last thread (due to hardware limitations the # of threads won't be too large, so the remainder stays small)
2. pthread_barrier_wait(&barrier);
to synchronize the threads' operations
3. use strtol(argv[1], NULL, 10);
to get the # of threads at runtime; it converts the string to an integer (in base 10)
4. Create the threads and then join them sequentially with for loops
5. Use get_time() to calculate the runtime
reference: (stack overflow) https://stackoverflow.com/questions/15976790/how-can-i-calculate-the-running-time-of-a-pthread-matrix-multiplication-program
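The five steps above can be sketched roughly like this. It is only a minimal sketch, not the actual hw4.cpp: RowRange, row_range, Task, worker, parse_thread_count and run_passes are made-up names, the smoothing itself is replaced by a row counter, get_time() is assumed to wrap clock_gettime(CLOCK_MONOTONIC), and falling back to 1 thread on bad input is my own assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <ctime>
#include <pthread.h>
#include <vector>

// All names here are hypothetical; this is not the real hw4.cpp.
struct RowRange { int begin; int end; };  // half-open range [begin, end)

// Step 1: split the rows evenly; the last thread also takes the remainder.
RowRange row_range(int height, int num_threads, int tid) {
    assert(num_threads > 0 && tid >= 0 && tid < num_threads);
    int base = height / num_threads;
    int begin = tid * base;
    int end = (tid == num_threads - 1) ? height : begin + base;
    return {begin, end};
}

// Step 3: read the thread count from a string such as argv[1] (base 10).
// Assumption: fall back to 1 thread when the input is not a positive number.
int parse_thread_count(const char* s) {
    long n = std::strtol(s, nullptr, 10);
    return n > 0 ? static_cast<int>(n) : 1;
}

struct Task {
    int tid, num_threads, height, passes;
    pthread_barrier_t* barrier;
    std::vector<long>* rows_done;  // stand-in for the real smoothing work
};

void* worker(void* arg) {
    Task* t = static_cast<Task*>(arg);
    RowRange r = row_range(t->height, t->num_threads, t->tid);
    for (int p = 0; p < t->passes; ++p) {
        // Real code would smooth rows [r.begin, r.end) here.
        (*t->rows_done)[t->tid] += r.end - r.begin;
        // Step 2: no thread starts the next pass until all finished this one.
        pthread_barrier_wait(t->barrier);
    }
    return nullptr;
}

// Step 5: wall-clock seconds from a monotonic clock.
double get_time() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

// Step 4: create all threads in one for loop, join them in another.
// Returns the total number of rows processed over all passes.
long run_passes(int height, int num_threads, int passes, double* elapsed) {
    pthread_barrier_t barrier;
    pthread_barrier_init(&barrier, nullptr, num_threads);
    std::vector<pthread_t> threads(num_threads);
    std::vector<Task> tasks(num_threads);
    std::vector<long> rows_done(num_threads, 0);

    double start = get_time();
    for (int i = 0; i < num_threads; ++i) {
        tasks[i] = {i, num_threads, height, passes, &barrier, &rows_done};
        pthread_create(&threads[i], nullptr, worker, &tasks[i]);
    }
    for (int i = 0; i < num_threads; ++i) {
        pthread_join(threads[i], nullptr);
    }
    *elapsed = get_time() - start;
    pthread_barrier_destroy(&barrier);

    long total = 0;
    for (long n : rows_done) total += n;
    return total;
}
```

In a main() this would be called as something like run_passes(height, parse_thread_count(argv[1]), 1000, &elapsed). Each thread writes only its own slot of rows_done, so the counter needs no lock.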
note : I tested each case three times and took the shortest runtime as the example
case1 : smoothing 1000 times
- There is a hardware limitation; take (16 threads, 2000 loops) as an extreme example: the runtime is even slightly longer than with 8 threads.
- The runtime with 6 threads is longer than with 4 threads in both cases -> the likely reason is that bmpinfo.biheight can't be divided evenly by 6, so the last thread gets the extra rows and the other threads wait for it at the barrier
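To check how much the remainder matters for a given image height, the row count of the busiest (last) thread can be computed directly. This is a small sketch only; max_rows_per_thread is a made-up helper, not part of hw4.cpp.

```cpp
#include <cassert>

// Rows handled by the last thread when it takes the even share plus the remainder.
// Hypothetical helper, not part of hw4.cpp.
int max_rows_per_thread(int height, int num_threads) {
    assert(num_threads > 0);
    return height / num_threads + height % num_threads;
}
```

For example, with a made-up height of 1000 (not the real image's height), the last of 6 threads handles 170 rows while the other five handle 166 each.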
case2 : smoothing 2000 times
1. Had a hard time simply using the time.h ctime() function; solved by referencing this Stack Overflow example: https://stackoverflow.com/questions/15976790/how-can-i-calculate-the-running-time-of-a-pthread-matrix-multiplication-program
totally appreciate you guys, those memes posted on moodle are dope asf!!