xxyzz/ostep-hw

small bug that exists in main-common.c of hw 30

Closed this issue · 7 comments

In hw 30, if I use command ./main-one-cv-while -c 2 -p 1 -l 1 -P 0,0,0,0,0,0,2 -v which let two consumers sleep, then the whole program will crash. After some debug, I found a problem with the following code in main-common.c:

for (i = 0; i < consumers; i++) {
    Mutex_lock(&m);
    while (num_full == max) 
        Cond_wait(empty_cv, &m);
    do_fill(END_OF_STREAM);
    do_eos();
    Cond_signal(fill_cv);
    Mutex_unlock(&m);
}

when the producer is finished, END_OF_STREAM will be written into buffer and one of consumers will wake up. However, I found that it is not as I expected. It seems like that one of consumers will be set to ready state and the for loop will release the lock but immediately acquire the lock again. because num_full equals max at this time so the for loop will sleep. then one of consumers will execute and finally wake up one thread between the other consumer and the for loop. however, it will choose the other consumer rather than the for loop. so END_OF_STREAM will not be written into buffer again and cause the whole program to crash. If I change the code to the following, then there is no problem.

for (i = 0; i < consumers; i++) {
    sleep(1);
    Mutex_lock(&m);
    while (num_full == max) 
        Cond_wait(empty_cv, &m);
    do_fill(END_OF_STREAM);
    do_eos();
    Cond_signal(fill_cv);
    Mutex_unlock(&m);
}

I wonder whether this problem is OS-dependent or just a bug? Thanks a lot!

xxyzz commented

main-one-cv-while.c is a broken solution. ./main-one-cv-while -c 2 -P 0,0,0,0,0,0,1 will cause all threads to sleep like the Figure 30.11. The program will stuck at Cond_wait but won't crash and it's expected.

so this problem is relate to the implementation of consumer? since the consumer will run infinitely until main-common.c writes END_OF_STREAM

xxyzz commented

The correct solution uses two condition variables, you should reread this chapter...

uh...I know this is an incorrect solution. I just want to know why this command will cause the program to stuck in such a place. and I think it's because the thread executing for loop in main-common.c will sleep just because the for loop will soon grab the lock after releasing it and END_OF_STREAM will have no chance to be written again. The main-common.c's logic seems like to be that all consumers should exit after all producers have exited but it fails in such case.

I know that these two consumers will sleep but the file main-common.c will write END_OF_STREAM into the buffer and then wake up them.

xxyzz commented

The book already explains using only one condition variable will cause one consumer wakes up another consumer accidentally instead of the producer(or the main thread). The for loop in main-common.c uses the same code in main-one-cv-while.c, the dead lock has nothing to do with what it writes.

so the main thread is kind of another producer. that makes sense, thanks a lot !