va values vary a lot on Rafd dataset

Question

liyu10000 opened this issue 5 years ago · 3 comments

Hi,

Thank you for providing the pretrained weights and evaluation code. I tested the vggface model on the RaFD dataset and found that the valence-arousal (VA) values of some categories, such as sad or fearful, vary a lot; they even cross the axes. I used only the frontal faces and cropped them before evaluation. Because RaFD was collected in a lab environment, I expected each category to span a constrained region of the VA plane, but it did not. Do you know of any reasons for this? One cause I can guess is that your models were trained on in-the-wild data, which may not fit lab-controlled data.

Best,
Li

Here is the scatter plot of the VA points for each emotion in the RaFD dataset:

Answer 1 · 2020-03-19T17:36:11.000Z

Hello Li,

There are a few things to comment on:

I guess the cropping you applied to the faces was not the same cropping that VGG FACE was trained on, right?
As a matter of fact, the disgust, sad, fear (and angry) classes are always difficult to distinguish from one another, whether we are talking about humans or machines (deep learning systems). In almost all of the 7-basic-expression databases that I am aware of, these 3 classes are the most 'confusing' ones.
There is a lot of research in psychology on the exact position of these 3 classes in the 2D valence-arousal space, or more generally the affect circumplex. These studies are somewhat contradictory about where they place the 3 classes (most place them very close together in the 2nd quadrant, which is also the case here). To my understanding, what I see here makes almost perfect sense. The neutral class is in the area around (0,0); ideally it would be at that exact spot, but because of muscle movements etc. it surrounds this area, and in your plot it lies in [-2,2], which looks legitimate to me. The surprise class is in the first and second quadrants (positive or negative surprise, so this makes sense). The happy class is in the first quadrant. The other 3 classes, as I already explained, are placed logically given what valence and arousal mean, and the fact that sadness extends into the 3rd quadrant also makes sense. The only thing that is not correct is that some instances of the fear class are in the first quadrant; this is caused either by the difference in cropping or by the network confusing fear expressions with, perhaps, happy ones.

In general, I suggest not holding fixed rules in your head ('should span a constrained region') but instead interpreting the result and checking whether it makes sense (valence = positive/negative emotion, arousal = passive/active). I also disagree with the proposed explanation of lab-controlled vs. in-the-wild data: models trained on in-the-wild data are more realistic and mostly predict lab-controlled expressions correctly, unless perhaps those expressions are very extreme and unrealistic.
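To make this kind of interpretation less dependent on eyeballing a scatter plot, one option is to count, per emotion class, how many predictions land in each quadrant and how widely they spread. The following is only a minimal sketch: the function names `quadrant` and `summarize` and the `(label, valence, arousal)` tuple layout are assumptions for illustration, not part of the released evaluation code.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def quadrant(valence, arousal):
    """Return the quadrant (1-4) of a VA point, or 0 if it lies on an axis.

    Q1: positive valence, active    Q2: negative valence, active
    Q3: negative valence, passive   Q4: positive valence, passive
    """
    if valence == 0 or arousal == 0:
        return 0
    if arousal > 0:
        return 1 if valence > 0 else 2
    return 3 if valence < 0 else 4


def summarize(predictions):
    """Group (label, valence, arousal) tuples by label and summarize each group.

    The tuple layout is a hypothetical format chosen for this sketch.
    """
    by_label = defaultdict(list)
    for label, v, a in predictions:
        by_label[label].append((v, a))

    summary = {}
    for label, points in by_label.items():
        valences = [p[0] for p in points]
        arousals = [p[1] for p in points]
        summary[label] = {
            "mean": (mean(valences), mean(arousals)),
            # Population standard deviation per axis as a simple spread measure.
            "std": (pstdev(valences), pstdev(arousals)),
            "quadrants": Counter(quadrant(v, a) for v, a in points),
        }
    return summary
```

With a summary like this, a class such as fear showing a noticeable count in quadrant 1 would flag the samples worth inspecting by hand, for example to check whether their crops look different from the rest.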

Hope my answer helps!

Answer 2 · 2020-03-21T01:37:35.000Z

Hi Kollias,

Thank you for your kind reply and explanation. I am much clearer now on how to interpret the VA scores. To answer your first question, I used OpenCV's Haar cascade face detector to crop the faces, which I think differs from the cropping used for VGG FACE. That may be why something unexpected happened.
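One way to test whether the crop is the culprit is to re-run the evaluation with looser crops around each detected box and see whether the fear outliers move. The sketch below only shows the box arithmetic. The helper name `expand_box` and the default margin of 0.3 are assumptions for illustration; the actual cropping used to train the model is not stated in this thread.

```python
def expand_box(box, image_size, margin=0.3):
    """Grow an (x, y, w, h) box by `margin` of its size on every side.

    `box` could come from any face detector that returns (x, y, w, h).
    `image_size` is (width, height); the result is clipped to the image.
    The margin value is a hypothetical setting to experiment with.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    dx = int(round(w * margin))
    dy = int(round(h * margin))
    x0 = max(0, x - dx)
    y0 = max(0, y - dy)
    x1 = min(img_w, x + w + dx)
    y1 = min(img_h, y + h + dy)
    return x0, y0, x1 - x0, y1 - y0
```

The returned box can then be used to slice the image before resizing it to the network's input size.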

Your reasoning on the placement of emotion categories on va space is very helpful. I really appreciate your kind reply.

Li

Answer 3 · 2020-03-31T01:12:43.000Z

Hello, could you share the RaFD dataset with me by email (1647560307@qq.com)? I am anxious to use it for my undergraduate graduation project. Thanks very much.