Using my own dataset ?

Question

ahariri13 opened this issue 5 years ago · 3 comments

Hi! I am using my own dataset, named PHY, with DGCNN. I have formatted my data as described in the readme file.
However, my dataset is fairly large (300 000 samples), so I would prefer to skip 10-fold cross-validation and instead train on 230 000 samples and test on 70 000. For this, I ran ./run_DGCNN.sh Phy 1 70000. Training runs and the loss decreases, but I get a test accuracy of 0 and the following warning:
UndefinedMetricWarning: No positive samples in y_true, true positive value should be meaningless.
I googled the warning, and apparently it means all my labels are treated as 0 (while I should have 4 class labels). What do you think the problem is?

Also, could you point out where I can find how much dropout is applied after a dense layer, and where I can check the number of fully-connected layers used (not the number of nodes, just the number of hidden layers)?

Answer 1 · 2019-11-17T17:37:05.000Z

Hi, along with accuracy, DGCNN also tries to output AUC results (which only makes sense for binary classification problems). The warning might be related to AUC. Also, can you post a screenshot of your output results and error messages? Without them it is hard to diagnose the problem. Thanks.
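
To illustrate why AUC is only meaningful for binary labels, here is a minimal sketch in plain Python. It is not DGCNN's code: `binary_auc` and its `pos_label` parameter are hypothetical names, and the rank-based formula stands in for whatever metric routine the script actually calls. The point is only that AUC needs at least one positive and one negative sample, so a test set in which no graph carries the label treated as positive leaves it undefined.

```python
def binary_auc(y_true, scores, pos_label=1):
    """Rank-based AUC; returns None when the metric is undefined.

    AUC is the probability that a positive sample is scored higher than
    a negative one, so it needs at least one sample of each kind.
    """
    pos = [s for y, s in zip(y_true, scores) if y == pos_label]
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    if not pos or not neg:
        return None  # the situation the warning points at
    # Count positive/negative pairs ranked correctly (ties count as half).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Binary labels: AUC is well defined.
print(binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Made-up multi-class test split in which no sample has label 1:
# there is no positive sample, so AUC cannot be computed.
print(binary_auc([0, 2, 3, 0], [0.2, 0.9, 0.4, 0.1]))  # None
```

If the test accuracy is also 0, as reported here, the labels themselves are worth checking before blaming the metric.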

For the second question, please refer to line.

Answer 2 · 2019-11-29T10:41:39.000Z

Thank you for your reply. Apparently the problem was caused by an error in the graph labeling. However, I still have a small problem: each graph has 4 nodes with 2 features each, and 3 edges connecting them with 1 feature each. How do I account for edge features in the text file? I even ran the MATLAB code with the edge attributes and labels included, but only the 2 node features showed up in the text file.

Any help is appreciated :)
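
A labeling error like the one described above can usually be caught before training by counting the graph labels in the data file. The sketch below is a hedged example, not part of the repository: `read_graph_labels` is a hypothetical helper, and it assumes the text layout from the readme as I understand it (first line is the number of graphs; each graph starts with a line "n l" giving the node count and graph label, followed by n node lines). It also assumes that a non-zero test number holds out the last graphs in the file, so the tail is counted separately.

```python
from collections import Counter


def read_graph_labels(text):
    """Return the graph label of every graph in a DGCNN-style text dump.

    Assumed layout: first line = number of graphs; each graph starts with
    a line "n l" (n nodes, graph label l) followed by n node lines.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    num_graphs = int(lines[0])
    labels, i = [], 1
    for _ in range(num_graphs):
        n, label = lines[i].split()[:2]
        labels.append(int(label))
        i += 1 + int(n)  # skip the header line and the n node lines
    return labels


# Made-up 3-graph file: 2-node graphs, each node line holding a tag,
# a neighbor count, one neighbor index, and 2 node features.
sample = """3
2 0
0 1 1 0.5 0.1
0 1 0 0.2 0.3
2 3
0 1 1 0.9 0.4
0 1 0 0.7 0.6
2 1
0 1 1 0.0 0.8
0 1 0 0.3 0.3
"""

labels = read_graph_labels(sample)
test_number = 1  # stands in for the 70000 passed to run_DGCNN.sh
print("all graphs:", Counter(labels))
print("held-out tail:", Counter(labels[-test_number:]))
```

If the tail contains only one class, or classes that never appear in the training part, shuffling the graphs before writing the file is one way to get a representative split.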

Answer 3 · 2019-11-30T19:03:14.000Z

Hi, unfortunately this implementation does not support edge features. Using edge features requires extra modules during the message passing.
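
To make "extra modules during the message passing" concrete, here is a rough sketch in plain Python of one propagation step that mixes edge features into the messages. It is a generic illustration, not DGCNN's propagation rule: the feature values, `edge_weight`, `project_edge`, and `propagate` are all made up for this example, and in a real model the projection weights would be learned.

```python
# Toy graph matching the question: 4 nodes with 2 features each and
# 3 undirected edges with 1 feature each (values are made up).
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
edges = {(0, 1): [2.0], (1, 2): [3.0], (2, 3): [4.0]}

# Hypothetical "extra module": a linear map lifting the 1-d edge feature
# into the 2-d node feature space. Shape: (edge_dim=1, node_dim=2).
edge_weight = [[0.5, 0.25]]


def project_edge(e):
    """Multiply an edge feature vector by edge_weight."""
    return [sum(e[k] * edge_weight[k][d] for k in range(len(e)))
            for d in range(len(edge_weight[0]))]


def propagate(node_feats, edges):
    """One sum-aggregation step; each message is neighbor + projected edge."""
    out = [[0.0] * len(node_feats[0]) for _ in node_feats]
    for (u, v), e in edges.items():
        pe = project_edge(e)
        for src, dst in ((u, v), (v, u)):  # undirected: send both ways
            for d in range(len(out[dst])):
                out[dst][d] += node_feats[src][d] + pe[d]
    return out


new_feats = propagate(node_feats, edges)
print(new_feats)
```

Without such a projection step, the only way to use edge information with this implementation would be to fold it into the node features beforehand, which loses the per-edge structure.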