How to train and test MemSeg in other datasets
jamdodot opened this issue · 11 comments
Hello, sorry to bother you. How can I set up training and testing on other datasets, such as KolektorSDD? Do I first have to adjust the structure of the dataset and then make some changes in the code? 🙏 @TooTouch
For example, the new dataset is 228 x 630 pixels. Should it be resized to 256 x 256?
I mainly changed anomaly_mask.json, and the datadir and batch_size in the configs.yaml file, and modified the new dataset to follow the MVTec dataset structure. Now the code runs and trains, but SEED: 42 has not changed. Does this mean that the parameters are still the same as before?
Hi, @jamdodot
> For example, the new dataset is 228 x 630 pixels. Should it be resized to 256 x 256?
Converting to 256 x 256 is the easiest way to go. However, if the aspect ratio of your existing images is important to you, you may want to leave them unconverted.
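For example, with torchvision the resize is a one-liner (a rough sketch; the repo's actual transform pipeline may differ):

```python
from torchvision import transforms

# Sketch: resize 228 x 630 KolektorSDD images to a 256 x 256 input size.
# Note that Resize((256, 256)) takes (height, width) and does NOT preserve
# the original aspect ratio.
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])
```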
> I mainly changed anomaly_mask.json, and the datadir and batch_size in the configs.yaml file, and modified the new dataset to follow the MVTec dataset structure. Now the code runs and trains, but SEED: 42 has not changed. Does this mean that the parameters are still the same as before?
Yes, the seed is for reproducibility, so repeated runs produce the same result.
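For context, a typical seeding helper looks like the sketch below; this is an illustration, not necessarily the repo's exact code:

```python
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42):
    """Seed all RNGs so repeated runs produce the same result (sketch)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
```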
Thank you for your answer. I still have some questions I would like to ask you.
1. Can the seed parameter be left unchanged?
2. Metrics are calculated every 100 iterations. I added two lines of print statements to the `evaluate` function:
   ```python
   image_masks = np.array(image_masks)
   anomaly_map = np.array(anomaly_map)
   print(image_targets)   # added
   print(anomaly_score)   # added
   auroc_image = roc_auc_score(image_targets, anomaly_score)
   ```
   What confuses me is that the values in the `image_targets` array do not change between evaluations.
3. My training curves look really weird and I don't know what's wrong. Can you give me some advice? 😭 🙏
wandb-Link
output.log
A1. The seed parameter can be changed to any number.
A2. The values in the `image_targets` array do not change between evaluations, because `shuffle` is set to `False` for the testloader.
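In other words, the test set is traversed in the same fixed order every evaluation, roughly like this (a self-contained sketch with dummy data):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in for the MemSeg test set (images + binary labels).
testset = TensorDataset(
    torch.randn(8, 3, 256, 256),
    torch.tensor([0, 0, 1, 1, 0, 1, 0, 1]),
)

# With shuffle=False the test set is iterated in the same fixed order on
# every evaluation, so image_targets comes out identical each time.
testloader = DataLoader(testset, batch_size=4, shuffle=False)

for images, targets in testloader:
    print(targets)  # same order every run
```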
A3. Can you explain what looks weird?
I modified the focal loss a few minutes ago.
You can try again with the modified focal loss (#22, c3c6e99).
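For reference, a standard binary focal loss looks like the sketch below; this is the generic formulation from Lin et al. (2017), not necessarily the exact modification in c3c6e99:

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Generic binary focal loss (illustrative sketch)."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

# Usage on a dummy 256 x 256 segmentation mask
logits = torch.randn(2, 1, 256, 256)
targets = torch.randint(0, 2, (2, 1, 256, 256)).float()
print(binary_focal_loss(logits, targets))
```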
Thanks. I can use Jupyter to view the inference results for each image in the test set (all of the test set). Some of the segmentation results are indeed not very good. Below are plots of the metrics; they don't rise gradually, they keep going up and down.
I want to know whether my run is behaving normally but just scoring low, or whether there is something wrong with my initial configuration.
I think the fluctuation can be reduced with a lower learning rate.
Since the evaluation scores have pretty much converged from the beginning, a smaller learning rate, and thus smaller parameter updates, should be enough to improve performance.
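For example (a sketch with hypothetical values; the repo sets the learning rate via configs.yaml):

```python
import torch

model = torch.nn.Conv2d(3, 1, kernel_size=3)  # stand-in for the real model
# Hypothetical values: if the default were e.g. lr=1e-3, dropping to 1e-4
# makes each parameter update smaller, which should damp the oscillation.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```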
I would like to ask about the N normal samples in the memory bank. What is the value of N, and how is it determined? 🤔 @TooTouch
There is no fixed criterion for the N normal samples of the memory bank. The value of N in this repo is the one mentioned in the MemSeg paper.
N should be a sample size large enough to capture all the features of the normal data. This will vary depending on the characteristics of your data, but you can use your domain knowledge to determine it.
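As a rough illustration of the idea (a sketch only; the encoder choice, N, and feature shapes are placeholders, not the repo's exact implementation):

```python
import torch
from torchvision.models import resnet18

# Sketch: build a memory bank from N normal training images. MemSeg stores
# features of N normal samples; here we keep a single feature map per image
# for simplicity.
N = 30  # illustrative value; use the number from the paper/configs
encoder = resnet18(weights="IMAGENET1K_V1").eval()
# Truncate the encoder at an intermediate layer (up to layer2 here).
feature_extractor = torch.nn.Sequential(*list(encoder.children())[:-4])

normal_images = torch.randn(N, 3, 256, 256)  # stand-in for N normal samples
with torch.no_grad():
    memory_bank = feature_extractor(normal_images)  # shape (N, C, H', W')
print(memory_bank.shape)
```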
Why does the test path need to contain good samples in the MVTec AD dataset structure? If there is no good sample folder, the following error occurs:

```
only one class present in y_true. roc auc score is not defined in that case
```

Is it to calculate AUROC? @TooTouch
I'm sorry for the late reply.
The good samples are needed to calculate AUROC, because AUROC is the area under the ROC curve, which is constructed from the true positive rate and the false positive rate. Calculating these rates requires the good class in a binary classification setting.
https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
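A quick self-contained demonstration:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

scores = np.array([0.2, 0.8, 0.6, 0.9])

# With both classes present (0 = good, 1 = anomalous) AUROC is defined:
print(roc_auc_score(np.array([0, 1, 0, 1]), scores))  # 1.0

# With only anomalous samples in y_true, sklearn raises:
# ValueError: Only one class present in y_true. ROC AUC score is not
# defined in that case.
roc_auc_score(np.array([1, 1, 1, 1]), scores)
```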