nuclearboy95/Anomaly-Detection-PatchSVDD-PyTorch

A trick to accelerate convergence

GlassyWing opened this issue · 11 comments

Training with the default config, it hardly works and tends to learn a trivial solution. However, only a small modification is needed: introduce negative examples into the SVDD loss (just extract different patches). With this change it converges very quickly and reaches good results.

Can you describe it more clearly? What do you mean by negative samples?

This is similar to a Siamese (twin) network. The negative example is obtained by randomly selecting one of the 8 patches adjacent to the current patch. Then maximize the cosine similarity of positive pairs and minimize the cosine similarity of negative pairs.
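
For reference, a minimal sketch of how the negative patch could be sampled, assuming K x K patches laid out on a grid and a hypothetical helper crop_patch(image, x, y, K) (not from the repo) that cuts a patch at position (x, y):

import random

def sample_negative_position(x, y, K):
    # pick one of the 8 neighbouring grid positions, one patch-size step away
    offsets = [(dx, dy) for dx in (-K, 0, K) for dy in (-K, 0, K) if (dx, dy) != (0, 0)]
    dx, dy = random.choice(offsets)
    return x + dx, y + dy

# anchor patch p1 sits at (x, y); p3 is the randomly chosen adjacent (negative) patch
# x3, y3 = sample_negative_position(x, y, K)
# p3 = crop_patch(image, x3, y3, K)  # crop_patch is a hypothetical helper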

Hi, I tried the trick but it doesn't seem to accelerate convergence.
Is it to add a negative example, say P3, to the SVDD loss, treating P1 and P2 as a positive pair and P1 and P3 as a negative pair, and then maximize and minimize their similarities respectively?

Exactly. Pseudo code (written as runnable PyTorch for clarity):

import torch.nn.functional as F

# p1 and p2 are a positive pair (overlapping patches), p3 is a negative patch
h1 = enc(p1)
h2 = enc(p2)
h3 = enc(p3)

# pull the positive pair together, push the negative pair apart
loss_pos = 1. - F.cosine_similarity(h1, h2)
loss_neg = 1. - F.cosine_similarity(h1, h3)

loss = (loss_pos - loss_neg).mean()

Many thanks!

I can't figure out from the paper how to localize the defect. Waiting for a reply. Thanks.

In general, after getting the anomaly map you need to choose a threshold to binarize it; then you can draw the bounding boxes or contours with OpenCV. There are many ways to choose the threshold, usually based on the ROC curve.
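
A minimal sketch of that post-processing, assuming an anomaly map normalized to [0, 1] and an arbitrarily chosen threshold of 0.5 (in practice you would pick it from the ROC curve on validation data):

import cv2
import numpy as np

# placeholder anomaly map; replace with the model's output, higher = more anomalous
anomaly_map = np.random.rand(256, 256).astype(np.float32)
threshold = 0.5  # assumed value; choose it e.g. from the ROC curve

# binarize the map, then extract defect regions with OpenCV (OpenCV 4.x API)
binary = (anomaly_map > threshold).astype(np.uint8) * 255
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = [cv2.boundingRect(c) for c in contours]  # (x, y, w, h) for each defect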

@GlassyWing But how do you choose the positive example? Thanks!

A patch with a high overlap with the selected patch can be used as a positive example, say an overlap rate of more than 3/4. Alternatively, you can directly apply image augmentation to the selected patch to produce a positive example, such as adjusting brightness and contrast. For all of these methods you can refer to the unsupervised contrastive learning literature.
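
A rough sketch of the first option, jittering the anchor position by at most K/4 per axis so the two K x K patches keep at least 3/4 overlap along each axis (the exact jitter range is an assumption, not from the repo):

import random

def sample_positive_position(x, y, K):
    # shift by at most K/4 in each axis so the overlap with the anchor stays high
    max_shift = K // 4
    dx = random.randint(-max_shift, max_shift)
    dy = random.randint(-max_shift, max_shift)
    return x + dx, y + dy

# alternatively, apply a photometric augmentation to the anchor patch,
# e.g. torchvision.transforms.ColorJitter for brightness/contrast changes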

How long does one epoch take, and why is the GPU utilization so low?