nuclearboy95/Anomaly-Detection-PatchSVDD-PyTorch

A trick to accelerate convergence

GlassyWing opened this issue · 11 comments

Training with the default config, it hardly works and tends to learn a trivial solution. However, only a small modification is needed: introduce negative examples into the SVDD loss (just extract different patches). With this change it converges very quickly and reaches good results.

Can you describe it more clearly? What do you mean by negative samples?

This is similar to a Siamese (twin) network. The negative example is obtained by randomly selecting one of the 8 patches adjacent to the current patch. Then maximize the cosine similarity of positive pairs and minimize the cosine similarity of negative pairs.
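
For reference, a minimal sketch of how the negative patch could be sampled, assuming K x K patches laid out on a grid and a hypothetical helper crop_patch(image, x, y, K) (not from the repo) that cuts a patch at position (x, y):

import random

def sample_negative_position(x, y, K):
    # pick one of the 8 neighbouring grid positions, one patch-size step away
    offsets = [(dx, dy) for dx in (-K, 0, K) for dy in (-K, 0, K) if (dx, dy) != (0, 0)]
    dx, dy = random.choice(offsets)
    return x + dx, y + dy

# anchor patch p1 sits at (x, y); p3 is the randomly chosen adjacent (negative) patch
# x3, y3 = sample_negative_position(x, y, K)
# p3 = crop_patch(image, x3, y3, K)  # crop_patch is a hypothetical helper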

Hi, I tried the trick but it doesn't seem to accelerate convergence.
Is it to add a negative example, say P3, to the SVDD loss, treating P1 and P2 as a positive pair and P1 and P3 as a negative pair, and then maximize and minimize their similarities respectively?

Exactly. Pseudo code (written as runnable PyTorch for clarity):

import torch.nn.functional as F

# p1 and p2 are a positive pair (overlapping patches), p3 is a negative patch
h1 = enc(p1)
h2 = enc(p2)
h3 = enc(p3)

# pull the positive pair together, push the negative pair apart
loss_pos = 1. - F.cosine_similarity(h1, h2)
loss_neg = 1. - F.cosine_similarity(h1, h3)

loss = (loss_pos - loss_neg).mean()

Many thanks!

I can't figure out from the paper how to localize the defect. Waiting for a reply. Thanks.

In general, after getting the anomaly map you need to choose a threshold to binarize it; then you can draw the bounding boxes or contours with OpenCV. There are many ways to choose the threshold, usually based on the ROC curve.
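
A minimal sketch of that post-processing, assuming an anomaly map normalized to [0, 1] and an arbitrarily chosen threshold of 0.5 (in practice you would pick it from the ROC curve on validation data):

import cv2
import numpy as np

# placeholder anomaly map; replace with the model's output, higher = more anomalous
anomaly_map = np.random.rand(256, 256).astype(np.float32)
threshold = 0.5  # assumed value; choose it e.g. from the ROC curve

# binarize the map, then extract defect regions with OpenCV (OpenCV 4.x API)
binary = (anomaly_map > threshold).astype(np.uint8) * 255
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = [cv2.boundingRect(c) for c in contours]  # (x, y, w, h) for each defect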

@GlassyWing But how do you choose the positive example? Thanks!

A patch with a high overlap with the selected patch can be used as a positive example, say an overlap rate of more than 3/4. Alternatively, you can directly apply image augmentation to the selected patch to produce a positive example, such as adjusting brightness and contrast. For all of these methods you can refer to the unsupervised contrastive learning literature.
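
A rough sketch of the first option, jittering the anchor position by at most K/4 per axis so the two K x K patches keep at least 3/4 overlap along each axis (the exact jitter range is an assumption, not from the repo):

import random

def sample_positive_position(x, y, K):
    # shift by at most K/4 in each axis so the overlap with the anchor stays high
    max_shift = K // 4
    dx = random.randint(-max_shift, max_shift)
    dy = random.randint(-max_shift, max_shift)
    return x + dx, y + dy

# alternatively, apply a photometric augmentation to the anchor patch,
# e.g. torchvision.transforms.ColorJitter for brightness/contrast changes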

How long does one epoch take, and why is the GPU utilization so low?