DanielCoelho112/synfeal

Replicate data augmentation as done by current state of the art

Closed this issue · 42 comments

Discussion of what method would be better to implement this feature:

Current ideas:

First, I tried using the same dataset, detecting the objects and then placing black boxes on them, but I had many false positives or missed detections.

I then tried reading the camera movement from a dataset with objects and replicating it. But in that case the black boxes will be placed randomly.

Hi @andrefdre ,

just so I understand better: the goal is to have images augmented with a 3D model in the case of our approach, and then to have the exact same images augmented with a black box.

Is that it? I mean, if you do not need the exact same images you could just generate a random black box of variable position and size...

Is that it?

Yes that's it.

During the last meeting, I don't remember who said it, but someone said it would be better to have the black boxes covering the objects. Then, when arguing that our model is better, we could say both datasets have augmentation in the same places, ours just being more realistic thanks to the 3D model.

So my suggestion is to run a yolo detector on the images looking for persons and chairs, and wherever you detect persons paint the box black.
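A minimal sketch of that idea, assuming the pretrained yolov5s model from torch.hub and COCO class names (the model choice and file names are my assumption, not something fixed in this thread):

```python
# Sketch: detect persons and chairs with a pretrained YOLOv5 model and paint
# their bounding boxes black. The model choice (yolov5s), the COCO class names
# and the file names are illustrative assumptions.
import cv2
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # downloads pretrained weights

def black_out_objects(image_path, output_path, wanted=('person', 'chair')):
    image = cv2.imread(image_path)              # BGR image
    results = model(image[..., ::-1])           # the YOLOv5 hub model expects RGB
    for *xyxy, conf, cls in results.xyxy[0].tolist():
        if results.names[int(cls)] in wanted:
            x1, y1, x2, y2 = map(int, xyxy)
            image[y1:y2, x1:x2] = 0             # paint the detected box black
    cv2.imwrite(output_path, image)

black_out_objects('frame-00001.rgb.png', 'frame-00001.rgb.blackbox.png')
```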

So my suggestion is to run a yolo detector on the images looking for persons and chairs, and wherever you detect persons paint the box black.

I didn't try with YOLO yet; I will try it tomorrow. Today I only tested with MediaPipe.

Hi @miguelriemoliveira and @andrefdre,

I would say to compare our approach to the state-of-the-art, which I think is this paper: Random Erasing Data Augmentation. https://arxiv.org/pdf/1708.04896.pdf

image

In pytorch you can use: https://pytorch.org/vision/main/generated/torchvision.transforms.RandomErasing.html
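A minimal sketch of how that baseline plugs into a torchvision pipeline (the parameter values shown are torchvision's defaults, with value=0 so the erased patch is black like the boxes in our method):

```python
# Sketch: Random Erasing baseline via torchvision.
# RandomErasing operates on tensors, so it must come after ToTensor().
from torchvision import transforms

augmentation = transforms.Compose([
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5,               # probability of erasing an image
                             scale=(0.02, 0.33),  # erased area as a fraction of the image
                             ratio=(0.3, 3.3),    # aspect ratio range of the erased box
                             value=0),            # fill value (black)
])
```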

I would say to compare our approach to the state-of-the-art, which I think is this paper: Random Erasing Data Augmentation.

So your suggestion is to just do data augmentation randomly (in a dataset without objects) instead of covering the objects?

So your suggestion is to just do data augmentation randomly (in a dataset without objects) instead of covering the objects?

I think the baseline should be the random erasing. Additionally, we can also compare with the method you described, but I have never seen anyone doing that... I doubt that there is a paper we can cite. Nevertheless, we can always say that it is an improvement of the Random Erasing Data Augmentation.

I would say that we could do both at first: 1 - our method (3D objects), 2 - SoA random erasing. Then, if everything goes smoothly, we could try removing the objects directly in the images as an alternative.

Since the training is done based on the RGB images, I guess the 3D method vs. segmentation and removal of objects in 2D images should yield similar results (?). The only difference will be the use of a box instead of the real objects.

Another relevant question, in my opinion, is what to test the trained models on... The random method might work well with random images, but how is it going to work when tested with real 3D objects projected onto 2D images? Need to think a little bit about it!

Anyway for now I would say: Our methods vs random erasing and then we will see.

I'm currently creating a way of replicating the camera steps of another dataset. With that implemented I will train two models one with our method and another with random erasing.

I agree we can try both the random and the bbox where the objects are.

Since the training is done based on the RGB images, I guess the 3D method vs. segmentation and removal of objects in 2D images should yield similar results (?). The only difference will be the use of a box instead of the real objects.

I hope not, I hope results are much better with the objects in 3D. That is actually the reasoning behind this work, that using the 3D objects instead of rough bboxes will improve the results.

Hi @andrefdre ,

I'm currently creating a way of replicating the camera steps of another dataset. With that implemented I will train two models one with our method and another with random erasing.

not sure I understand this. I mean, for creating a new dataset using random erasing, don't you just have to copy the entire dataset with objects, and then go through all images, randomize a box position and size, and put it on the image?

Sorry @andrefdre , now I understand why we need what you are discussing. Because for the random boxes they should be inserted in images without the objects.

If we placed the black boxes on top of the objects, we could just create a copy of the dataset. But from my attempts, it doesn't seem like I can automate detecting the objects; at least YOLO doesn't have classes for all the objects I have.

These are very strange images ...

I started getting results for lights and our method seems good, but the augmentation as done by the state of the art, and both methods combined, seem very weird.

image

I used this for augmentation as suggested previously:

transforms.ColorJitter(brightness=.5, hue=0)
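For context, a minimal sketch of how that line would sit in a training transform; only the ColorJitter call is taken from this thread, the rest of the pipeline is an assumption:

```python
# Sketch of the brightness augmentation inside a torchvision pipeline.
# Only the ColorJitter parameters come from the discussion above; the rest
# is illustrative.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.ColorJitter(brightness=.5, hue=0),  # jitter brightness, keep hue fixed
    transforms.ToTensor(),
])
```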

@DanielCoelho112 and @miguelriemoliveira do you have any theory?

Video with brightness augmentation https://www.youtube.com/watch?v=Y8KI9bcpeD4

Thanks. I think the darkest images are too dark. Can you post some of the darkest images?

Can you post some of the darkest images?

frame-00254 rgb

Too dark, don't you think?

Too dark, don't you think?

I agree; I will generate a video with 0.3.

Darkest image with 0.3:

frame-00115 rgb

A video? You mean a dataset? Also, am I supposed to know what the 0.3 means :- ) ?

The image is still very dark, I think. A bit less perhaps ... @DanielCoelho112? What do you say?

A video? You mean a dataset? Also, am I supposed to know what the 0.3 means :- ) ?

According to the documentation, 0.3 is the amount by which to jitter the brightness. Video with 0.3: https://youtu.be/1r5tloG-RGo

The image is still very dark, I think. A bit less perhaps ... @DanielCoelho112? What do you say?

It just occurred to me that we are only darkening the images, but don't we also want the opposite? Comparing with the dataset with lights, that one has brighter images as well as darker ones.

Hi @miguelriemoliveira, @andrefdre,

It just occurred to me that we are only darkening the images, but don't we also want the opposite? Comparing with the dataset with lights, that one has brighter images as well as darker ones.

We want both cases. Darker and lighter images.

@DanielCoelho112 and @miguelriemoliveira do you have any theory?

Since the model is worse than the baseline, I would follow the suggestion of @miguelriemoliveira and reduce the magnitude of the augmentation applied.

When we apply extreme augmentations, the models usually perform poorly (https://research.unl.pt/ws/portalfiles/portal/44736222/Brightness_as_an_Augmentation_Technique_for_Image_Classification.pdf).

Right. I agree. Also brighten the images and reduce the maximum magnitude of the transformation.

I get dizzy when watching the video ...

Right. I agree. Also brighten the images and reduce the maximum magnitude of the transformation.

I read through the ColorJitter documentation again and finally understood how to increase the brightness. Now the brightness varies between 0.5× and 2× the original image (see the sketch below). The video with these settings: https://youtu.be/buwznRVJDWY. I also lowered the FPS in the video; I hope it looks better now.
Should I increase the maximum brightness or lower it?
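A minimal sketch of that parameterization, assuming torchvision's ColorJitter; passing a (min, max) tuple makes the brightness factor be drawn uniformly from that interval:

```python
# Sketch: with a (min, max) tuple, the brightness factor is sampled uniformly
# from [0.5, 2.0], so images are both darkened (down to 0.5x) and brightened
# (up to 2x).
from torchvision import transforms

brightness_jitter = transforms.ColorJitter(brightness=(0.5, 2.0), hue=0)
```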

After the current training, I will train with these new settings if @miguelriemoliveira agrees with this configuration.

Right now I continued training the models that don't require augmentation for lights and objects study.

The video looks much better now. In any case, it's hard to check the appearance of the darkest and brightest images. Can you post them?

Darkened:
frame-00037 rgb

Original:
frame-00037 rgb

Brightened:
frame-00027 rgb

Original:
frame-00027 rgb

Not sure... I think you should compare that image with an image taken from the scene without lights. That way we can see how dark the images can get.

I think you should compare that image with an image taken from the scene without lights

Without lights? This is from a dataset without any lighting, and then data augmentation was used.

Hm, forgot about that... I think we should've applied the data augmentation on a dataset with some lights on to simulate what happens in reality. But it's not critical.

Given this, I agree with @miguelriemoliveira. I think we should reduce the level of darkness applied.

Maybe this is the reason why the results were so bad. We are choosing the darkest images, and then we are darkening them even more. And then in the test set all images are brighter.

These images are from a dataset with lights. I don't know if it's because they are darker; one thing is for sure, the results weren't good because I was only darkening the images, which I will now also brighten.
Dark:
frame-00059 rgb

Bright:
frame-00025 rgb

Not sure I understood what you said ...

The augmentation I was doing was only darkening images, not brightening them.

Ok, so I think we have a good reason why the training could be going wrong.

But again, my point is that we cannot have images as dark as the one you have above. We should darken and brighten, but not so much that the image becomes almost all black or white. In those extreme cases we cannot expect good localization.

But again, my point is that we cannot have images as dark as the one you have above. We should darken and brighten, but not so much that the image becomes almost all black or white. In those extreme cases we cannot expect good localization.

For the last images we actually have good results, since they come from our implementation. I didn't explain that properly before; in any case, I'm hopeful it will improve now.

Hi, sorry for the silence these days, but I was off for a couple of days and had many other duties. If nothing else, André can at least open a virtual nightclub in a church with strobe effects :-)! I'm not sure I'm following everything, but the question of "what is too dark" is not easy... The room might be darker and this can influence the localization, but this is true for any system. Maybe the easiest would be to consider the model as a baseline and only add light, at least to validate the model. This is an engineering problem, I guess: if we dim the light, localization will be worse (even for a human), not because of our approach but because the camera has less information... At this point the important thing is to think about a method to validate our approach. Just increasing the light should do the trick: if it improves, it shows the system is more robust to light changes. Actually, the best validation will always be with real data from a real room at different times of day. Not an easy answer!

I don't agree that the problem is the room being too dark, since with our method we have good results. The issue here is that when we apply data augmentation to a dataset without any light manipulation, the results are worse than without any augmentation. Also, when we apply augmentation in conjunction with our method, the results also get worse. Moreover, our method has dark images and still performs well. I was tweaking the augmentation parameters to try to get similar maximum and minimum brightness between the dataset generated using the simulator and the augmented dataset.

When we apply extreme augmentations, the models usually perform poorly (https://research.unl.pt/ws/portalfiles/portal/44736222/Brightness_as_an_Augmentation_Technique_for_Image_Classification.pdf).

I think Daniel has a point here. I just don't know if it's worth trying to tweak the augmentation parameters, or if we should simply accept the results that the state-of-the-art method gives, not just for the thesis but also already thinking about the paper.

Table with tests; it's currently a bit confusing, but the first five lines use the same parameters as before, with a 0.5 brightness threshold, then one line with 0.3, one with 0.2, and lastly one with a minimum threshold of 0.5 and a maximum threshold of 2. Only the last one actually brightens the images, while the others only darken them, since I hadn't understood that yet at the time.

image