sthalles/deeplab_v3

How to run your code?

John1231983 opened this issue · 20 comments

Thanks for your work on DeepLab. Could you tell me how to prepare the dataset and run your code? Thanks

Hello @John1231983, this repository is still under development. I will release a blog post with all the relevant information when it is done. Thanks for your message.

Hi. I found something missing in your ASPP. The authors use batch norm and train it inside the ASPP module, but you did not. A main contribution of this DeepLab version is training the BN parameters within ASPP.

Hi. Could you tell me what mIoU you achieved on the validation set? I only got around 73%.

Hello, if you take a look at the convolution arg_scope, you will see that batch normalization is applied to every convolution op, so I do not need to specify it separately. It is there.
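For reference, here is a minimal sketch (not the exact code from this repo) of how a tf.contrib.slim arg_scope can attach batch norm to every conv op, so the ASPP branches inherit it automatically:

```python
import tensorflow as tf

slim = tf.contrib.slim

def conv_arg_scope(is_training, batch_norm_decay=0.9997):
    # Every slim.conv2d created inside this scope gets batch norm as its
    # normalizer, so it does not have to be specified at each call site.
    batch_norm_params = {
        'decay': batch_norm_decay,
        'is_training': is_training,
    }
    with slim.arg_scope([slim.conv2d],
                        normalizer_fn=slim.batch_norm,
                        normalizer_params=batch_norm_params) as sc:
        return sc

# Usage sketch: an atrous ASPP branch picks up batch norm from the scope.
def aspp_branch(net, rate, is_training):
    with slim.arg_scope(conv_arg_scope(is_training)):
        return slim.conv2d(net, 256, [3, 3], rate=rate,
                           scope='aspp_rate%d' % rate)
```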

mIoU stands for Mean Intersection over Union, and it is an evaluation metric. I haven't trained it yet, though.
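For concreteness, a minimal NumPy sketch of the metric, assuming integer label maps and an ignore index of 255 (a common segmentation convention, not necessarily this repo's):

```python
import numpy as np

def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean Intersection over Union across classes.

    pred, label: integer arrays of the same shape.
    Pixels where label == ignore_index are excluded.
    """
    mask = label != ignore_index
    pred, label = pred[mask], label[mask]
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (label == c))
        union = np.sum((pred == c) | (label == c))
        if union > 0:  # skip classes absent from both pred and label
            ious.append(inter / union)
    return np.mean(ious)
```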

Great work. I am very happy to see your progress. Could you also please add the script that generates the TFRecord files and a link to download the pre-trained model from the slim team? I am waiting for the TFRecord generator.

Thanks for your input @John1231983. I am finishing training right now; in the next few hours I will be posting the rest of the code alongside the blog post.

Great. One more thing: if you use Adam instead of the optimization from the original paper, you should use a smaller learning rate. Also, try a batch size of 8 because of memory limitations.
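Something like this, where the learning rate is an illustrative value rather than this repo's setting:

```python
import tensorflow as tf

# Dummy variable and loss just to keep the snippet self-contained.
w = tf.get_variable('w', shape=[10], initializer=tf.zeros_initializer())
loss = tf.reduce_sum(tf.square(w - 1.0))

# With Adam, a much smaller base rate than the paper's 7e-3 is typical;
# 1e-4 here is illustrative.
train_op = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)
```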

And please also provide the mIoU on the validation set if possible. I want to compare it with my current performance. I only got 74% after 120k iterations.

Please check --batch_norm_decay. It should be 0.9997 instead of 0.997. Btw, what mIoU did you achieve?

Thanks for the update. It looks like you used ResNet-50 for this validation. Could you use ResNet-101 and share the result? ResNet-50 is not investigated in the paper for ASPP, so we cannot guarantee that the reproduction is close to the paper's performance.

Hello, yes, I used ResNet-50. I plan to use ResNet-101 to see the results. As the paper states in various comparisons, increasing the model capacity should yield slightly better results. I will run it and update the README.

Great. One more thing: the paper uses momentum optimization with a learning rate of 7e-3, while you used Adam. In my experiments, momentum works better for DeepLab. Have you tried it before? I think the learning rate schedule is also important for performance, so following the paper's rule may help get close to the reported numbers.

I did not try Momentum. However, Adam adapts the learning rate internally as part of its update rule. It's a good thing to test with momentum though; if I have a chance I will do it. Thanks for the input.
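For reference, a sketch of the paper-style setup with momentum and the "poly" schedule lr * (1 - step/max_steps)^0.9 from the DeepLab papers; the loss and the step count are illustrative, not this repo's configuration:

```python
import tensorflow as tf

# Dummy loss so the snippet is self-contained.
w = tf.get_variable('w', shape=[10], initializer=tf.zeros_initializer())
loss = tf.reduce_sum(tf.square(w - 1.0))

global_step = tf.train.get_or_create_global_step()
max_steps = 30000  # illustrative training length

# "Poly" policy: 7e-3 * (1 - step/max_steps)**0.9.
learning_rate = tf.train.polynomial_decay(
    learning_rate=7e-3,
    global_step=global_step,
    decay_steps=max_steps,
    end_learning_rate=0.0,
    power=0.9)

train_op = tf.train.MomentumOptimizer(learning_rate, momentum=0.9).minimize(
    loss, global_step=global_step)
```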

Good job. I think for people who are not familiar with TFRecord, it would be better to provide the script that builds the TFRecord file.

Hello. I got this error in the test phase:

Traceback (most recent call last):
  File "test.py", line 143, in <module>
    label_image = np.reshape(label_image, (heights[i], widths[i]))
  File "/home/john/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 257, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/home/john/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 52, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: cannot reshape array of size 170165 into shape (362,500)

Please make sure you have the latest snapshot of the repo. I believe the error will disappear if you update the code.

I am using the latest one; I cloned it just 5 hours ago. Btw, the model looks like it's overfitting when I train with ResNet-101 v2.

So, regarding the error: your TFRecord must contain the image, the annotation, and the real height and width. What is happening is that you are trying to reshape an image to the wrong shape. Regarding ResNet-101, you are going to need to adjust the hyper-parameters, since the ones there were tuned for ResNet-50.
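For concreteness, a sketch of what the parsing side might look like; the feature key names here are assumptions and must match whatever the generator actually wrote:

```python
import tensorflow as tf

def parse_record(serialized_example):
    # Hypothetical feature keys; they must match the TFRecord generator.
    features = tf.parse_single_example(
        serialized_example,
        features={
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string),
            'annotation_raw': tf.FixedLenFeature([], tf.string),
        })
    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)

    image = tf.decode_raw(features['image_raw'], tf.uint8)
    annotation = tf.decode_raw(features['annotation_raw'], tf.uint8)

    # Reshape with the *stored* dimensions; using the wrong height/width
    # produces exactly the "cannot reshape array of size N" error above.
    image = tf.reshape(image, [height, width, 3])
    annotation = tf.reshape(annotation, [height, width])
    return image, annotation
```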

Could you provide your script for generating the TFRecord files? Thanks.

It is there.
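For reference, a rough sketch of what such a generator does; the file handling and key names are assumptions to mirror the hypothetical parser above, not the repo's actual script:

```python
import numpy as np
import tensorflow as tf
from PIL import Image

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def write_tfrecord(image_paths, annotation_paths, output_path):
    """Writes (image, annotation, height, width) examples to a TFRecord.

    Key names must stay in sync with whatever reads the file back.
    """
    writer = tf.python_io.TFRecordWriter(output_path)
    for img_path, ann_path in zip(image_paths, annotation_paths):
        image = np.array(Image.open(img_path))       # HxWx3 uint8
        annotation = np.array(Image.open(ann_path))  # HxW uint8 labels
        height, width = image.shape[0], image.shape[1]
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(height),
            'width': _int64_feature(width),
            'image_raw': _bytes_feature(image.tobytes()),
            'annotation_raw': _bytes_feature(annotation.tobytes()),
        }))
        writer.write(example.SerializeToString())
    writer.close()
```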