zhaw/neural_style

There's a dark band on the right side of the output image

wlxyhy opened this issue · 9 comments

While processing faces, I found that on the right side of the output image there's a narrow band which is darker than the surrounding area. It's especially obvious in tmp0.jpg; see the picture below for details.

Another problem: I tried to run train.py in the "perceptual" folder to train models, but encountered this:

import train as tr
tr.train_style(0.3,'models/s5','style/picasso.jpg')
[17:49:48] /home/chaoxin/mxnet/dmlc-core/include/dmlc/logging.h:235: [17:49:48] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
[17:49:48] /home/chaoxin/mxnet/dmlc-core/include/dmlc/logging.h:235: [17:49:48] src/engine/./threaded_engine.h:306: [17:49:48] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
what(): [17:49:48] src/engine/./threaded_engine.h:306: [17:49:48] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

What's the matter?

zhaw commented

About the dark band problem: I have run into the same problem before. In my experience, a dark band appears at the bottom of the image once the resolution is too high. I forget how I solved it on my computer, but if I remember correctly, it was a skimage problem. Try upgrading skimage and see if that helps. I'm not sure about this because I can't reproduce the problem now, sorry about that.

As for the second problem, it seems that mxnet couldn't generate random numbers using CURAND. Try

import mxnet as mx
a = mx.random.normal(0, 1e-8, [7776], mx.gpu())

Does the same error happen?

That does not solve the problem; I get almost the same errors:

a=mx.random.normal(0, 1e-8, [7776], mx.gpu())

[09:49:00] /home/chaoxin/mxnet/dmlc-core/include/dmlc/logging.h:235: [09:49:00] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
[09:49:00] /home/chaoxin/mxnet/dmlc-core/include/dmlc/logging.h:235: [09:49:00] src/engine/./threaded_engine.h:306: [09:49:00] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
what(): [09:49:00] src/engine/./threaded_engine.h:306: [09:49:00] /home/chaoxin/mxnet/mshadow/mshadow/./random.h:328: Check failed: (status) == (CURAND_STATUS_SUCCESS) CURAND Gen Normal float failed. size = 7776,mu = 0,sigma = 1e-08
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

zhaw commented

That means there is something wrong with CUDA or mxnet. Sorry, I can't help with that; it has never happened to me before.
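The error log above suggests setting MXNET_ENGINE_TYPE to NaiveEngine so that operations run synchronously and the backtrace points at the failing call. A minimal sketch of one way to do that without leaving the variable set afterwards is to apply it only to a child process. The helper names `naive_engine_env` and `run_synchronously` are hypothetical and not part of mxnet or this repository; only the variable name and value come from the log.

```python
import os
import subprocess
import sys


def naive_engine_env(base_env=None):
    """Return a copy of the environment with MXNet's synchronous engine selected.

    The original mapping is left untouched, so nothing has to be reset
    after debugging.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"  # value taken from the error message
    return env


def run_synchronously(script_path):
    """Run a Python script in a child process using the modified environment."""
    return subprocess.run([sys.executable, script_path], env=naive_engine_env())
```

For example, `run_synchronously("repro.py")` would run a small reproduction script (the file name is an assumption) with the synchronous engine, while the parent shell keeps its normal settings.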

I reinstalled mxnet, and now it can train models.
But the dark band problem still exists.

zhaw commented

Have you upgraded skimage?
To check whether it is caused by skimage, you can use the crop_img function to process an image and save the result. If the dark band shows up, then it is a skimage problem.
To find out where things go wrong, you can save the output image after each step and see where the dark band first appears.
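To make the "where does the dark band first appear" check less subjective, one option is to compare the brightness of the rightmost columns with the columns just to their left. This is only a rough sketch: `right_edge_ratio` and its `band` parameter are hypothetical names, not part of this repository, and the code assumes the saved image has already been converted to a 2-D grayscale list of rows (for a NumPy array, `.tolist()` would give that).

```python
from statistics import mean


def right_edge_ratio(gray_rows, band=4):
    """Mean brightness of the last `band` columns divided by the mean
    brightness of the `band` columns immediately to their left.

    `gray_rows` is a 2-D grayscale image given as a list of rows.
    A ratio well below 1.0 hints at a dark band on the right edge.
    """
    width = len(gray_rows[0])
    if width < 2 * band:
        raise ValueError("image is too narrow for the requested band width")
    # Pixels in the suspected band at the right edge.
    edge = [v for row in gray_rows for v in row[width - band:]]
    # Pixels in the reference strip directly to the left of that band.
    neighbour = [v for row in gray_rows for v in row[width - 2 * band:width - band]]
    neighbour_mean = mean(neighbour)
    if neighbour_mean == 0:
        # Reference strip is fully black, so the edge cannot be darker than it.
        return 1.0
    return mean(edge) / neighbour_mean
```

Running this on the image saved after each step gives one number per step; the first step where the ratio drops noticeably is the one to look at more closely.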

My scikit-image version is 0.12.3, which seems to be the newest version. The dark band was discovered in tmp0.jpg.

zhaw commented

There are many steps before the output is generated. First we load the content image, then we crop and resize it to the target size, then we apply histogram equalization to enhance it, then we copy it to the GPU and start the optimization. A dark band in tmp0.jpg only tells us that the band is present after optimization. But which step causes it? Unless you provide more information, I can't help you. I also can't track it down for you, because I can't reproduce the problem.
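The steps listed above can be checked one at a time. As a rough sketch (not the repository's code), the helper below runs a list of stage functions in order and reports the first stage whose output fails a check. The names `first_bad_stage`, `stages`, and `looks_ok` are hypothetical; the actual load, crop/resize, histogram-equalization, and optimization functions would have to be plugged in as stages.

```python
def first_bad_stage(image, stages, looks_ok):
    """Run `stages` in order and return the name of the first stage whose
    output fails `looks_ok`, or None if every output passes.

    `stages` is a list of (name, function) pairs; each function takes the
    previous stage's output and returns the next one.
    """
    current = image
    for name, stage in stages:
        current = stage(current)
        if not looks_ok(current):
            return name
    return None
```

Here `looks_ok` could be any test for the artifact, for instance a comparison of the edge columns against their neighbours, or simply saving the image and inspecting it by eye.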

I found that if I train the models first and then produce the styled images, the dark band disappears.