Process KILLED issue in gen_hard_example.py

Question

Zepyhrus opened this issue 6 years ago · 4 comments

"""
time cost in average1.144 pnet 1.144 rnet 0.000 onet 0.000
boxes length: 12880
finish detecting
save_path is :
../images/no_LM12/RNet
Killed
"""

Hi mate, thanks for your brilliant open-source work. I hit this issue while generating training examples for RNet.

I'm using:

tensorflow-gpu==1.12.0;
GPU: Nvidia Quadro P2000, 5 GB;
Python == 3.6.8;
Ubuntu 18.04;

Any idea what causes this? Could it be an out-of-memory (OOM) kill, and if so, how can I solve it?

Answer 1 · 2019-05-25T11:17:17.000Z

This is so frustrating: each run of the test part takes around 3 hours and then gets killed during the pickle.dump step. Is it because of the huge size of detections, which is generated by mtcnn_detector.detect_face(test_data)?

Answer 2 · 2019-06-05T09:27:51.000Z

Problem solved. This is what the author is doing: load images from disk explicitly -> generate the boxes (at this point, all the boxes are cached in memory) -> dump the boxes to disk for the convenience of debugging -> restore the dumped pickle file later to save the hard example images.

It is so hard to believe that someone would save an intermediate variable to disk just for debugging...
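
For anyone who still wants the dump on disk, here is a minimal sketch of one way to keep memory bounded: pickle one record per image into the same file instead of dumping a single huge object, then read the records back one at a time. It assumes the detections can be produced image by image, which is not how the original script calls the detector. fake_detections, dump_boxes_incrementally and load_boxes_incrementally are hypothetical names made up for illustration, not part of the repo.

```python
import pickle
import tempfile
from pathlib import Path


def dump_boxes_incrementally(box_iter, path):
    """Write one pickled record per image instead of one giant object."""
    count = 0
    with open(path, "wb") as f:
        for boxes in box_iter:
            pickle.dump(boxes, f, protocol=pickle.HIGHEST_PROTOCOL)
            count += 1
    return count


def load_boxes_incrementally(path):
    """Yield the records back one at a time, in the order they were written."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # End of file reached: no more records.
                return


def fake_detections(n_images):
    # Hypothetical stand-in for per-image detector output:
    # each record is a list of [x1, y1, x2, y2, score] rows.
    for i in range(n_images):
        yield [[i, i, i + 12, i + 12, 0.9]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "detections.pkl"
        n = dump_boxes_incrementally(fake_detections(3), path)
        restored = list(load_boxes_incrementally(path))
        print(n, len(restored))
```

Only one image's boxes are held in memory at any point during the dump, and the same is true when reading, as long as the caller iterates over the generator instead of collecting it into a list.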

Answer 3 · 2019-07-20T04:15:43.000Z

@Zepyhrus Hey, I'm sorry to bother you, but I'm facing the same problem and I don't know how to fix it. Can you please tell me what I should do to solve it? Thank you so much.

Answer 4 · 2019-08-24T02:15:00.000Z

@yaoyao14 Hi, here is the thing, hope this still helps:

You actually don't need to fix it; as I mentioned above, this is not the right way to write a pipeline.
If you still want to use the author's pipeline without this bug, you can run box generation and example saving together and get rid of the dump step (or just use some other format, like .csv, instead of .pickle). This is what I did.
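
A rough sketch of the idea, not the actual code from the repo: generate the boxes for one image, write them out immediately, and move on, so nothing accumulates in memory and no pickle is needed. detect_boxes is a hypothetical stand-in for the real detector, and writing CSV rows stands in for the real crop-and-save step.

```python
import csv
import io


def detect_boxes(image_paths):
    # Hypothetical stand-in for the detector: yields (path, boxes)
    # for one image at a time instead of returning everything at once.
    for idx, path in enumerate(image_paths):
        yield path, [(idx, 0, 24 + idx, 24, 0.95)]


def save_hard_examples(image_paths, out_file):
    """Write each image's boxes as soon as they are generated."""
    writer = csv.writer(out_file)
    writer.writerow(["image", "x1", "y1", "x2", "y2", "score"])
    rows = 0
    for path, boxes in detect_boxes(image_paths):
        for x1, y1, x2, y2, score in boxes:
            writer.writerow([path, x1, y1, x2, y2, score])
            rows += 1
    return rows


if __name__ == "__main__":
    buf = io.StringIO()
    print(save_hard_examples(["a.jpg", "b.jpg"], buf))
    print(buf.getvalue())
```

With a real file instead of io.StringIO, open it with newline="" as the csv module docs recommend.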