kentonl/e2e-coref

Please help me understand

Closed this issue · 2 comments

JKP0 commented

What is the meaning of these lines in the README?
Screenshot (195)

Does the user need to change anything in the commands to train or test, or make any other specific changes?

Linux with GPU RTX 2080 Ti
TensorFlow 1.14.0
Python 3.6.8

I'm facing problems in training: the code goes silent after a few minutes and stays that way for a very long time.

I'm facing the same problem. Are there any ideas?

JKP0 commented

@IvyGao58
What got me past the issue
"facing problems in training, the code becomes silent after a few minutes for a long long time"
was correcting the file paths and running setup_training.sh to completion without failure. setup_training.sh fails when the file paths are incorrect or when OntoNotes 5.0 cannot be accessed.
If your code just sits there silently, it is not really running. Check your JSON files; they may be empty.
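
A quick way to check for empty input files before starting training is a small script like the one below. This is only a sketch: `find_empty_inputs` is a hypothetical helper, not part of the repository, and the file names are assumptions, so substitute the paths from your own experiment config.

```python
import os


def find_empty_inputs(paths):
    """Return the paths that are missing or contain no non-blank lines."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            # The preprocessing step never produced this file.
            problems.append(path)
            continue
        with open(path, encoding="utf-8") as f:
            # A usable file has at least one non-blank line (one JSON document).
            if not any(line.strip() for line in f):
                problems.append(path)
    return problems


if __name__ == "__main__":
    # Assumed file names; replace them with the paths your config points to.
    candidates = ["train.english.jsonlines", "dev.english.jsonlines"]
    for path in find_empty_inputs(candidates):
        print("missing or empty:", path)
```

If this prints anything, rerun setup_training.sh and watch its output for path or OntoNotes errors.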

About these lines in the README:

* It does not use GPUs by default. Instead, it looks for the `GPU` environment variable, which the code treats as shorthand for `CUDA_VISIBLE_DEVICES`.
* The training runs indefinitely and needs to be terminated manually. The model generally converges at about 400k steps.

I don't know, but it detects the GPU automatically.
The issue may be caused by requirements.txt installing tensorflow-gpu>=1.13.1 (the pip version), which may be incompatible with your CUDA (cuDNN) installation.
Remove TensorFlow from requirements.txt and install it via conda; that may resolve the issue.
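
If training still ends up on the CPU, the README lines quoted above suggest setting `GPU` explicitly. Here is a minimal sketch of how such a shorthand could work. `apply_gpu_shorthand` is a hypothetical helper written for illustration, not the repository's actual code, and it would have to run before TensorFlow initializes CUDA.

```python
import os


def apply_gpu_shorthand(environ=os.environ):
    """Copy GPU into CUDA_VISIBLE_DEVICES; an unset GPU hides all devices."""
    # An empty CUDA_VISIBLE_DEVICES exposes no GPUs to the process,
    # which matches "does not use GPUs by default".
    gpu = environ.get("GPU", "")
    environ["CUDA_VISIBLE_DEVICES"] = gpu
    return gpu


if __name__ == "__main__":
    selected = apply_gpu_shorthand()
    print("CUDA_VISIBLE_DEVICES =", repr(selected))
```

Under that reading, launching training as something like `GPU=0 python train.py <experiment>` would make only the first GPU visible. The exact command is an assumption based on the README text, so check it against your copy.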