sheffieldnlp/naacl2018-fever

problems with model 2 from readme

Closed this issue · 2 comments

  • some errors in the tokenizer used with allennlp==2.1.0 - is this the right version in the requirements?
  • misleading text in the README: if I understand correctly, model 2 only runs on the GPU because of allennlp, so this specification makes no sense:

# if using a CPU, set
export CUDA_DEVICE=-1

# if using a GPU, set
export CUDA_DEVICE=0  # or another CUDA device id
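The convention above matches PyTorch's: a device id of -1 means CPU, while 0 or higher selects a GPU. A minimal sketch of how a script might consume the variable (the helper name `resolve_cuda_device` is mine, not part of the repo):

```python
import os


def resolve_cuda_device(env=os.environ):
    """Read CUDA_DEVICE from the environment; -1 (the default) means CPU."""
    raw = env.get("CUDA_DEVICE", "-1")
    try:
        device_id = int(raw)
    except ValueError:
        raise ValueError(f"CUDA_DEVICE must be an integer, got {raw!r}")
    # Negative ids fall back to CPU; otherwise pick the numbered GPU.
    return "cpu" if device_id < 0 else f"cuda:{device_id}"


print(resolve_cuda_device({"CUDA_DEVICE": "-1"}))  # cpu
print(resolve_cuda_device({"CUDA_DEVICE": "0"}))   # cuda:0
```

The resulting string can be passed to `torch.device(...)`, so exporting the variable once covers both CPU and GPU runs.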

  • misleading model name? This is not an LSTM model, I think, but the decomposable attention model (Parikh et al., 2016) (same issue in the FEVER dataset paper)
j6mes commented

Hi @ottowg thanks for getting in contact and raising this. I'll address each issue individually:

  1. I'll look into this - I have experienced this bug with that version of allennlp. I have included a special tokenizer that works around it, but I'll have to double-check that the configuration files are correct and that it is actually the one being used.

  2. Both models can take advantage of the GPU for both training and prediction. The AllenNLP model supports the GPU off the shelf, and I added that environment variable to make the parameter easy to override. For the (Riedel et al., 2017) MLP model, I wrote the code so that the GPU can also be used. The speed-up at prediction time is negligible, and both models will predict fine with the pre-trained weights on a laptop, for instance. When training, however, the DA model takes hours instead of days, and the MLP takes a few minutes instead of an hour.

  3. This is an error - we originally had an LSTM model that we didn't publish. When we changed to the DA model, it looks like we didn't correct everything! Glad we spotted this in the draft before submitting our camera-ready version.

-J

j6mes commented

The issue with the model is that the wrong version of AllenNLP was used: we wrote this after 0.2.1 was released and before 0.2.3 was released. Upgrading to 0.2.3 broke the build, so I had to fix that, and 0.2.3 now appears to work. I'm just updating the documentation, but this should now be resolved.

J