larspars/word-rnn

Error loading GloVe vectors

pallavi0335 opened this issue · 1 comment

Hi,
Whenever I try to pass a GloVe file with 150-dimensional vectors, I get this error:

th train.lua
loading data files...

cutting off end of data so that the batches/sequences divide evenly
reshaping tensor...
data load done. Number of data batches in train: 1039, val: 184, test: 0

Vocab Size: 7027, Threshold: 10
creating an lstm with 2 layers

loading glove vectors

/opt/torch/install/bin/luajit: bad argument #2 to '?' (out of range at /opt/torch/pkg/torch/generic/Tensor.c:890)

stack traceback:
[C]: at 0x7f4ecca9b8f0
[C]: in function '__index'
./util/GloVeEmbedding.lua:65: in function 'parseEmbeddingFile'
./util/GloVeEmbedding.lua:44: in function '__init'
/opt/torch/install/share/lua/5.1/torch/init.lua:91: in function
[C]: in function 'GloVeEmbeddingFixed'
train.lua:154: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405e70
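The out-of-range index at GloVeEmbedding.lua:65 looks like what you would get if some line in the embedding file carries more components than opt.embedding_file_size allows for. Without digging into the parser itself, here is a minimal standalone check (a sketch, assuming the standard GloVe text format: one entry per line, the word first, then space-separated floats; the path mirrors opt.embedding_file below):

```lua
-- quick check: does the vector width in the file match opt.embedding_file_size?
local path = 'util/glove/vectors.txt'  -- same file as opt.embedding_file
local f = assert(io.open(path, 'r'))
local line = f:read('*l')              -- first entry in the file
f:close()

local tokens = 0
for _ in line:gmatch('%S+') do tokens = tokens + 1 end
print('components in first entry:', tokens - 1)  -- minus 1 for the word itself
```

If this prints anything other than 150, the file and opt.embedding_file_size disagree, and a line-by-line parser would index past the end of its per-word tensor.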

My config file looks like this:

-- model params
opt.rnn_size = 300 --size of LSTM internal state
opt.num_layers = 2 --number of layers in the LSTM
opt.model = 'lstm' --(lstm,gru or rnn)
opt.wordlevel = 1 --(1 for word level 0 for char)

-- optimization
opt.learning_rate = 3e-3 --learning rate
opt.learning_rate_decay = 0.97 --learning rate decay
opt.learning_rate_decay_after = 5 --in number of epochs, when to start decaying the learning rate
opt.decay_rate = 0.95 --decay rate for rmsprop
opt.dropout = 0.35 --dropout for regularization, used after each RNN hidden layer. (0 = no dropout)
opt.seq_length = 80 --number of timesteps to unroll for
opt.batch_size = 10 --number of sequences to train on in parallel
opt.max_epochs = 20 --number of full passes through the training data
opt.grad_clip = 3 --clip gradients at this value
opt.train_frac = 0.85 --fraction of data that goes into train set
opt.val_frac = 0.15 --fraction of data that goes into validation set
--test_frac will be computed as (1 - train_frac - val_frac)
opt.init_from = '' --initialize network parameters from checkpoint at this path
opt.optim = 'rmsprop' --which optimizer to use: (rmsprop|sgd|adagrad|asgd|adam)
opt.optim_alpha = 0.8 --alpha for adagrad/rmsprop/momentum/adam
opt.optim_beta = 0.999 --beta used for adam
opt.optim_epsilon = 1e-8 --epsilon that goes into denominator for smoothing

-- bookkeeping
opt.seed = 123 --torch manual random number generator seed
opt.print_every = 1 --how many steps/minibatches between printing out the loss
opt.eval_val_every = 200 --every how many iterations should we evaluate on validation data?
opt.checkpoint_dir = 'cv' --output directory where checkpoints get written
opt.savefile = 'checkpoint' --filename to autosave the checkpoint to. Will be inside cv/
opt.threshold = 10 --minimum number of occurrences a token must have to be included
--(ignored if -wordlevel is 0)

-- GPU/CPU
opt.backend = 'cpu' --(cpu|cuda|cl)
opt.gpuid = 0 --which gpu to use (ignored if backend is cpu)

-- Glove
opt.glove = 1 --whether or not to use GloVe embeddings
opt.embedding_file = 'util/glove/vectors.txt' --filename of the glove (or other) embedding file
opt.embedding_file_size = 150 --feature vector size of embedding file

-- Sampling Configuration

-- checkpoint
opt.checkpoint = 'lm_lstm_epoch7.56_1.2685.t7' --model checkpoint to use for sampling. If empty, pulls the last checkpoint
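One more thing worth ruling out: some embedding files ship with a word2vec-style header line ("vocab_size dim"), and a few dumps contain tokens with embedded whitespace, so individual lines can end up wider or narrower than the rest. A full-file scan along the same lines as the check above (same format assumption; `expected` is hand-set to opt.embedding_file_size) would flag any such line:

```lua
-- scan the whole embedding file for lines whose width disagrees with
-- opt.embedding_file_size; catches word2vec-style header lines and
-- tokens that contain whitespace
local path = 'util/glove/vectors.txt'
local expected = 150  -- opt.embedding_file_size
local lineno = 0
for line in io.lines(path) do
  lineno = lineno + 1
  local tokens = 0
  for _ in line:gmatch('%S+') do tokens = tokens + 1 end
  if tokens - 1 ~= expected then
    print(string.format('line %d has %d components (expected %d)',
                        lineno, tokens - 1, expected))
  end
end
```

If this prints nothing, the file is uniform and the mismatch is elsewhere (e.g. in how the dimension is passed to the model).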

Having the same issue.