Add function to read pre-trained word vectors
Closed this issue · 1 comments
gau820827 commented
Write a function that loads the dataset and converts each sentence to pre-trained word embeddings.
For example, build an iterator over the train, eval, and test sets:
train_iter = get_batch(train_set)
a, b, org_a, org_b = next(train_iter)
This should return a batch of sentence pairs a and b, each with dimensions (batch_size, sentence_length, embedding_size). Pad the sentences so that the sentence length is fixed within one batch, if that's convenient.
Note that org_a and org_b hold the original tokens as lists, with dimensions (batch_size, sentence_length).
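A rough sketch of what this could look like, to make the expected shapes concrete. Everything beyond the name get_batch and the four returned values is an assumption: the dataset is taken to be a list of (tokens_a, tokens_b) pairs, the vectors are assumed to come from a GloVe-style text format (one word followed by its floats per line), unknown and padding tokens get a zero vector, and the extra parameters (embeddings, dim, batch_size), the helpers load_vectors and embed_and_pad, and the "<pad>" token are hypothetical. Here a and b are padded separately, so their sentence lengths may differ.

```python
import numpy as np

PAD = "<pad>"  # hypothetical padding token for the original-token lists


def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines (assumed GloVe-style text) into a dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.array([float(x) for x in parts[1:]], dtype=np.float32)
    return vectors


def embed_and_pad(sentences, embeddings, dim):
    """Pad token lists to the longest sentence in the batch and look up vectors."""
    max_len = max(len(s) for s in sentences)
    padded = [s + [PAD] * (max_len - len(s)) for s in sentences]
    batch = np.zeros((len(sentences), max_len, dim), dtype=np.float32)
    for i, sent in enumerate(sentences):
        for j, tok in enumerate(sent):
            # Assumption: unknown tokens keep the zero vector, like padding.
            if tok in embeddings:
                batch[i, j] = embeddings[tok]
    return batch, padded


def get_batch(dataset, embeddings, dim, batch_size=32):
    """Yield (a, b, org_a, org_b) for consecutive slices of the dataset."""
    for start in range(0, len(dataset), batch_size):
        chunk = dataset[start:start + batch_size]
        a, org_a = embed_and_pad([list(x) for x, _ in chunk], embeddings, dim)
        b, org_b = embed_and_pad([list(y) for _, y in chunk], embeddings, dim)
        yield a, b, org_a, org_b
```

With this sketch, train_iter = get_batch(train_set, embeddings, dim) followed by next(train_iter) gives a and b as arrays of shape (batch_size, sentence_length, embedding_size), and org_a and org_b as padded token lists of shape (batch_size, sentence_length).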
o9812 commented
Done