rguthrie3/BiLSTM-CRF

pytorch impl and forward optimization

joelkuiper opened this issue · 1 comment

Hey,

Just saw that you ported this as a tutorial for PyTorch (http://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html#implementation-notes), and it looks great!

Now I'm curious about the snippet

"The implementation is not optimized. If you understand what is going on, you’ll probably quickly see that iterating over the next tag in the forward algorithm could probably be done in one big operation. I wanted the code to be more readable. If you want to make the relevant change, you could probably use this tagger for real tasks."

It seems like this implementation is the same. Did you ever find a way to optimize it? It would save me the trouble, since I'm still very new to PyTorch (unfortunately, static models don't cut it for my use case at the moment …)

Hi, this repo in DyNet was code I was using for research (I wasn't just implementing the model for fun, which is why there is some dead code and other things I haven't bothered to clean up), and the way I have it implemented here was fast enough for me. If it is fast enough for you too, then there is no need to optimize it. I was mainly referring to people who want to use it in more time-sensitive applications.

Basically, there are points in the code where I iterate over the set of next tags. This is essentially equivalent to iterating over the rows of a certain matrix. Rather than iterating over the rows of that matrix, you could do it as a single matrix operation, which in DyNet and PyTorch would be a C / C++ function call with an optimized implementation.
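A minimal sketch of that vectorization in PyTorch, following the tutorial's conventions (`transitions[i, j]` is the score of transitioning *from* tag `j` *to* tag `i`, and the START tag gets all the initial mass). The function name `forward_partition` is made up for illustration; the point is that the per-tag loop inside each time step becomes one broadcasted addition plus a `logsumexp` over the previous-tag dimension:

```python
import torch

def forward_partition(feats, transitions, start_tag, stop_tag):
    """Log partition function of a linear-chain CRF, with the inner
    loop over next tags replaced by one broadcasted matrix operation.

    feats:       (seq_len, num_tags) emission scores for one sentence
    transitions: (num_tags, num_tags); transitions[i, j] scores j -> i
    """
    num_tags = transitions.size(0)
    # forward variable: only the START tag has non-negligible mass
    alpha = torch.full((num_tags,), -10000.0)
    alpha[start_tag] = 0.0
    for feat in feats:
        # alpha broadcasts across rows (next tags); emissions
        # broadcast across columns (previous tags). scores[i, j] is
        # the score of being in tag j and moving to tag i.
        scores = alpha.unsqueeze(0) + transitions + feat.unsqueeze(1)
        alpha = torch.logsumexp(scores, dim=1)  # marginalize prev tag
    # transition into STOP, then marginalize the final tag
    return torch.logsumexp(alpha + transitions[stop_tag], dim=0)
```

The outer loop over time steps has to stay (each `alpha` depends on the previous one), but the work per step is now a single `(num_tags, num_tags)` operation instead of a Python loop.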

Like I said, unless the current implementation is too slow for you (it wasn't for me), there isn't much to worry about.