crf score

Question

JingliSHI0206 opened this issue 4 years ago · 5 comments

tagTransScoresMiddle = torch.gather(currentTagScores[:, 1:, :], 2, tags[:, : sentLength - 1].view(batchSize, sentLength - 1, 1)).view(batchSize, -1) (linear_rf_inferencer.py)

Sorry to bother you again.
Why does the second index of currentTagScores start at 1? Is it because the first token (BERT embedding) is '[CLS]'?

THANK YOU.

Answer 1 · 2020-09-24T15:34:05.000Z

Sorry for the late reply.
Conceptually, we are working on a word-level sequence in the CRF stage.

Because:

we already extract the "word-level representation" from BERT/RoBERTa.
a. This part of the code records the first wordpiece of every word:
https://github.com/allanj/pytorch_lstmcrf/blob/master/config/transformers_util.py#L67-L76
b. This part of the code uses the information from (a) to extract the word-level representation:
https://github.com/allanj/pytorch_lstmcrf/blob/master/model/embedder/transformers_embedder.py#L51
So no CLS/SEP, wordpiece, or BPE tokens are involved in the CRF layer. The CRF is the same as in a traditional LSTM-CRF, where each position is the current word.
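
Steps (a) and (b) can be illustrated with a small, self-contained sketch. This is not the repository's code: the tensor values, the example wordpieces, and the names `hidden`, `first_piece_idx`, and `word_repr` are hypothetical, and it assumes the first-wordpiece positions were already recorded during preprocessing.

```python
import torch

# Hypothetical encoder output for one sentence: 7 wordpieces, hidden size 4.
# Wordpieces (illustrative): [CLS] play ##ing foot ##ball now [SEP]
hidden = torch.arange(7 * 4, dtype=torch.float32).view(1, 7, 4)

# Position of the first wordpiece of each word (assumed to come from step a).
first_piece_idx = torch.tensor([[1, 3, 5]])

# Step b: pick one vector per word by gathering along the wordpiece dimension.
index = first_piece_idx.unsqueeze(-1).expand(-1, -1, hidden.size(-1))
word_repr = torch.gather(hidden, 1, index)

# word_repr has one row per word; [CLS], [SEP], and continuation pieces are gone.
print(word_repr.shape)
```

After this step the sequence has one position per word, which is why position 0 in the CRF is the first word rather than '[CLS]'.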

Back to this line of code:
It is quite tricky to explain, so I would also suggest stepping through it in a debugger.
Conceptually:
currentTagScores has size (batch, seq_len, label_size), which means that currentTagScores[b, i, :] represents the transition scores to the i-th position.

currentTagScores[b, i, j] represents the transition score to the i-th position with the j-th label.

Thus, tagTransScoresMiddle calculates the middle scores, for example the blue edges above,
because we need currentTagScores[b, 1, B] + currentTagScores[b, 2, C] + currentTagScores[b, 3, B].

The step 0->1 is the start score.
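
A minimal sketch of the indexing in the line under discussion, using small random tensors. It assumes, based on how the gather is written, that the last dimension of currentTagScores is looked up with the gold label of the previous position. The values and the snake_case names are illustrative stand-ins, not the repository's code.

```python
import torch

torch.manual_seed(0)
batch_size, sent_length, num_labels = 2, 4, 3

# Hypothetical stand-in for currentTagScores, shape (batch, seq_len, label_size).
current_tag_scores = torch.randn(batch_size, sent_length, num_labels)
# Hypothetical gold label ids, shape (batch, seq_len).
tags = torch.tensor([[0, 2, 1, 1],
                     [1, 0, 0, 2]])

# Positions 1..n-1 are each looked up with the gold label of the position before,
# so the scores are sliced from index 1 and the tags stop one position early.
prev_tags = tags[:, : sent_length - 1].view(batch_size, sent_length - 1, 1)
middle = torch.gather(current_tag_scores[:, 1:, :], 2, prev_tags).view(batch_size, -1)

# The same lookup written with explicit loops, for comparison.
expected = torch.zeros(batch_size, sent_length - 1)
for b in range(batch_size):
    for i in range(1, sent_length):
        expected[b, i - 1] = current_tag_scores[b, i, tags[b, i - 1]]

assert torch.equal(middle, expected)
```

Position 0 has no previous word to pair with, so under this reading it is not part of the middle scores, which is consistent with the slice starting at 1.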

Answer 2 · 2020-09-24T20:46:27.000Z

Thank you so much for the detailed explanation. I really appreciate it.

Answer 3 · 2020-11-22T16:35:11.000Z

Hi Allen,
Thank you so much for all your replies to my questions about the lstmcrf project.

I started my PhD in New Zealand around April 2020, and my research area is NLP. I came back to Singapore last month. I hope to have a chance to meet you and hear any advice you have about PhD study in NLP.
Please just ignore this request if such a meetup is not convenient.

I hope to hear from you shortly. Stay well and stay safe.
Thank you,
Jingli

Answer 4 · 2020-11-23T12:15:10.000Z

Sure. We can grab a coffee sometime.


Answer 5 · 2020-11-24T16:47:11.000Z


That's great. Maybe we can have coffee next week.
I am continuing my research work from home for now, so just let me know a time that suits you.
My HP is 9865-8766, and we can decide the time on WhatsApp.