BrikerMan/Kashgari

Transformer Embedding

ahmad-alismail opened this issue · 1 comment

Hello,
Thank you for this amazing job!

In the transformer embedding documentation here, you mention that when using a pre-trained embedding, we have to use the same tokenization tool as the embedding model to access the full power of the embedding.

If we do that, however, extra tokens are added to the input sequences (special tokens, padding, and subword pieces from splitting "unknown" words), and the original token-level labels are no longer aligned with the tokens!
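To illustrate the misalignment, here is a minimal sketch. It uses the Hugging Face `BertTokenizerFast` purely as an example (not Kashgari's own tokenizer), and the exact subword splits shown in the comments depend on the vocabulary:

```python
from transformers import BertTokenizerFast  # illustration only, not Kashgari's tokenizer

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# One label per word in the original, word-level annotation.
words = ["John", "lives", "in", "Kuala", "Lumpur"]
labels = ["B-PER", "O", "O", "B-LOC", "I-LOC"]

encoding = tokenizer(words, is_split_into_words=True)

# The tokenizer adds [CLS]/[SEP] and may split rare words into subwords,
# so the token sequence ends up longer than the label sequence.
print(encoding.tokens())    # e.g. ['[CLS]', 'john', 'lives', 'in', 'ku', '##ala', 'lump', '##ur', '[SEP]']
print(encoding.word_ids())  # e.g. [None, 0, 1, 2, 3, 3, 4, 4, None]
print(len(labels))          # 5 labels vs. 9 tokens -> misaligned
```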

My question is: how can I solve this problem with the Kashgari framework for a sequence labeling task?
Could you please provide another example of this in the documentation?
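For what it's worth, the workaround I understand to be common with fast tokenizers (not something specific to Kashgari, just a sketch of the general approach) is to expand the word-level labels to the subword tokens, keeping the real label only on the first subword of each word and masking everything else so the loss and metrics skip it:

```python
def align_labels(word_labels, word_ids, ignore_label="X"):
    """Expand word-level labels to subword-level labels.

    `word_ids` maps each subword token back to the index of the word it
    came from (None for special tokens / padding). Only the first subword
    of each word keeps the real label; everything else gets `ignore_label`,
    which the loss / metrics should be configured to skip.
    """
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:                 # [CLS], [SEP], padding
            aligned.append(ignore_label)
        elif word_id != previous_word:      # first subword of a new word
            aligned.append(word_labels[word_id])
        else:                               # continuation subword (##...)
            aligned.append(ignore_label)
        previous_word = word_id
    return aligned


# Continuing the example above:
word_ids = [None, 0, 1, 2, 3, 3, 4, 4, None]
labels = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
print(align_labels(labels, word_ids))
# ['X', 'B-PER', 'O', 'O', 'B-LOC', 'X', 'I-LOC', 'X', 'X']
```

It would be great if the documentation showed how (or whether) Kashgari handles this alignment internally for sequence labeling.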

Thanks in advance!

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.