Documentation for the tagset used by the POS tagger of Sudachi
BLKSerene opened this issue · 5 comments
BLKSerene commented
Hi, I'm wondering whether there is any documentation for the tagset used by Sudachi's POS tagger.
kazuma-t commented
See the following thread on Slack.
https://sudachi-dev.slack.com/archives/CBCF278AC/p1617584539086100
BLKSerene commented
Thanks, but neither the PDF link nor the Python code snippet works now.
eiennohito commented
If you are using 0.6.2, the following snippet should work:
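import sudachipy

# Load the default installed dictionary.
sudachi_dic = sudachipy.Dictionary()
# The empty-tuple pattern should match every POS tag in the dictionary.
matcher = sudachi_dic.pos_matcher([()])
# Print one tag per line: index, then the comma-joined POS fields.
for pos_id, pos in enumerate(matcher):
    print(pos_id, ",".join(pos), sep="\t")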
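If you want to reuse the listing rather than just print it, here is a rough sketch of the same idea split into two steps. The names `format_tagset` and `load_sudachi_tags` are made up for illustration and are not part of SudachiPy. It assumes SudachiPy 0.6.2 or later with a dictionary package such as `sudachidict_core` installed, and that the IDs are simply the positions in iteration order, as in the snippet above.

```python
from typing import Iterable, List, Sequence


def format_tagset(tags: Iterable[Sequence[str]]) -> List[str]:
    """Return one 'id<TAB>comma-joined POS' line per tag, numbered in iteration order."""
    return [f"{pos_id}\t{','.join(pos)}" for pos_id, pos in enumerate(tags)]


def load_sudachi_tags() -> list:
    """Collect every POS tuple from the installed Sudachi dictionary (hypothetical helper)."""
    import sudachipy  # imported lazily so format_tagset works without SudachiPy

    dictionary = sudachipy.Dictionary()
    # Assumption: an empty-tuple pattern matches every POS tag.
    return list(dictionary.pos_matcher([()]))


if __name__ == "__main__":
    try:
        tags = load_sudachi_tags()
    except Exception as exc:  # SudachiPy or its dictionary may not be installed
        print(f"could not load the Sudachi dictionary: {exc}")
    else:
        print("\n".join(format_tagset(tags)))
```

Keeping the formatting separate from the dictionary lookup means the output can be written to a file or compared between dictionary versions without touching the SudachiPy call.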
kazuma-t commented
It seems that the BCCWJ manual has been moved here.
https://ccd.ninjal.ac.jp/bccwj/doc/manual/BCCWJ_Manual_05.pdf
BLKSerene commented
Thanks!