ChineseHeadFinder: dictionary key 'INTJ' repeated with different values
tanloong opened this issue · 3 comments
tanloong commented
AngledLuffa commented
Clearly a bug, as it is clobbering the old entry, which was
nonTerminalInfo.put("INTJ", new String[][]{{right, "INTJ", "IJ", "SP"}});
The new entry makes it left-headed (except for punctuation). Do you have any insight into which is better?
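For anyone following along, here is a minimal sketch of why the second `put` silently wins. It only relies on standard `java.util.Map` behavior. The class name, the `LEFT`/`RIGHT` constants, and the contents of the second entry are placeholders assumed for illustration; the real newer rule is not quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class DuplicateKeyDemo {
  // Assumed stand-ins for the direction values used in the head rules.
  static final String LEFT = "left";
  static final String RIGHT = "right";

  /** Builds a tiny rule table that registers "INTJ" twice, as in the report. */
  static Map<String, String[][]> buildRules() {
    Map<String, String[][]> nonTerminalInfo = new HashMap<>();

    // The old entry quoted above.
    nonTerminalInfo.put("INTJ", new String[][]{{RIGHT, "INTJ", "IJ", "SP"}});

    // Hypothetical placeholder for the newer entry (its real contents are not shown here).
    String[][] previous = nonTerminalInfo.put("INTJ", new String[][]{{LEFT}});

    // Map.put returns the value it replaced, so a non-null result means a rule was clobbered.
    if (previous != null) {
      System.out.println("Overwrote INTJ rule with direction: " + previous[0][0]);
    }
    return nonTerminalInfo;
  }

  public static void main(String[] args) {
    Map<String, String[][]> rules = buildRules();
    System.out.println("Surviving direction: " + rules.get("INTJ")[0][0]);
  }
}
```

Checking the return value of `put` (or using `putIfAbsent`) is one cheap way to catch this kind of repeated key while the table is being built.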
In CTB 5.1, all INTJ nodes are for single words, such as
(INTJ (IJ 唉呀))
except for this, which would appear to be a mistake based on the bracketing of the punctuation:
(IP
  (INTJ (PU 「) (IJ 嘿咻))
  (PU !)
  (PU 」)
  ...
I don't have CTB 9 lying around, but I will ask the people in charge of such things to put it on our cluster.
tanloong commented
Thanks for the quick response!
I must admit that I have no prior knowledge of CTB (and I don't have CTB 9 either), so I am unable to determine which value is better 😔.
AngledLuffa commented
Ok, I can see why the new rule is left-headed. There are a bunch of INTJ like this:
嗯 嗯 嗯
also one of
我的天 哪
and a whole lot where the entire sentence is a single word plus a punctuation mark, and it is all included in the INTJ.
Guess the right answer is to just get rid of the old rule. Thanks for pointing this out!
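To make the difference concrete, here is a rough sketch of the two head choices on child-label sequences. It is not the CoreNLP head finder API: the class and method names are hypothetical, the right-headed search is simplified to "rightmost child with any listed label" rather than a priority-ordered search, and the `IJ IJ IJ` tagging of the repeated interjection is an assumption.

```java
import java.util.Arrays;
import java.util.List;

public class InterjectionHeadDemo {
  /** Right-headed (simplified): rightmost child whose label is in the allowed list. */
  static int rightHead(List<String> childLabels, List<String> allowed) {
    for (int i = childLabels.size() - 1; i >= 0; i--) {
      if (allowed.contains(childLabels.get(i))) {
        return i;
      }
    }
    return childLabels.size() - 1; // fall back to the rightmost child
  }

  /** Left-headed, skipping punctuation: leftmost child that is not PU. */
  static int leftHeadSkippingPunct(List<String> childLabels) {
    for (int i = 0; i < childLabels.size(); i++) {
      if (!"PU".equals(childLabels.get(i))) {
        return i;
      }
    }
    return 0; // fall back to the leftmost child if everything is punctuation
  }

  public static void main(String[] args) {
    List<String> oldCategories = Arrays.asList("INTJ", "IJ", "SP");
    List<String> repeated = Arrays.asList("IJ", "IJ", "IJ"); // assumed tags for a repeated interjection
    List<String> quoted = Arrays.asList("PU", "IJ");         // the CTB 5.1 case: (INTJ (PU 「) (IJ 嘿咻))

    System.out.println("repeated, right-headed: child " + rightHead(repeated, oldCategories));
    System.out.println("repeated, left-headed:  child " + leftHeadSkippingPunct(repeated));
    System.out.println("quoted, right-headed:   child " + rightHead(quoted, oldCategories));
    System.out.println("quoted, left-headed:    child " + leftHeadSkippingPunct(quoted));
  }
}
```

Under these assumptions the two rules agree on the single-word cases and only diverge on the multi-word INTJ nodes, which is where the left-headed rule matters.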