Word segmentation does not work at all. What am I doing wrong?

Question

Word segmentation does not work at all. What am I doing wrong?

Opened this issue 5 years ago · 2 comments

root@wcjs-test:/usr/local/scws/bin# cat a.txt
奔驰 12.0 2.2 n
蓝天 11.2 2.2 n
每日一问 30.1 5.0 nz

root@wcjs-test:/usr/local/scws/bin# ./scws-gen-dict -c utf8 -i a.txt
Output file exists: Success
root@wcjs-test:/usr/local/scws/bin# ./scws -i '奔驰在每日一问里面好像有点厉害了' -c utf8 -d dict.xdb -A -U
奔驰/n 在/un 每日一问/n 里/un 面/un 好/un 像/un 有/un 点/un 厉/un 害/un 了/un
+--[scws(scws-cli/1.2.3)]----------+
| TextLen: 48 |
| Prepare: 0.0002 (sec) |
| Segment: 0.0003 (sec) |
+--------------------------------+

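<?php
// Open a SCWS handle, then set the charset and the custom dictionary.
$sh = scws_open();
scws_set_charset($sh, 'utf8');
scws_set_dict($sh, '/usr/local/scws/bin/dict.xdb');
//scws_set_rule($sh, '/path/to/rules.ini');
$text = "奔驰在每日一问里面好像有点厉害了";
scws_send_text($sh, $text);
// scws_get_result() returns one batch of words per call and false once
// the text is exhausted, so call it in a loop to collect every word.
while ($words = scws_get_result($sh)) {
    print_r($words);
}
scws_close($sh);
?>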

Answer 1 · 2023-06-07T16:04:03.000Z

$ cat dict_aa.txt
奔驰 12.0 2.2 n
蓝天 11.2 2.2 n
每日一问 30.1 5.0 nz

$ scws-gen-dict -i dict_aa.txt -o dict_aa.xdb
Reading the input file: dict_aa.txt ...OK, total nodes=10
Optimizing... OK
Dump the tree data to: dict_aa.xdb ... OK, all been done!

$ scws -c utf8 -d dict_jieba1.xdb:dict_aa.xdb -N -i "奔驰在每日一问里面好像有点厉害了"
奔驰在每日一问里面好像有点厉害了

I compiled dict_jieba1.xdb from the dictionaries at https://github.com/fxsjy/jieba/tree/master/extra_dict (dict.txt.big and dict.txt.small merged).
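
The answer does not say how the two jieba files were merged or converted, so the following is only a rough sketch of one way it might be done. It assumes the jieba files use "word freq pos" lines and that the target is the four-column text layout shown in dict_aa.txt. The function name merge_jieba_dicts, the reuse of the jieba frequency as TF, and the constant IDF value are all assumptions made for illustration, not something taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of SCWS or jieba): merge jieba-style
# dictionaries ("word freq pos" per line) into the four-column layout
# used by dict_aa.txt ("word TF IDF attr").
# Assumptions: the jieba frequency is reused unchanged as TF, and IDF is
# a placeholder constant passed as the first argument.
merge_jieba_dicts() {
    idf="$1"
    shift
    awk -v idf="$idf" '
        # Skip lines that lack any of the three expected fields.
        NF >= 3 {
            # When a word occurs more than once, keep the highest frequency.
            if (!($1 in tf) || ($2 + 0) > (tf[$1] + 0)) {
                tf[$1] = $2
                attr[$1] = $3
            }
        }
        END {
            for (w in tf) printf "%s\t%s\t%s\t%s\n", w, tf[w], idf, attr[w]
        }
    ' "$@" | LC_ALL=C sort   # byte-wise sort for a stable output order
}
```

A possible use would be `merge_jieba_dicts 2.2 dict.txt.big dict.txt.small > dict_jieba1.txt` (the .txt name is hypothetical), followed by `scws-gen-dict -i dict_jieba1.txt -o dict_jieba1.xdb` as in the session above. The TF and IDF scales of the two projects may differ, so the weights would likely need tuning.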

Answer 2 · 2023-06-08T08:03:29.000Z

Isn't that segmented already?