Chinese tokenization (coarse-grained) error: New in version 3.3.
wencan opened this issue · 1 comments
wencan commented
Describe the bug
Text:
New in version 3.3.
https://hanlp.hankcs.com/demos/tok.html?text=New+in+version+3.3.%0A%0A&coarse=true
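The trailing "." is a sentence-ending period, but coarse-grained tokenization keeps "3.3." together as a single token. Fine-grained tokenization does not have this problem.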
Code to reproduce the issue
None
Describe the current behavior
Expected behavior
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
- Python version: N/A
- HanLP version: latest online version
Other info / logs
- I've completed this form and searched the web for solutions.
hankcs commented
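This falls within the scope of English tokenization rather than being a bug in Chinese tokenization. I recommend using an English model or a custom dictionary.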
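For readers who cannot switch models, a post-processing step is one possible stopgap. The sketch below is only an illustration and is not part of HanLP: `split_trailing_period` is a hypothetical helper, and the `coarse` list is an assumed rendering of the coarse output described in the report (only the "3.3." token is confirmed by the issue). It splits a final period off the last token when that token looks like a version number or decimal followed by ".".

```python
import re

# Hypothetical helper, not a HanLP API: matches a number such as "3", "3.3"
# or "1.2.3" that is immediately followed by one extra trailing period.
_NUMBER_THEN_PERIOD = re.compile(r"^(\d+(?:\.\d+)*)\.$")


def split_trailing_period(tokens):
    """Return a new token list with the sentence-final '.' split off
    the last token when that token is a number followed by a period."""
    if not tokens:
        return []
    *head, last = tokens
    match = _NUMBER_THEN_PERIOD.match(last)
    if match:
        # Keep the numeric part and emit the period as its own token.
        return head + [match.group(1), "."]
    return list(tokens)


# Assumed coarse-grained output for "New in version 3.3."
coarse = ["New", "in", "version", "3.3."]
print(split_trailing_period(coarse))
# ['New', 'in', 'version', '3.3', '.']
```

Only the last token is inspected, so numbers in the middle of a sentence are left untouched. A real abbreviation or list marker such as "1." at the end of a line would also be split, so this heuristic should be treated as an assumption to validate against your own data.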