Add new phrase cut into multiple phrases
Similar to #98.
Adding a new phrase splits it into more than one phrase, and some of the resulting entries contain Bopomofo.
For example, when I add this:
Phrase | 歐你媽個頭 |
Bopomofo | ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ |
it is split into multiple phrases, as in the right part of the figure below.
In addition, some of the new phrases consist of Bopomofo symbols.
Is this the correct behavior, or did something go wrong?
@qas612820704, use chewing-editor -d to dump and analyze the log. Always attach the messages as text.
Debug: Add "歐你媽個頭" ( "ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ" ) ((null) :0)
Debug: [chewingio.c:1996 chewing_userphrase_add] API call: ((null) :0)
Warning: chewing_userphrase_add() returns 0 ((null) :0)
Debug: [chewingio.c:1859 chewing_userphrase_enumerate] API call: ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: 歐 ㄡ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: 你媽 ㄋㄧˇ ㄇㄚ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: 個頭 ㄍㄜ˙ ㄊㄡˊ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄡ ㄡ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄋ一ˇ ㄋ ㄧ ˇ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄚ ㄚ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄇㄚ ㄇ ㄚ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄜ ㄜ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄍㄜ˙ ㄍ ㄜ ˙ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: [chewingio.c:1936 chewing_userphrase_get] API call: ((null) :0)
Debug: Get userphrase: ㄊㄡˊ ㄊ ㄡ ˊ ((null) :0)
Debug: [chewingio.c:1887 chewing_userphrase_has_next] API call: ((null) :0)
Debug: Total userphrase 10 ((null) :0)
Debug: 10 ((null) :0)
This looks like the same issue as #206.
I will trace the source code.
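This error most likely occurs because the Bopomofo symbol 「ㄧ」 (U+3127) was entered as the Chinese character 「一」 (U+4E00).
You can confirm that the 「ㄋ一ˇ」 above contains the Chinese character 「一」.
Bopomofo: ㄧ
U+3127
http://www.fileformat.info/info/unicode/char/3127/index.htm
Chinese character: 一
U+4e00
http://www.fileformat.info/info/unicode/char/4e00/index.htm
The above is provided for reference.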
:-)
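A quick way to spot this kind of mix-up is to scan the reading by code point. Below is a minimal sketch, assuming the reading has already been decoded to UTF-32; the function names and the accepted character set (Bopomofo letters U+3105..U+3129, four tone marks, and the space) are assumptions for illustration, not part of libchewing or chewing-editor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when `c` is acceptable inside a space-separated Bopomofo
// reading. The accepted set is an assumption made for this sketch.
bool isBopomofoReadingChar(char32_t c)
{
    if (c >= U'\u3105' && c <= U'\u3129') {
        return true; // Bopomofo letters, from U+3105 to U+3129
    }
    switch (c) {
    case U' ':
    case U'\u02CA': // second tone mark
    case U'\u02C7': // third tone mark
    case U'\u02CB': // fourth tone mark
    case U'\u02D9': // neutral tone mark
        return true;
    default:
        return false;
    }
}

// Collects the index of every character that does not belong in a reading,
// for example U+4E00 typed in place of U+3127.
std::vector<std::size_t> findStrayChars(const std::u32string &reading)
{
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < reading.size(); ++i) {
        if (!isBopomofoReadingChar(reading[i])) {
            positions.push_back(i);
        }
    }
    return positions;
}
```

Calling findStrayChars on the first two syllables of the reported reading (U+3121, space, U+310B, U+4E00, U+02C7) should flag only the index of U+4E00, while the same string with U+3127 should come back clean.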
Some additional information about my test environment:
- Xubuntu 16.04 amd64
Running
$ dpkg -l '*chewing*'
shows
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================-====================-====================-====================================================================
ii chewing-editor 0.0.1-3 amd64 user dictionary editor for the chewing input method
ii fcitx-chewing 0.2.2-1 amd64 Fcitx wrapper for Chewing library
ii hime-chewing:amd64 0.9.10+git20150916+d amd64 support library to use Chewing in HIME
un libchewing <none> <none> (no description available)
un libchewing-data <none> <none> (no description available)
un libchewing-dev <none> <none> (no description available)
un libchewing1-dev <none> <none> (no description available)
un libchewing2-dev <none> <none> (no description available)
ii libchewing3:amd64 0.4.0-4 amd64 intelligent phonetic input method library
ii libchewing3-data 0.4.0-4 all intelligent phonetic input method library - data files
ii libchewing3-dev 0.4.0-4 amd64 intelligent phonetic input method library (developer version)
un scim-chewing <none> <none> (no description available)
I tested with chewing-editor -d.
Input:
phrase = "歐你媽個頭"
bopomofo = "ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ"
gives the following result:
Debug: Add "歐你媽個頭" ( "ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ" ) ((null) :0)
Debug: [chewingio.c:1998 chewing_userphrase_add] API call: ((null) :0)
Warning: chewing_userphrase_add() returns 0 ((null) :0)
Input:
phrase = "歐你媽個頭"
bopomofo = "ㄡ ㄋㄧˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ"
gives the following result:
Debug: Add "歐你媽個頭" ( "ㄡ ㄋㄧˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ" ) ((null) :0)
Debug: [chewingio.c:1998 chewing_userphrase_add] API call: ((null) :0)
Debug: [userphrase-sql.c:179 LogUserPhrase] userphrase 歐你媽個頭, phone = 0x0040 0x0e83 0x0608 0x1219 0x0c42 , orig_freq = 1, max_freq = 1, user_freq = 1, recent_time = 58958 ((null) :0)
Debug: "歐你媽個頭 (ㄡ ㄋㄧˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ)" ((null) :0)
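The phone = ... values in the LogUserPhrase line look like packed 16-bit syllable codes. The sketch below assumes a layout of 5 bits for the initial, 2 bits for the medial, 4 bits for the final, and 3 bits for the tone. That layout is an assumption inferred from the five values in this log, and PhoneParts and decodePhone are hypothetical names, not libchewing API.

```cpp
#include <cassert>
#include <cstdint>

// Components of one syllable; each field is an index, with 0 meaning "absent"
// (or first tone, for the tone field). Field widths are assumed, see above.
struct PhoneParts {
    unsigned initial; // consonant index
    unsigned medial;  // medial glide index
    unsigned final_;  // rime index
    unsigned tone;    // tone index
};

// Splits a packed 16-bit phone value into its assumed bit fields.
PhoneParts decodePhone(std::uint16_t phone)
{
    return PhoneParts{
        static_cast<unsigned>(phone >> 9) & 0x1Fu,
        static_cast<unsigned>(phone >> 7) & 0x3u,
        static_cast<unsigned>(phone >> 3) & 0xFu,
        static_cast<unsigned>(phone) & 0x7u
    };
}
```

Under this assumed layout, 0x0e83 (the second syllable in the log) decodes to initial 7, medial 1, no final, tone 3, which lines up with ㄋ being the seventh consonant, ㄧ the first medial, and ˇ the third tone.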
Then I downloaded the chewing-editor source package to take a look:
$ apt-get source chewing-editor
Running
$ grep 'checkBopomofo' chewing-editor-0.0.1/* -R
prints nothing.
Running
$ grep 'UserphraseModel::add' chewing-editor-0.0.1/* -R -A 18
shows
chewing-editor-0.0.1/src/model/UserphraseModel.cpp:void UserphraseModel::add(std::shared_ptr<QString> phrase, std::shared_ptr<QString> bopomofo)
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-{
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- add(*phrase.get(), *bopomofo.get());
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-}
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-void UserphraseModel::importUserphrase(std::shared_ptr<UserphraseImporter> importer)
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-{
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- size_t old_count = userphrase_.size();
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- if (!importer.get()->isSupportedFormat()) {
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- emit importCompleted(false, importer.get()->getPath(), 0, old_count);
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- return;
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- }
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- auto result = importer.get()->getUserphraseSet();
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- for (auto& i: result) {
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- add(i.phrase_, i.bopomofo_);
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- }
--
chewing-editor-0.0.1/src/model/UserphraseModel.cpp:void UserphraseModel::add(const QString &phrase, const QString &bopomofo)
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-{
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- auto ret = chewing_userphrase_add(
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- ctx_.get(),
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- phrase.toUtf8().constData(),
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- bopomofo.toUtf8().constData());
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- if (ret > 0) {
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- emit beginResetModel();
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- userphrase_.insert(Userphrase{
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- phrase,
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- bopomofo
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- });
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- emit endResetModel();
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- emit addNewPhraseCompleted(userphrase_[userphrase_.size()-1]);
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- } else {
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- qWarning() << "chewing_userphrase_add() returns" << ret;
chewing-editor-0.0.1/src/model/UserphraseModel.cpp- }
chewing-editor-0.0.1/src/model/UserphraseModel.cpp-}
It looks like the chewing-editor version I am using, 0.0.1-3, predates the fix.
I also tested libchewing3 directly, and the result is the same.
Input:
phrase = "歐你媽個頭"
bopomofo = "ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ"
Calling chewing_userphrase_add returns 0.
Input:
phrase = "歐你媽個頭"
bopomofo = "ㄡ ㄋㄧˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ"
Calling chewing_userphrase_add returns 1.
I also tested #206, and it appears to be the same situation.
With the Chinese character 一 (U+4E00):
phrase = "鞭數十"
bopomofo = "ㄅ一ㄢ ㄕㄨˋ ㄕˊ"
With the Bopomofo symbol ㄧ (U+3127):
phrase = "鞭數十"
bopomofo = "ㄅㄧㄢ ㄕㄨˋ ㄕˊ"
End of report.
:-)
After #210, this issue should be solved. @qas612820704, can you try again?
And thanks for the help, @samwhelp. The auto-conversion is published after 0.1.1.
BTW, we still need a good solution to #108.
Hi @david50407, @samwhelp is right.
I typed 一 (U+4E00) where ㄧ (U+3127) was intended.
Changing
phrase = 歐你媽個頭
bopomofo = ㄡ ㄋ一ˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ
into
phrase = 歐你媽個頭
bopomofo = ㄡ ㄋㄧˇ ㄇㄚ ㄍㄜ˙ ㄊㄡˊ
works fine. Thanks.
@david50407, that is right. About #169: both "ㄧ" arguments in the replace call are U+3127, at
// src/model/UserphraseModel.cpp:197
QString UserphraseModel::checkBopomofo(const QString &bopomofo) const
{
...
replaceBopomofo.replace(QString::fromUtf8("ㄧ"),QString::fromUtf8("ㄧ"));
...
}
This needs to change to
// src/model/UserphraseModel.cpp:197
QString UserphraseModel::checkBopomofo(const QString &bopomofo) const
{
...
replaceBopomofo.replace(QString::fromUtf8("一"),QString::fromUtf8("ㄧ"));
...
}
That is, change the first "ㄧ" (U+3127) into "一" (U+4E00).
Should I make a pull request to fix it?
@qas612820704, the idea behind your preliminary work is to implement fuzzy-match logic, which is worth sending as pull request(s). Can you improve it by accepting more characters such as ㄚ?
@qas612820704 @jserv, I already fixed that in #210 (merged yesterday), and I don't think the English character Y is that easily mistaken here.
一 and 丫 look the same as ㄧ and ㄚ in the IME input box under some fonts, so I think handling just these two cases is fine.
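For reference, the two-case conversion described here can be sketched without Qt as follows. This is only an illustration under stated assumptions: normalizeLookalikes is a hypothetical name, the input is assumed to be UTF-32, and the actual fix in #210 works on QString inside checkBopomofo rather than like this.

```cpp
#include <cassert>
#include <string>

// Replaces the two CJK ideographs that look like Bopomofo letters under some
// fonts with the Bopomofo letters they are usually meant to be.
std::u32string normalizeLookalikes(std::u32string reading)
{
    for (char32_t &c : reading) {
        if (c == U'\u4E00') {
            c = U'\u3127'; // CJK "one" becomes the Bopomofo medial I
        } else if (c == U'\u4E2B') {
            c = U'\u311A'; // CJK "fork" becomes the Bopomofo final A
        }
    }
    return reading;
}
```

Only these two code points are mapped, matching the view above that the Latin letter Y is unlikely to be confused here; every other character passes through unchanged.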
I defer to @david50407 on not taking the letter Y into consideration.