infinilabs/analysis-pinyin

Could first letters of English words also be extracted when Chinese and English are mixed?


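Currently, when the text mixes Chinese and English, first letters are extracted only for the Chinese characters; English words are kept in full.
For example: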

GET /tests/_analyze
{
  "text": "我是谁 where am i",
  "tokenizer": {
    "type": "pinyin",
    "limit_first_letter_length": 64,
    "keep_full_pinyin": false,
    "keep_first_letter": true,
    "keep_none_chinese": false,
    "keep_none_chinese_together": true,
    "keep_none_chinese_in_first_letter": true,
    "none_chinese_pinyin_tokenize": true,
    "lowercase": false,
    "keep_original": false
  }
}

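This returns the token: wsswhereami
Could the plugin support returning wsswai instead, i.e. also reduce each English word to its first letter?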
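
To make the requested behavior concrete, here is a minimal Python sketch of the expected output, not the plugin's implementation. The `PINYIN` table and the `first_letters` function are hypothetical names made up for illustration. The table covers only the three characters in the example, whereas the plugin uses a full pinyin dictionary.

```python
import re

# Hypothetical toy lookup covering only the characters in the example above;
# this is an assumption for illustration, not the plugin's dictionary.
PINYIN = {"我": "wo", "是": "shi", "谁": "shei"}


def first_letters(text: str) -> str:
    """Join the first pinyin letter of each Chinese character with the
    first letter of each English word, in order of appearance."""
    out = []
    # Each match is either one CJK character or one run of ASCII letters/digits.
    for match in re.finditer(r"[\u4e00-\u9fff]|[A-Za-z0-9]+", text):
        piece = match.group(0)
        if piece in PINYIN:
            # Chinese character: take the first letter of its pinyin.
            out.append(PINYIN[piece][0])
        elif piece.isascii():
            # English word: take only its first letter (the requested change).
            out.append(piece[0])
        # Chinese characters missing from the toy table are skipped.
    return "".join(out)


print(first_letters("我是谁 where am i"))  # prints: wsswai
```

Case is left untouched here, matching `"lowercase": false` in the request above.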