VinAIResearch/XPhoneBERT

punctuation missing

thanhlong1997 opened this issue · 1 comment

Hi, I am trying XPhoneBERT for a Vietnamese TTS system, and I find that punctuation characters are simply skipped when the input sequence is converted to a phoneme sequence with the Text2PhonemeSequence library.
For example:

import torch
from transformers import AutoModel, AutoTokenizer
from text2phonemesequence import Text2PhonemeSequence

# Load XPhoneBERT model and its tokenizer
xphonebert = AutoModel.from_pretrained("vinai/xphonebert-base")
tokenizer = AutoTokenizer.from_pretrained("vinai/xphonebert-base")

# Load Text2PhonemeSequence
text2phone_model = Text2PhonemeSequence(language="vie-n", is_cuda=False)

# Input sequence that is already word-segmented (and text-normalized if applicable)
sentence1 = "dù sao tiền cũng đã trả rồi, chờ xem phản ứng từ thị trường thế nào đã rồi nói tiếp."
sentence2 = "dù sao tiền cũng đã trả rồi. chờ xem phản ứng từ thị trường thế nào đã rồi nói tiếp."
input_phonemes1 = text2phone_model.infer_sentence(sentence1)
input_phonemes2 = text2phone_model.infer_sentence(sentence2)

The phoneme sequences of the two inputs are the same:

z u ˧˨ ▁ s a w ˧˧ ▁ t i ə n ˧˨ ▁ k u ŋ͡m ˧ˀ˥ ▁ d a ˧ˀ˥ ▁ c a ˧˩˨ ▁ z o j ˧˨ ▁ c ɤ ˧˨ ▁ s ɛ m ˧˧ ▁ f a n ˧˩˨ ▁ ɯ ŋ ˨˦ ▁ t ɯ ˧˨ ▁  i ˨ˀ˩ ʔc ɯ ə ŋ ˧˨ ▁  e ˨˦ ▁ n a w ˧˨ ▁ d a ˧ˀ˥ ▁ z o j ˧˨ ▁ n ɔ j ˨˦ ▁ t i ə p ˦˥

Without the punctuation, the model cannot learn where the breaks between parts of a sentence fall.
Please check it out!
Thank you!

As stated in the README file, you have to perform Vietnamese word segmentation on the input first.
The input should be: dù_sao tiền cũng đã trả rồi , chờ xem phản_ứng từ thị_trường thế_nào đã rồi nói tiếp .
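
The punctuation part of that preparation can be sketched in plain Python. This is only an illustration: `separate_punctuation` is a hypothetical helper, not part of XPhoneBERT or text2phonemesequence, and it assumes the punctuation set `. , ! ? ; :` is enough for the text at hand. It does not join multi-syllable words with underscores (for example `dù_sao`); that step still needs a real Vietnamese word segmenter.

```python
import re

# Hypothetical helper for illustration only; not part of XPhoneBERT
# or the text2phonemesequence package.
# Assumption: these six marks cover the punctuation in the input text.
_PUNCT = re.compile(r"([.,!?;:])")


def separate_punctuation(text: str) -> str:
    """Surround each punctuation mark with spaces so it becomes its own token."""
    spaced = _PUNCT.sub(r" \1 ", text)
    # Collapse the runs of whitespace created by the substitution.
    return " ".join(spaced.split())


raw = "dù sao tiền cũng đã trả rồi, chờ xem phản ứng từ thị trường thế nào đã rồi nói tiếp."
prepared = separate_punctuation(raw)
print(prepared)
```

The resulting string has the comma and the final period as standalone tokens, which matches the punctuation layout of the expected input above. Note that this naive pattern would also split decimal numbers such as `3.5`, so text normalization should happen before it is applied.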