kakaobrain/g2pm

Training Data Explanation

dpny518 opened this issue · 2 comments

If you open the .lb file, each line contains only one pinyin, while the corresponding line in the .sent file contains a whole string of characters. Shouldn't the .lb file also contain a string of pronunciations?

I read the paper, and it seems that each sentence contains a number of characters, one of which is surrounded by "_". That is the character whose pinyin is given on the matching line of the .lb file.
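To make that pairing concrete, here is a minimal Python sketch, not code from the g2pM repo. It assumes line i of the .sent file goes with line i of the .lb file, and that the target character is wrapped in the marker exactly as described above. The function names `locate_target` and `pair_examples` and the sample sentence are hypothetical, and the marker is a parameter because the actual files may use a different marker character than a plain "_".

```python
def locate_target(sentence, marker="_"):
    """Return (plain_sentence, index, target_char) for one .sent line.

    Assumes exactly one character is wrapped in `marker` on each side.
    """
    start = sentence.find(marker)
    end = sentence.find(marker, start + 1)
    if start == -1 or end == -1:
        raise ValueError("no marked character found in: " + sentence)
    target = sentence[start + 1:end]
    # Drop both markers so the index refers to the plain sentence.
    plain = sentence[:start] + target + sentence[end + 1:]
    return plain, start, target


def pair_examples(sent_lines, label_lines, marker="_"):
    """Pair each sentence with the single pinyin label on the same line."""
    if len(sent_lines) != len(label_lines):
        raise ValueError(".sent and .lb must have the same number of lines")
    examples = []
    for sentence, label in zip(sent_lines, label_lines):
        plain, index, target = locate_target(sentence.strip(), marker)
        examples.append(
            {"sentence": plain, "index": index, "char": target, "pinyin": label.strip()}
        )
    return examples


# Illustrative data only, not taken from the dataset.
sent_lines = ["今天银_行_关门了\n"]
label_lines = ["hang2\n"]
examples = pair_examples(sent_lines, label_lines)
print(examples[0])
```

So each training example is one sentence plus one label: the other characters in the sentence are only context for disambiguating the marked character, which would explain why the .lb line holds a single pinyin.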
