CUNY-CL/wikipron

what's the difference between 'broad' and 'narrow' tsv?

fake-warrior8 opened this issue · 3 comments

Sorry, I couldn't find a description of the broad and narrow TSV files. Could you tell me what the difference between them is?

It’s a standard term referring to how precise the transcription is: https://en.wikipedia.org/wiki/Phonetic_transcription#Narrow_versus_broad_transcription

Thank you for your reply! I have a follow-up question. I'm studying the multilingual G2P system from the SIGMORPHON 2020 shared task, which uses the scraped wikipron dataset here. I found that its multilingual dataset includes a broad French dataset and a narrow Armenian dataset. However, the Wikipedia link you provided says that 'A further disadvantage of narrow transcription is that it involves a larger number of symbols and diacritics that may be unfamiliar to non-specialists', which suggests that narrow and broad transcriptions may use different sets of phoneme symbols. So I wonder whether mixing broad and narrow transcriptions will confuse a multilingual G2P system. Or are there other disadvantages to mixing broad and narrow transcription datasets?
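To make the concern concrete, here is a rough sketch of how one might compare the symbol inventories of a broad and a narrow file. It assumes each TSV row is `word<TAB>pronunciation` with space-separated symbols in the second column; the rows below are made-up toy examples, not taken from the actual data, and `symbol_inventory` is a hypothetical helper, not part of wikipron.

```python
from collections import Counter


def symbol_inventory(tsv_lines):
    """Count the space-separated symbols in the pronunciation column.

    Assumes rows of the form "word<TAB>pronunciation" (an assumption
    about the file layout, not a documented guarantee).
    """
    counts = Counter()
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, pron = line.split("\t")[:2]
        counts.update(pron.split())
    return counts


# Hypothetical toy rows for illustration only.
broad_rows = ["cat\tk æ t", "pit\tp ɪ t"]
narrow_rows = ["cat\tkʰ æ t", "pit\tpʰ ɪ t̚"]

broad = symbol_inventory(broad_rows)
narrow = symbol_inventory(narrow_rows)

# Symbols that occur only in the narrow data, and symbols both share.
only_narrow = sorted(set(narrow) - set(broad))
shared = sorted(set(narrow) & set(broad))

print("only in narrow:", only_narrow)
print("shared:", shared)
```

If the "only in narrow" list is large for real files, the model's output vocabulary grows with symbols that appear in only some languages, which is the kind of mismatch I'm asking about.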