jdevera/pylabeador

Bug when dealing with 'c' + consonant

Closed this issue · 3 comments

Hi again, I spotted another bug in pylabeador. Below are some self-explanatory examples of how it handles 'c' + consonant, where the 'c' is wrongly left at the end of the preceding syllable ('c' + 'h' seems to work fine).

pylabeador.syllabify('ocre')
['oc', 're']
pylabeador.syllabify('chacra')
['chac', 'ra']
pylabeador.syllabify('Tecla')
['Tec', 'la']
pylabeador.syllabify('aclimatar')
['ac', 'li', 'ma', 'tar']

Thank you very much for your work.
M

This is a very interesting catch, and I am delighted with your reports. Thank you!

I went back to the paper and found this paragraph:

The onset of a Spanish syllable, if any, may be composed by a single consonant or by two consonants (Grammar 3). In this case, the first consonant must be an occlusive ‘p’, ‘b’, ‘t’, ‘d’, ‘k’, ‘g’ or fricative-labiodental consonant ‘f’, and the second one must be ‘r’ or ‘l’, taking into account that ‘l’ cannot be preceded by ‘d’ or, in the Spanish used at some geographical areas, by ‘t’ (the group ‘tl’ is not native of Spanish, but it is present in many words from native american languages which have been incorporated to Spanish).

I do believe now that the mentioned 'k' refers to the phoneme, rather than only the letter k, given the examples you've provided, where it is clear that both "cl" and "cr" can be the onset of a Spanish syllable. I checked the original C++ library and it only deals with the letter "k" too.
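As a rough illustration of the rule quoted above, here is a minimal sketch of the two-consonant onset check with 'c' accepted alongside 'k'. It is not pylabeador's actual code: the names `ONSET_FIRST`, `ONSET_SECOND`, `is_valid_onset_pair` and the `allow_tl` flag are hypothetical and only mirror the wording of the paper.

```python
# Hypothetical sketch of the onset rule from the quoted paragraph;
# not taken from pylabeador or the original C++ library.

# Letters that may open a two-consonant onset. 'c' is added next to 'k'
# on the assumption that the paper's 'k' means the /k/ phoneme.
ONSET_FIRST = set("pbtdkgfc")
# Letters that may close a two-consonant onset.
ONSET_SECOND = set("rl")


def is_valid_onset_pair(first: str, second: str, allow_tl: bool = True) -> bool:
    """Return True if the two letters can start a syllable together."""
    first, second = first.lower(), second.lower()
    if first not in ONSET_FIRST or second not in ONSET_SECOND:
        return False
    if first == "d" and second == "l":
        return False  # 'l' cannot be preceded by 'd'
    if first == "t" and second == "l" and not allow_tl:
        return False  # 'tl' is rejected in some geographical areas
    return True
```

Under this sketch, `is_valid_onset_pair("c", "r")` and `is_valid_onset_pair("c", "l")` both hold, so 'ocre' would split as o-cre and 'Tecla' as Te-cla rather than leaving the 'c' behind.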

I'll try extending the rule and see what breaks :D

I agree with your last comment about c/k. I also tried to test the behaviour of 'v', since it should match that of 'b' and it is not listed in the excerpt above.

I'll let you know if I find more bugs in the future.

Thank you for your work
M

I could not think of any Spanish word with vl or vr, though.

Anyway, this is now fixed and v0.5.0 is on PyPI. Thanks again for checking this out, I hope the tool is useful to you :)