Incorrect indefinite article "an" returned when handling uncommon abbreviations
tonywu7 opened this issue · 5 comments
Version
inflect==5.3.0
python==3.9.6
Problem
inflect.engine.a() returns the article "an" instead of "a" for some abbreviations that do not begin with vowel phonemes. Observe:
>>> import inflect
>>> p = inflect.engine()
>>> p.a('JSON code block')
'a JSON code block'
>>> p.a('YAML code block')
'an YAML code block'
>>> p.a('Core ML function')
'an Core ML function'
>>> p.a('TLS connection')
'an TLS connection'
>>> p.a('CSS selector')
'an CSS selector'
This seems to be caused by the A_abbrev regex (lines 1806 to 1813 in 98e19e3), which is missing the ^ start-of-line assertion. As a result, it matches parts of an abbreviation that would require "an" even though they are not at the beginning of the word:
>>> A_abbrev.search('TLS connection')
<re.Match object; span=(1, 3), match='LS'>
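To make the effect of the missing anchor easier to see in isolation, here is a minimal sketch. The pattern below is a simplified stand-in, not the library's actual A_abbrev (which is longer and has extra exclusions); the names UNANCHORED, ANCHORED, and needs_an are made up for illustration.

```python
import re

# Simplified stand-in for A_abbrev (an assumption, NOT the real pattern):
# a capital letter whose spoken name starts with a vowel sound
# (F, H, L, M, N, R, S, X), followed by another capital letter.
UNANCHORED = re.compile(r"[FHLMNRSX][A-Z]")
ANCHORED = re.compile(r"^[FHLMNRSX][A-Z]")


def needs_an(pattern, phrase):
    """Return True if the pattern finds an 'an'-style abbreviation in phrase."""
    return pattern.search(phrase) is not None


for phrase in ("TLS connection", "YAML code block", "SSL connection", "RSA algorithm"):
    # Print the phrase with the unanchored result, then the anchored result.
    print(phrase, needs_an(UNANCHORED, phrase), needs_an(ANCHORED, phrase))
```

Without the ^, search() is free to match "LS" inside "TLS" or "ML" inside "YAML", so those phrases are wrongly classified. With the ^, only phrases that actually start with such a letter pair (like "SSL" or "RSA") match.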
After fixing it:
>>> p.a('JSON code block')
'a JSON code block'
>>> p.a('YAML code block')
'a YAML code block'
>>> p.a('Core ML function')
'a Core ML function'
>>> p.a('TLS connection')
'a TLS connection'
>>> p.a('SSL connection')
'an SSL connection'
>>> p.a('RSA algorithm')
'an RSA algorithm'
Seems reasonable. Can you supply a patch?
I've added tonywu7's modification at https://github.com/kimgerdes/inflect (the smallest fork of all time, just a little ^)
@kimgerdes is it okay if I use your fork, and does it have the same license as this repo? I am running into the same issue. Thank you!
sure thing. same license :)
Fixed in 6.0.3