allo-media/text2num

does not convert simple numbers in English sentences

Closed this issue · 4 comments

Hi,
I am using Python 3.7.

text_to_num.alpha2digit("she is one foot taller than me.", "en")
returns 'she is one foot taller than me.', so 'one' is not converted.

If I change it slightly to:
text_to_num.alpha2digit("she is two feet taller than me.", "en")
then it returns 'she is 2 feet taller than me.'

So a simple conversion does not happen in this case, and I don't know why.

Thanks!

rtxm commented

"one" is quite special, as it is also a pronoun:

  • "The first one"
  • "Another one"
  • "No one"

We cannot easily distinguish between a number and a pronoun when "one" is isolated, so we only convert it when it is part of a sequence, that is, when another number precedes or follows it.
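The rule described above can be sketched with a toy converter. This is only an illustration of the idea and not text2num's actual implementation: the names `DIGIT_WORDS`, `AMBIGUOUS` and `convert_digits` are made up for this example, and it only handles single-digit words.

```python
# Toy illustration only; this is NOT text2num's actual implementation.
# All names below are hypothetical and exist only for this sketch.
DIGIT_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}
AMBIGUOUS = {"one"}  # number words that can also be pronouns
PUNCT = ".,;:!?\"'"


def _core(token):
    """Return the token without leading or trailing punctuation."""
    return token.strip(PUNCT)


def convert_digits(sentence):
    """Convert digit words, but leave an isolated ambiguous word untouched."""
    tokens = sentence.split()
    out = []
    for i, tok in enumerate(tokens):
        core = _core(tok)
        word = core.lower()
        if word not in DIGIT_WORDS:
            out.append(tok)
            continue
        if word in AMBIGUOUS:
            # Only convert when a neighbouring token is also a number word.
            prev_num = i > 0 and _core(tokens[i - 1]).lower() in DIGIT_WORDS
            next_num = (
                i + 1 < len(tokens)
                and _core(tokens[i + 1]).lower() in DIGIT_WORDS
            )
            if not (prev_num or next_num):
                out.append(tok)
                continue
        # Replace the word but keep any surrounding punctuation.
        out.append(tok.replace(core, str(DIGIT_WORDS[word]), 1))
    return " ".join(out)


print(convert_digits("she is one foot taller than me."))
print(convert_digits("she is two feet taller than me."))
print(convert_digits("room one two three"))
```

With this sketch, the first sentence comes back unchanged, the second becomes "she is 2 feet taller than me.", and the third becomes "room 1 2 3", because there "one" is followed by another number word.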

I'm wondering: maybe we should do that for all single digit numbers, shouldn't we?

I would advise against leaving all small numbers in text form; let the user decide how the conversion should be done. Perhaps a flag, e.g. CMS, could make the conversion follow the guidelines of the Chicago Manual of Style (https://getitwriteonline.com/articles/using-numbers/). Its absence would have all numbers converted to digits.
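As a rough sketch of what such an opt-in switch could look like: the function `to_digits` and its `keep_spelled_below` parameter are hypothetical, are not part of text2num, and do not encode any particular style guide. The threshold is left to the caller.

```python
import re

# Hypothetical sketch of an opt-in style threshold; not part of text2num.
WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12,
}
PATTERN = re.compile(r"\b(" + "|".join(WORDS) + r")\b", re.IGNORECASE)


def to_digits(sentence, keep_spelled_below=None):
    """Replace number words with digits.

    If keep_spelled_below is given, values below it stay spelled out.
    If it is None (the default), every recognised number word is converted.
    """
    def repl(match):
        value = WORDS[match.group(0).lower()]
        if keep_spelled_below is not None and value < keep_spelled_below:
            return match.group(0)  # leave the word as written
        return str(value)

    return PATTERN.sub(repl, sentence)


text = "she is one foot taller and twelve years older"
print(to_digits(text))
print(to_digits(text, keep_spelled_below=10))
```

Without the option, both words are converted ("1 foot", "12 years"); with `keep_spelled_below=10`, "one" stays spelled out and only "twelve" becomes 12.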

@rtxm commented on May 7, 2020, 1:12 PM GMT+4:30:

"one" is quite special, as it is also a pronoun:

  • "The first one"
  • "Another one"
  • "No one"

We cannot easily distinguish between a number and a pronoun when "one" is isolated, so we only convert it when it is part of a sequence, that is, when another number precedes or follows it.

I'm wondering: maybe we should do that for all single digit numbers, shouldn't we?

That's even more "buggy" than this current predicament with one.

rtxm commented

See #42