R package for converting English number words to numeric digits

words2number: Convert English number words into numeric digits

Installation

devtools::install_github("benmarwick/words2number")

This is a fork of verajosemanuel/ESmisc, which is for Spanish numbers. I've changed the Spanish to English and made a few other minor modifications.

How to use

The only function in this package, to_number, translates English words for whole-number quantities into their numerical counterparts: given a quantity spelled out in English, it returns the corresponding integer.

library(words2number)
to_number("forty two")
## [1] 42
to_number("forty-two")
## [1] 42
to_number("fortytwo")
## [1] 42

It can also handle ordinals:

to_number("forty second")
## [1] 42
to_number("forty-third")
## [1] 43
to_number("fortyfourth")
## [1] 44

Usage notes

  • to_number needs clean text: remove extraneous words and symbols from the input before calling it.
  • Quantities must be written in simple, common English (this function is very simple).
  • It currently doesn't handle decimals (e.g. "three point one") or fractions (e.g. "three-quarters").
  • The largest quantities it supports are in the millions range.
  • There are some tests in the package already, but this is experimental, and there may be some numbers that it doesn't work on. Please let me know if you find them!
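
Since to_number expects clean input, one way to prepare free text is sketched below using base R only. The helper clean_number_text and its drop argument are hypothetical names introduced for illustration and are not part of words2number; which words count as extraneous is an assumption you would adjust for your own data.

```r
# Hypothetical helper (not part of words2number): reduce free text to
# lowercase letters, hyphens and single spaces, optionally dropping
# whole words that are not part of the number.
clean_number_text <- function(x, drop = character()) {
  x <- tolower(x)
  # Replace anything that is not a letter, space or hyphen with a space
  x <- gsub("[^a-z -]", " ", x)
  if (length(drop) > 0) {
    # Remove the listed words only when they appear as whole words
    pattern <- paste0("\\b(", paste(drop, collapse = "|"), ")\\b")
    x <- gsub(pattern, " ", x, perl = TRUE)
  }
  # Collapse repeated whitespace and trim both ends
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

cleaned <- clean_number_text("About forty-two dollars!",
                             drop = c("about", "dollars"))
cleaned
## [1] "forty-two"

# Pass the cleaned text on only if the package is installed
if (requireNamespace("words2number", quietly = TRUE)) {
  words2number::to_number(cleaned)
}
```

The drop list is matched on word boundaries, so removing a word such as "and" would not alter "thousand".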

Also relevant

If you want to go the other way and convert digits into English words, you can use the english package. For example, as.character(english::english(5)) will give "five". There is also replace_number() from the textclean package. I took some inspiration for the ordinals from the finnfiddle/words-to-numbers JavaScript library.

Contributing

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.