wordipy is a toy package for NLP-related applications. For example, it can build abbreviations, expand tuple words (such as "double" or "triple"), and convert amounts written in words into numbers.
Install the following if you don't already have them.
- Python 3 (3.7 or higher)
- NLTK
pip3 install --user -U nltk
or
conda install -c anaconda nltk
- word2number
pip3 install word2number
or
conda install -c conda-forge word2number
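To confirm that both dependencies are importable before installing wordipy, a quick check along these lines may help. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name and is not part of wordipy.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "nltk" and "word2number" are the import names of the two dependencies.
    missing = missing_modules(["nltk", "word2number"])
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

If anything is reported as missing, install it with one of the pip3 or conda commands above.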
Clone this repository
git clone https://github.com/amitjslearn/wordipy.git
Open a terminal (or Command Prompt; not tested on Windows) and cd into the repository directory
python3 setup.py install
Open Python in the terminal
>>> from wordipy import abbreviationize
>>> abbreviationize("C M")
'CM'
>>> abbreviationize("Triple A")
'AAA'
>>> abbreviationize('Double Bam', sep=",")
'Bam, Bam'
For more help, type:
>>> help(abbreviationize)
>>> from wordipy import amountize
>>> amountize("three hundred dollars")
'$300'
>>> amountize("two yen")
'¥2'
For more help, type:
>>> help(amountize)
- abbreviationize(): letters currently have to be given in capitals. For example, "C M" gives 'CM', but "c m" does not. Lowercase input can be handled as well.
- amountize(): an amount that is already a number can be retained as it is.
- Stop words can be removed if needed.
- Lemmatization and stemming can be applied wherever required.
- Synonyms for currency names can be added: dollars can also be written as usd, and rupees as inr (or even Rs), etc.
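Two of the items above (lowercase letters and currency synonyms) could be handled by normalizing the text before it reaches abbreviationize() or amountize(). The following is only a sketch under stated assumptions: `uppercase_single_letters`, `canonical_currency`, and the `CURRENCY_SYNONYMS` table are hypothetical names that do not exist in wordipy, and whether amountize() recognizes the canonical names used here (in particular "rupees") is an assumption.

```python
# Hypothetical synonym table (not part of wordipy); extend as needed.
CURRENCY_SYNONYMS = {
    "usd": "dollars",
    "inr": "rupees",
    "rs": "rupees",
}


def uppercase_single_letters(text):
    """Uppercase every token that is a single letter, e.g. "c m" -> "C M"."""
    return " ".join(
        tok.upper() if len(tok) == 1 and tok.isalpha() else tok
        for tok in text.split()
    )


def canonical_currency(text):
    """Replace known currency synonyms with one canonical name."""
    words = []
    for tok in text.split():
        # Ignore case and a trailing period, so "Rs." matches "rs".
        key = tok.lower().rstrip(".")
        words.append(CURRENCY_SYNONYMS.get(key, tok))
    return " ".join(words)
```

The results would then be passed on, for example abbreviationize(uppercase_single_letters("c m")). Note that the first helper also uppercases a standalone "a" or "i" in ordinary sentences, so it is best applied only to text that is expected to be an abbreviation.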
Note
- Only the most commonly used tuples (single, double, triple, quadruple) are supported; other tuples can be added as needed.
- Only some currencies are included (such as dollar and yen); other currencies can be added if required.
- Amounts and numbers must be positive and no larger than 999,999,999,999 (i.e. into the billions).
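The range limit in the last note can be checked before calling amountize(). This is a minimal sketch, not wordipy code: `in_supported_range` and `MAX_AMOUNT` are hypothetical names, and treating zero as out of range is an assumption based on the word "positive" in the note.

```python
# Upper bound taken from the note above.
MAX_AMOUNT = 999_999_999_999


def in_supported_range(value):
    """Return True if `value` is positive and at most MAX_AMOUNT."""
    return 0 < value <= MAX_AMOUNT


if __name__ == "__main__":
    for candidate in (300, MAX_AMOUNT, MAX_AMOUNT + 1, -5):
        print(candidate, in_supported_range(candidate))
```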