Typos and common abbreviations (e.g., 'luv' -> 'love', 'gr8' -> 'great') are widespread on social media such as Facebook, Twitter and WhatsApp. They prevent NL parsers from recognizing syntax and pose many challenges for natural language processing.
This project aims to manually compile a dictionary of common typos and their corrections.
$ git clone https://github.com/guxd/typo_dict.git
- typos_en_social.py: Typos in social media, e.g., Twitter, Facebook, WhatsApp
- typos_en_program.py: Typos in programming environments, e.g., StackOverflow
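
A typo dictionary of this kind could be applied to text roughly as follows. This is a minimal sketch, not the project's actual API: the `TYPO_DICT` mapping and the `correct_typos` helper are hypothetical names, and the two entries are just the examples from the introduction. How the shipped files expose their data is not specified here, so adapt the lookup to whatever structure they define.

```python
import re

# Hypothetical sample entries taken from the examples in this README;
# the real dictionaries live in the repository's .py files.
TYPO_DICT = {"luv": "love", "gr8": "great"}

# A token is a run of letters, digits, or apostrophes (an assumption;
# real social media text may need a more careful tokenizer).
_TOKEN = re.compile(r"[A-Za-z0-9']+")


def correct_typos(text, typo_dict=TYPO_DICT):
    """Replace each whole token found in typo_dict with its correction."""

    def replace(match):
        word = match.group(0)
        # Look up the lowercased token; keep the original if it is unknown.
        return typo_dict.get(word.lower(), word)

    return _TOKEN.sub(replace, text)


print(correct_typos("I luv this, it is gr8!"))  # I love this, it is great!
```

Matching whole tokens rather than substrings keeps words such as "glove" from being altered by a shorter entry like "luv".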