The data is provided as a list of tweet ids accompanied by a list of 2-tuples (location phrase, location type) for the three benchmark subsets in the corresponding folders (COVID and MIXED). The preprocess.py file contains a pre-processing script to clean, anonymize, and tokenize the tweets. A preview of the annotation tool is provided as an HTML file.
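
As a rough illustration of the description above, the sketch below shows one way a record and a clean/anonymize/tokenize step could look in Python. It is not the code in preprocess.py. It assumes each tweet id carries its own list of annotations, and the record layout, the example id, the location type labels, the `URL`/`USER` placeholders, and the `preprocess` function shown here are all hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical in-memory form of one record: a tweet id plus its
# (location phrase, location type) 2-tuples. The id and the type
# labels are invented for illustration only.
Annotation = Tuple[str, str]

record = {
    "tweet_id": "1234567890",
    "locations": [("New York", "city"), ("Central Park", "poi")],
}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def preprocess(text: str) -> List[str]:
    """Clean, anonymize, and tokenize one tweet (illustrative only)."""
    text = URL_RE.sub("URL", text)       # clean: replace links with a placeholder
    text = MENTION_RE.sub("USER", text)  # anonymize: mask user mentions
    text = " ".join(text.split())        # clean: collapse repeated whitespace
    return TOKEN_RE.findall(text)        # tokenize: words and single punctuation marks


tokens = preprocess("@alice Stuck in  New York, see https://example.com/x")
print(tokens)
```

The regular expressions here are deliberately simple. The actual script may handle hashtags, emoji, and other tweet-specific tokens differently.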