Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.
Grab the package using pip
(this will take a few minutes).
Note: the package is not yet available on PyPI.
Import the module, pass it some text or a URL, and presto.
TODO
LocationExtractor is based on:
LocationExtractor uses the following excellent libraries:
- NLTK for entity recognition
- newspaper for text extraction from HTML
- jellyfish for fuzzy text matching
- pycountry for country/region lookups
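To illustrate how fuzzy matching and a lookup table can work together to add context to a place name, here is a minimal sketch. It is not LocationExtractor's actual code or API: `GAZETTEER` and `classify_place` are hypothetical names, the tiny lookup table stands in for pycountry and the GeoLite2 data, and the standard library's `difflib` is swapped in for jellyfish so the example runs without extra dependencies.

```python
import difflib

# Hypothetical, tiny gazetteer for illustration only; a real lookup
# would draw on pycountry and GeoLite2 data instead.
GAZETTEER = {
    "France": "country",
    "Brazil": "country",
    "Ontario": "region",
    "Bavaria": "region",
    "Nairobi": "city",
    "Toronto": "city",
}


def classify_place(name, cutoff=0.8):
    """Return (canonical_name, kind) for a place name, or None if no match."""
    # Exact match first: cheapest and unambiguous.
    if name in GAZETTEER:
        return name, GAZETTEER[name]
    # Fuzzy fallback: difflib stands in for jellyfish here (an assumption,
    # chosen only to keep the sketch dependency-free).
    matches = difflib.get_close_matches(name, list(GAZETTEER), n=1, cutoff=cutoff)
    if matches:
        return matches[0], GAZETTEER[matches[0]]
    return None


if __name__ == "__main__":
    for candidate in ["Toronto", "Brasil", "Xyzzy"]:
        print(candidate, "->", classify_place(candidate))
```

The exact lookup runs first so that well-formed names never pay for fuzzy matching; the `cutoff` parameter controls how tolerant the fallback is of misspellings.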
LocationExtractor uses the following data sources:
- This product includes GeoLite2 data created by MaxMind, available from https://www.maxmind.com
- ISO3166ErrorDictionary for common country misspellings, via Sara-Jayne Terp