openeventdata/mordecai

Issues with the geoparse prediction for China and U.S

akankshanb opened this issue · 1 comments

  1. geoparse resolves "China" to a building (a university) instead of the country when China is used in a sentence:
    geo.geoparse('We traveled to China')
[{'country_conf': 0.68758196,
  'country_predicted': 'CHN',
  'geo': {'admin1': 'Hubei',
          'country_code3': 'CHN',
          'feature_class': 'S',
          'feature_code': 'SCHC',
          'geonameid': '6620465',
          'lat': '30.52047',
          'lon': '114.39637',
          'place_name': 'China University of Geosciences'},
  'spans': [{'end': 20, 'start': 15}],
  'word': 'China'}]
  2. It predicts Canada as the country when "U.S" is used in the sentence, and returns no matched place:
    geo.geoparse('We traveled to the U.S')
[{'country_conf': 0.28868943,
  'country_predicted': 'CAN',
  'spans': [{'end': 22, 'start': 19}],
  'word': 'U.S'}]

Kindly look into this. Thanks!

Thanks for the report. Mordecai was mostly built to geolocate subnational locations, but several people have requested country-level resolution, so I'm planning to add it in the next rewrite.
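
In the meantime, a caller could screen results like the two reported above before trusting them. The following is a minimal sketch, assuming the output format shown in this issue. `flag_suspect` and the 0.5 confidence threshold are hypothetical and not part of Mordecai; the check on `feature_class == 'S'` relies on GeoNames using class S for spots and buildings.

```python
def flag_suspect(results, min_conf=0.5):
    """Return (word, reasons) pairs for geoparse results that look unreliable.

    Hypothetical helper: it only inspects dicts shaped like the output
    quoted in this issue and does not call Mordecai itself.
    """
    flagged = []
    for r in results:
        reasons = []
        # A low score means the country classifier was unsure.
        if float(r.get("country_conf", 0.0)) < min_conf:
            reasons.append("low country confidence")
        geo = r.get("geo")
        if geo is None:
            # No gazetteer entry was attached to the mention.
            reasons.append("no gazetteer match")
        elif geo.get("feature_class") == "S":
            # GeoNames class S covers spots, buildings, and farms.
            reasons.append("matched a spot/building feature")
        if reasons:
            flagged.append((r["word"], reasons))
    return flagged


# The two results quoted in this issue, trimmed to the fields used above.
china = [{"country_conf": 0.68758196,
          "country_predicted": "CHN",
          "geo": {"feature_class": "S", "feature_code": "SCHC",
                  "place_name": "China University of Geosciences"},
          "word": "China"}]
us = [{"country_conf": 0.28868943,
       "country_predicted": "CAN",
       "word": "U.S"}]

for word, reasons in flag_suspect(china + us):
    print(word, "->", ", ".join(reasons))
```

Both examples are flagged: "China" because it matched a building-class feature, and "U.S" because the confidence is low and no place was matched. The threshold is arbitrary and would need tuning against real data.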