django-cities raises a MultipleObjectsReturned exception while importing
mayela opened this issue · 22 comments
Checklist
- I have verified that I am using a GIS-enabled database, such as PostGIS or Spatialite.
- I have verified that the issue exists against the master branch of django-cities.
- I have searched for similar issues in both open and closed tickets and cannot find a duplicate.
- I have reduced the issue to the simplest possible case.
- I have included a failing test as a pull request. (If you are unable to do so we can still accept the issue.)
Steps to reproduce
python project/manage.py cities --import=all
Expected behavior
Successful import
Actual behavior
Traceback (most recent call last):
File "project/manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/core/management/__init__.py", line 363, in execute_from_command_line
utility.execute()
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/core/management/__init__.py", line 355, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/core/management/base.py", line 283, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/usr/lib/python3.4/contextlib.py", line 30, in inner
return func(*args, **kwds)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/cities/management/commands/cities.py", line 160, in handle
func()
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/cities/management/commands/cities.py", line 1005, in import_postal_code
region__country=pc.country)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/vagrant/environment/venv/lib/python3.4/site-packages/django/db/models/query.py", line 384, in get
(self.model._meta.object_name, num)
cities.models.MultipleObjectsReturned: get() returned more than one Subregion -- it returned 2!
I am also encountering this issue. As the traceback above shows, the problem is that some Subregion objects are not unique by the combination of region name, subregion name, and country. I've determined that there are, in fact, 56 subregion names that are non-unique for their countries in the data; they are listed below as (region id, subregion name, number of matching Subregion objects). These represent 119 Subregion objects, with the following ids.
[10402309, 2240688, 7839621, 7839620, 6158841, 6158840, 9179465, 8260564, 10195120, 10192083, 10195161, 10192085, 10193946, 10193226, 10195159, 10193630, 10195184, 10194916, 10201610, 10201580, 10346907, 10346906, 10347016, 10346968, 10347121, 10347095, 2413641, 2413640, 449616, 66575, 449585, 68471, 449494, 122490, 1482479, 1158999, 11495044, 11495013, 11495327, 11495315, 2111742, 2111741, 11497558, 2129758, 1521359, 1521352, 7669283, 2276212, 8051342, 7910308, 1736645, 1732562, 11600787, 11600418, 11600445, 11600442, 11600460, 11600452, 11600486, 11600458, 11607562, 11606028, 11600691, 11600481, 11608001, 11600482, 11608044, 11600488, 11600492, 11607518, 11608040, 11608123, 11606129, 11600493, 11605860, 11600687, 11608422, 11607999, 11601327, 11608130, 11606060, 11608034, 11606251, 11606747, 11606700, 11607574, 11607526, 11607531, 11607528, 11621653, 11607570, 11607567, 11607566, 11607991, 11607571, 11608088, 11607985, 11608073, 11608033, 11608423, 11608421, 11621659, 11621593, 11621643, 11621628, 7530824, 6690157, 7530825, 6690163, 11496463, 1619260, 7870615, 2473636, 7870604, 2473663, 9197251, 699293, 1538277, 1514190]
(2239858, 'Kiwaba Nzoji'): 2
(2058645, 'Narrogin'): 2
(6093943, 'Sudbury'): 2
(209610, 'Maniema'): 2
(146267, 'Ágios Epifánios'): 2
(146267, 'Kaló Chorió'): 2
(146267, 'Agía Marína'): 2
(146267, 'Ágios Geórgios'): 2
(146267, 'Ágios Theódoros'): 2
(146615, 'Ágios Andrónikos'): 2
(146398, 'Kivisíli'): 2
(146383, 'Vása'): 2
(146213, 'Filoúsa'): 2
(2412353, 'Dappo'): 2
(128231, 'Shahrestān-e Sīrjān'): 2
(134766, 'Shahrestān-e Fasā'): 2
(418862, 'Shahrestān-e Naţanz'): 2
(1159456, 'Shahrestān-e Īrānshahr'): 2
(3488715, 'Gibraltar'): 2
(3488081, 'Amity'): 2
(2112669, 'Naka-gun'): 2
(2130037, 'Kamikawa-gun'): 2
(1519367, 'Lenīn Aūdany'): 2
(2278292, 'Jorquelleh'): 2
(2380635, 'Aleg'): 2
(1733039, 'Bahagian Pantai Barat'): 2
(11205571, 'Raghunāthpur'): 2
(11205571, 'Chyuṭāhā'): 2
(11205571, 'Kachorwā'): 2
(11205571, 'Pipari̇̄yā'): 2
(11205571, 'Sundarpur'): 4
(11205571, 'Paḍari̇̄yā'): 2
(11205571, 'Kabilāsi̇̄'): 2
(11205571, 'Aurahi̇̄'): 4
(11205571, 'Moti̇̄pur'): 2
(11205571, 'Basantapur'): 2
(11205571, 'Maheshpur'): 3
(11205571, 'Duhabi̇̄'): 2
(11205571, 'Lakṣmi̇̄pur'): 2
(11205571, 'Barāhakṣetra'): 2
(11205571, 'Kochābakhāri̇̄'): 2
(11205571, 'Barahi̇̄ Birpur'): 2
(11205571, 'Dharmapur'): 4
(11205571, 'Piprā'): 2
(11205571, 'Hanumānnagar'): 2
(11205571, 'Arnamā'): 2
(11205571, 'Barchhawā'): 2
(11205571, 'Basbiṭṭi̇̄'): 2
(11205571, 'Gopālpur'): 2
(858786, 'Powiat nowosądecki'): 2
(858786, 'Powiat tarnowski'): 2
(1607530, 'Amphoe Bang Sai'): 2
(2473637, 'Kef Est'): 2
(2472770, 'Nefza'): 2
(696634, 'Novosanzhars’kyy Rayon'): 2
(1484842, 'Chust Tumani'): 2
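A check like the one that produced the list above can be sketched in plain Python. This is only an illustration, not the script that was actually used: the rows are made-up dictionaries standing in for Subregion objects, and `duplicate_keys` is a hypothetical helper name. It also compares names exactly, whereas the importer's lookup is case-insensitive and checks `name_std` as well.

```python
from collections import Counter


def duplicate_keys(subregions):
    """Map each (region_id, name) pair that occurs more than once to its count."""
    counts = Counter((s["region_id"], s["name"]) for s in subregions)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows (not taken from the GeoNames data).
sample = [
    {"id": 1, "region_id": 100, "name": "Alpha"},
    {"id": 2, "region_id": 100, "name": "Alpha"},
    {"id": 3, "region_id": 100, "name": "Beta"},
    {"id": 4, "region_id": 200, "name": "Alpha"},
]

dupes = duplicate_keys(sample)
for (region_id, name), n in dupes.items():
    print((region_id, name), n)
```

Against a populated database, the same grouping could probably be done with the ORM, for example `Subregion.objects.values('region_id', 'name').annotate(n=Count('id')).filter(n__gt=1)`, though that query has not been run against this project.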
Thank you for reporting this, I'll add that data to the tests and fix it when I can. I don't have a lot of free time at the moment.
I also encounter the same problem.
same here
@blag I can help by sending the test, but the info given by @joshourisman is duplicated, so we have to avoid that, right?
Thanks in advance.
@mayela I think the fix for this is:
diff --git a/cities/management/commands/cities.py b/cities/management/commands/cities.py
index a71b359..1972838 100644
--- a/cities/management/commands/cities.py
+++ b/cities/management/commands/cities.py
@@ -1005,6 +1005,19 @@ class Command(BaseCommand):
region__country=pc.country)
except Subregion.DoesNotExist:
pc.subregion = None
+ except Subregion.MultipleObjectsReturned:
+ self.logger.warn("Found multiple subregions for '{}' in '{}' - ignoring".format(
+ pc.region_name,
+ pc.subregion_name))
+ self.logger.debug("item: {}\nsubregions: {}".format(
+ item,
+ Subregion.objects.filter(
+ Q(region__name_std__iexact=pc.region_name) |
+ Q(region__name__iexact=pc.region_name),
+ Q(name_std__iexact=pc.subregion_name) |
+ Q(name__iexact=pc.subregion_name),
+ region__country=pc.country).values_list('id', flat=True)))
+ pc.subregion = None
else:
pc.subregion = None
but I don't really have time to add it to the test data and verify it.
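The control flow of that patch can be sketched without Django. This is a minimal illustration under stated assumptions: the two exception classes are stand-ins for Django's `DoesNotExist` and `MultipleObjectsReturned`, `get_one` imitates the behaviour of `QuerySet.get()` over a list of dictionaries, and `link_subregion` is a hypothetical name. The point it shows is that an ambiguous lookup ends up unlinked, exactly like a missing one.

```python
class DoesNotExist(Exception):
    """Stand-in for Subregion.DoesNotExist."""


class MultipleObjectsReturned(Exception):
    """Stand-in for Subregion.MultipleObjectsReturned."""


def get_one(rows, **criteria):
    """Imitate QuerySet.get(): return exactly one match or raise."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]
    if not matches:
        raise DoesNotExist("no match for {}".format(criteria))
    if len(matches) > 1:
        raise MultipleObjectsReturned(
            "get() returned more than one row -- it returned {}!".format(len(matches)))
    return matches[0]


def link_subregion(rows, region_id, name):
    """Return the unique matching row, or None when it is missing or ambiguous."""
    try:
        return get_one(rows, region_id=region_id, name=name)
    except DoesNotExist:
        return None
    except MultipleObjectsReturned:
        # Same outcome as the patch: the import continues, but nothing is linked.
        return None


# Made-up rows: "A" is duplicated within region 10, "B" is unique.
rows = [
    {"id": 1, "region_id": 10, "name": "A"},
    {"id": 2, "region_id": 10, "name": "A"},
    {"id": 3, "region_id": 10, "name": "B"},
]
```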
works for me
any update on getting this fixed merged into master?
@pstreck No update yet, but I did get laid off last Friday! So I'll have some free time later this week to fix this, add the test data, and push it to PyPI.
I've been working on this, but I'm having trouble reproducing the issue. If it's not working for you, please post your cities configuration variables from your project's settings.py module.
I have tried commenting out the entire CITIES_FILES variable in test_project/settings.py and then running:
PYTHONPATH=. python test_project/manage.py cities --import=all
from the root of the repository (Python 3.5.3, Django 2.0). All subregions seem to import successfully.
If you can reproduce the issue that way, please also set TRAVIS_LOG_LEVEL to DEBUG when you run the command:
PYTHONPATH=. TRAVIS_LOG_LEVEL=DEBUG python test_project/manage.py cities --import=all 2>&1 | tee everything.log
and copy/paste the everything.log file to a pastebin. Please include your Python and Django versions.
I'm not done wrestling with this, but I don't have as much free time as I thought I would to fix it. Any direction or further troubleshooting information you can give me will help me. Thanks!
@77cc33 It "worked" in that it doesn't error out anymore, but it doesn't connect up postal codes to their subregion in that case either.
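If keeping the link matters more than which duplicate is chosen, one alternative is to pick a match deterministically instead of discarding all of them. This is only a sketch of that idea, not what the patch above does and not something the maintainers have endorsed; `pick_subregion` is a hypothetical helper, the candidates are made-up dictionaries, and "lowest id wins" is an arbitrary tie-break assumed for illustration.

```python
def pick_subregion(candidates):
    """Return the candidate with the smallest id, or None if there are none."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: c["id"])


# Made-up duplicates that share a region and a name.
candidates = [
    {"id": 42, "region_id": 10, "name": "A"},
    {"id": 7, "region_id": 10, "name": "A"},
]
chosen = pick_subregion(candidates)
```

Whether an arbitrary choice is acceptable depends on how the duplicates differ in the source data, which nobody in this thread has established.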
I am facing a similar issue. While importing postal codes, I get the following error. Please find the stack trace below.
Importing postal codes: 23%|█▍ | 293363/1264588 [2:12:35<7:18:56, 36.88it/s]
Traceback (most recent call last):
File "manage.py", line 15, in <module>
execute_from_command_line(sys.argv)
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
utility.execute()
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/core/management/__init__.py", line 365, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/core/management/base.py", line 288, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/core/management/base.py", line 335, in execute
output = self.handle(*args, **options)
File "/usr/lib/python3.5/contextlib.py", line 30, in inner
return func(*args, **kwds)
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/cities/management/commands/cities.py", line 160, in handle
func()
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/cities/management/commands/cities.py", line 1061, in import_postal_code
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/cities/management/commands/cities.py", line 1029, in import_postal_code
Q(city__region__name_std__iexact=pc.region_name) |
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/manimaran/4.0/v4/lib/python3.5/site-packages/django/db/models/query.py", line 407, in get
(self.model._meta.object_name, num)
cities.models.MultipleObjectsReturned: get() returned more than one District -- it returned 2!
I've deployed this and hit this error. Django 2.0.7, Python 3.7 in a virtualenv.
I'm able to avoid this error if I only import certain countries, such as US, CA, AU. I tried UK and this failed.
I've just applied the patch posted earlier in this thread and am trying a full import, but another comment suggested that it doesn't link those postal codes, so I'm a little worried about them. I would really appreciate it if this could be fixed so that the data is consistent. Maybe post your full config so we can confirm how your settings differ?
I'm using SQLite for testing. My settings.py:
CITIES_VALIDATE_POSTAL_CODES = True
#CITIES_POSTAL_CODES = ['US', 'CA', 'AU']
CITIES_FILES = {
    'city': {
        'filename': 'cities1000.zip',
        'urls': ['http://download.geonames.org/export/dump/'+'{filename}']
    },
}
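For readers unfamiliar with this setting, the `'{filename}'` placeholder in `urls` is a template for the download location. The sketch below shows how such a template could be expanded, assuming the importer fills it in with `str.format`, which this thread does not confirm; `resolve_urls` is a hypothetical helper, not part of django-cities.

```python
CITIES_FILES = {
    'city': {
        'filename': 'cities1000.zip',
        'urls': ['http://download.geonames.org/export/dump/' + '{filename}'],
    },
}


def resolve_urls(files, key):
    """Expand the '{filename}' placeholder in every URL template for one entry."""
    entry = files[key]
    return [url.format(filename=entry['filename']) for url in entry['urls']]


city_urls = resolve_urls(CITIES_FILES, 'city')
print(city_urls)
```

Swapping `'cities1000.zip'` for a per-country dump such as `'US.zip'` changes only the final path segment of the resulting URL.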
Same here importing cities1000.zip.
Is this project dead?
Maintainer (sort of) here. I don’t have as much time for this project as I once did, and I could use some help. If anybody can put this in an up-to-date PR, that would help (make sure you include tests, even if it’s mocking a response or a file), but long term I may need to turn over maintenance to somebody else.
If anybody has time, email coderholic directly, explain a bit about who you are, include a few links to your open source contributions (especially Django apps), and ask to be made a maintainer. Please keep me on as a maintainer, as I still have a vested interest in this project and I may have more time for it in the future.
@blag I have no experience with open source projects, but if you want, I can help you. At least I can show how to crash django-cities. (Maybe afterwards I can also write tests and fixes.)
The topic issue still happens.
Traceback (most recent call last):
File "/home/me/project/app/manage.py", line 24, in <module>
main()
File "/home/me/project/app/manage.py", line 20, in main
execute_from_command_line(sys.argv)
File "/home/me/project/env/lib/python3.7/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/home/me/project/env/lib/python3.7/site-packages/django/core/management/__init__.py", line 375, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/me/project/env/lib/python3.7/site-packages/django/core/management/base.py", line 323, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/me/project/env/lib/python3.7/site-packages/django/core/management/base.py", line 364, in execute
output = self.handle(*args, **options)
File "/usr/lib/python3.7/contextlib.py", line 74, in inner
return func(*args, **kwds)
File "/home/me/project/env/lib/python3.7/site-packages/cities/management/commands/cities.py", line 160, in handle
func()
File "/home/me/project/env/lib/python3.7/site-packages/cities/management/commands/cities.py", line 1006, in import_postal_code
region__country=pc.country)
File "/home/me/project/env/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/me/project/env/lib/python3.7/site-packages/django/db/models/query.py", line 412, in get
(self.model._meta.object_name, num)
cities.models.MultipleObjectsReturned: get() returned more than one Subregion -- it returned 2!
The same issue
CITIES_FILES = {
    'city': {
        'filename': 'US.zip',
        'urls': ['http://download.geonames.org/export/dump/'+'{filename}']
    },
}
CITIES_LOCALES = ['LANGUAGES']
CITIES_POSTAL_CODES = ['US']
Has anyone managed to make this work?
The same issue
CITIES_FILES = {
    'city': {
        'filename': 'cities1000.zip',
        'urls': ['http://download.geonames.org/export/dump/'+'{filename}']
    },
}
Same here for Districts when the import of postal_codes starts.
So it seems to be a problem with the data coming from geonames.org?
CITIES_FILES = {
    'city': {
        'filename': 'DE.zip',
        'urls': ['http://download.geonames.org/export/dump/'+'{filename}']
    },
}
(django 3.1.2, python 3.8)