S3ObjectFinder.list_to_download is returning a list that includes objects already in the database
In routines.py, S3ObjectFinder().list_to_download() should return a list that does not include any bags that are currently saved to the database. However, it is returning a list that includes some bags that have been saved to the database.
In our bucket linked to development Zorya, there are 3 bags. All 3 bags have the same names as bags saved in the database, so list_to_download should return an empty array. However, it's returning an array with one of the bags.
The files_in_bucket are ['4977307ee0f2493d984484cdb30dbb2b.tar', 'f70e24901427497b8caed6c4d234e7db.tar', 'f7254e2aadc14c849bd6edde66d92307.tar']. The return of list_to_download is ['f70e24901427497b8caed6c4d234e7db.tar']. If I run Bag.objects.filter(original_bag_name__contains='f70e24901427497b8caed6c4d234e7db.tar').exists() outside of this function in the Django shell, it returns True (as it should).
I'm not sure what is happening?
@bonniegee I think this might be because you have an if/elif when I think what you really want is multiple if statements. I suspect what's happening is that the first if is getting triggered, so the elif is never executed.
Also, looking at this after a minute, I think that line 46 should use the exists method: Bag.objects.filter(original_bag_name=join(self.src_dir, filename)).exists()
Thanks @helrond -- I'll try if/if.
I'm not sure where I should add the exists method. Line 46 is a docstring, and line 52 is Bag.objects.filter(original_bag_name__contains=filename).exists().
If/if is returning the same list 😕
Oops I was looking at an older version of the code.
I don't know why yet, but your for loop is only getting executed twice, when really it should be executed three times. I'm not sure if the remove is affecting the iterator. I would suggest just turning lines 49-54 into a one-liner:
[filename for filename in files_in_bucket if expected_file_name(filename) and not Bag.objects.filter(original_bag_name__contains=filename).exists()]
There's probably a more elegant way to write that condition at the end using bool or all.
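The suspicion about remove can be checked without Django. The sketch below is a reconstruction, not the actual routines.py code: it assumes the loop removes items from the same list it iterates over, replaces the Bag query with a hypothetical is_saved helper backed by a set, and uses a made-up body for expected_file_name. Under those assumptions it reproduces the reported result: removing the first item shifts the remaining items left, so the iterator skips the second one and the loop body runs only twice.

```python
# Stand-in for the Bag table; in the real code this is a Django query
# (Bag.objects.filter(original_bag_name__contains=filename).exists()).
saved_bag_names = {
    "4977307ee0f2493d984484cdb30dbb2b.tar",
    "f70e24901427497b8caed6c4d234e7db.tar",
    "f7254e2aadc14c849bd6edde66d92307.tar",
}


def expected_file_name(filename):
    # Hypothetical check; the real implementation is not shown in the thread.
    return filename.endswith(".tar")


def is_saved(filename):
    # Hypothetical helper replacing the database lookup.
    return filename in saved_bag_names


def list_to_download_mutating(files_in_bucket):
    """Assumed shape of the original loop: removes while iterating."""
    files = list(files_in_bucket)
    for filename in files:
        if not expected_file_name(filename):
            files.remove(filename)
        elif is_saved(filename):
            # Removing shifts later items left, so the next item is skipped.
            files.remove(filename)
    return files


def list_to_download_filtered(files_in_bucket):
    """Builds a new list instead of mutating the one being iterated."""
    return [
        filename
        for filename in files_in_bucket
        if expected_file_name(filename) and not is_saved(filename)
    ]


bucket = [
    "4977307ee0f2493d984484cdb30dbb2b.tar",
    "f70e24901427497b8caed6c4d234e7db.tar",
    "f7254e2aadc14c849bd6edde66d92307.tar",
]

print(list_to_download_mutating(bucket))  # ['f70e24901427497b8caed6c4d234e7db.tar']
print(list_to_download_filtered(bucket))  # []
```

Switching if/elif to if/if does not help here because the skipping comes from the list changing length during iteration, not from which branch runs. Iterating over a copy (for filename in list(files)) would also avoid the skip, but the comprehension keeps a file only when its name is expected and no saved bag matches, which states the intent directly.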