emory-libraries/eulfedora

progressbar versions

Closed this issue · 6 comments

With Python 2.7, we were getting the following error from progressbar:
AttributeError: 'module' object has no attribute 'DataSize'

As it turns out, we had version 2.3, which is what pip install gave us. It looks like we need a newer version of progressbar that includes DataSize under widgets, and those newer versions are published as progressbar2.

Updating progressbar to progressbar2 here worked for us:
https://github.com/emory-libraries/eulfedora/blob/master/setup.py#L58

But didn't want to submit a pull request yet, as maybe there was more thinking behind falling back to progressbar(1) there.

@ghukill thanks for catching this. You're right, it should be progressbar2 instead of progressbar. I think at one point progressbar was an optional dependency (since it isn't technically needed to run eulfedora), but it looks like that's not the case anymore. I'm not sure of the cleanest way to handle that.
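One possible way to treat progressbar as optional again is to guard the import and check for the attribute that was missing. This is only a sketch, not what eulfedora actually does: `load_progressbar` is a hypothetical helper, and it assumes the calling code would skip progress output when it returns `None`.

```python
import importlib


def load_progressbar(module_name="progressbar"):
    """Return a progressbar module that provides DataSize, or None.

    Hypothetical helper for illustration; not part of eulfedora.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Neither progressbar nor progressbar2 is installed.
        return None
    # The old progressbar 2.3 release imports under the same name but
    # has no DataSize attribute, so check for it before relying on it.
    if not hasattr(module, "DataSize"):
        return None
    return module


# None here means "run without a progress bar".
progressbar = load_progressbar()
```

With a guard like this, an old progressbar install would silently disable the progress bar instead of raising the AttributeError above. Whether that is preferable to a hard dependency on progressbar2 is a design call.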

It does look like the line you identified in the setup script is a simple fix, though, and should be corrected. If you want to do a pull request, go ahead. Otherwise, I'll change it when I get a chance.

Sounds good - we can definitely submit a pull request with regards to progressbar.

Might have another small one from syncutil as well, for particularly long datastream labels...

Great.

I think we ran into the same issue with syncutil; there are a couple of fixes in the develop branch that haven't been released yet (see 2042420 and cb55302), and we're also planning to add some better error handling.

Those commits are exactly it. But food for thought: we had to set ours at 750 to safely get through a handful of objects with particularly long labels. We also had to bump up the number on line 206.

Is the danger of setting it considerably higher that you might encroach on another tag? Pondered if you could save the length traversed backwards from len_to_save (or some way similar), but that was at week's end and haven't yet revisited.

@ghukill Do you want to open a second issue to track the syncutil issue?

Setting the length higher is probably better than failing because there isn't enough context present. It should be possible to adjust the regex to find the last occurrence of the datastream info tags we're looking for, and probably not too hard to manufacture a test case that includes multiple datastream tags to confirm it gets the expected values.
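A rough sketch of the "last occurrence" idea, including the kind of multi-tag test case mentioned above. The patterns and the `last_datastream_info` name are assumptions for illustration; the actual regex in syncutil may look quite different, and this version only handles double-quoted attributes with no `>` inside their values.

```python
import re

# Assumed pattern: an opening FOXML datastreamVersion tag, capturing its
# attribute string. Not taken from the syncutil source.
DS_VERSION_TAG = re.compile(r'<foxml:datastreamVersion\b([^>]*)>')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def last_datastream_info(context):
    """Return the attributes of the last datastreamVersion tag in context.

    Returns None when the saved context contains no complete tag.
    """
    last = None
    for match in DS_VERSION_TAG.finditer(context):
        last = match  # keep overwriting so the final match wins
    if last is None:
        return None
    return dict(ATTRIBUTE.findall(last.group(1)))


# Manufactured context with two datastream tags; only the second matters.
sample = (
    '<foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core" SIZE="400">'
    'binary content here'
    '<foxml:datastreamVersion ID="IMG.0" LABEL="A long label" SIZE="1024">'
)
info = last_datastream_info(sample)
```

Because only the final match is kept, a larger saved context can include earlier tags without changing the result. It still fails (returns `None`) if the opening of the last tag has been cut off, which is the case a too-small length would hit.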

We also ran into a problem with old records that had -1 datastream sizes (old fedora bug); I think it's probably ok if those fail, since they should be cleaned up, but I wondered if you all had any thoughts.

Merged in #16 and released eulfedora 1.5.2