bagit-profiles/bagit-profiles-validator

Improve abstractions

tdilauro opened this issue · 9 comments

Currently, this library has two key abstractions: the BagIt Bag and the BagIt Profiles Profile. There are other types for reporting and exception handling, but these are the primary ones.

In the current implementation, Profile has a lot of responsibilities. It holds a model (described by some version of the BagIt Profiles Specification), it performs validation of a profile de-serialized into the model, and it validates bags against that profile.

Ideally, we would have separate types representing

  • the model,
  • the profile validator, and
  • the bag validator.
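A minimal sketch of what that three-way split might look like. All class names, fields, and checks here are hypothetical illustrations, not the library's current or planned API, and the profile is reduced to an identifier plus a list of required tag names.

```python
from dataclasses import dataclass
from typing import List, Mapping, Tuple


@dataclass(frozen=True)
class ProfileModel:
    """Hypothetical immutable holder for a de-serialized profile (data only)."""
    identifier: str
    required_tags: Tuple[str, ...] = ()


class ProfileValidator:
    """Hypothetical: checks that a ProfileModel is itself well formed."""

    def validate(self, profile: ProfileModel) -> List[str]:
        errors = []
        if not profile.identifier:
            errors.append("profile has no identifier")
        if len(set(profile.required_tags)) != len(profile.required_tags):
            errors.append("duplicate required tag")
        return errors


class BagValidator:
    """Hypothetical: checks bag metadata against an already verified profile."""

    def __init__(self, profile: ProfileModel) -> None:
        self.profile = profile

    def validate(self, bag_info: Mapping[str, str]) -> List[str]:
        # Report every required tag that the bag's metadata lacks.
        return [
            f"missing tag: {tag}"
            for tag in self.profile.required_tags
            if tag not in bag_info
        ]
```

With this shape, a profile can be checked once with `ProfileValidator` and then handed to any number of `BagValidator` instances, and each type can be tested in isolation.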

Would appreciate your thoughts on this, @ruebot and @jscancella.

Abstraction is good, but what are the current pain points? What issues if any will this cause for backwards compatibility? Who is currently using the library and what are their thoughts?

@jscancella Excellent questions. Thank you!

Abstraction is good, but what are the current pain points?

A few of my pain points:

  • As a person who develops BagIt Profiles, I want to be able to design and validate profiles ahead of time, so I can know that a profile is correct before I introduce it into my bag validation workflow.
  • As a system that validates bags, I want to be able to concurrently validate multiple bags against a verified profile, so that I can improve my performance.
  • As a maintainer of the BagIt Profiles Validator, I would like to be able to develop better tests, so that I feel more confident in their correctness and reliability.
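The concurrency use case could be sketched roughly as follows, assuming the verified profile is read-only and can therefore be shared safely across workers. The names `VERIFIED_PROFILE`, `validate_bag`, and `validate_many` are hypothetical, the profile is reduced to a set of required tag names, and a thread pool is used only for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical verified profile: a read-only set of required tag names.
VERIFIED_PROFILE = frozenset({"Source-Organization", "Contact-Name"})


def validate_bag(bag_info):
    """Return the sorted required tags missing from one bag's metadata."""
    return sorted(VERIFIED_PROFILE.difference(bag_info))


def validate_many(bags, max_workers=4):
    """Validate several bags concurrently against the shared profile.

    Executor.map returns results in the same order as the input bags.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(validate_bag, bags))
```

Because the workers only read the profile, no locking is needed. A process pool could be swapped in if validation turns out to be CPU-bound rather than I/O-bound.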

What issues if any will this cause for backwards compatibility?

I think it would be possible to maintain backward compatibility with the command line utility, but there would definitely be differences in the way the library works. Semantically, I would expect this to be a major version release.

Who is currently using the library and what are their thoughts?

I would def like to get community feedback relatively soon, but just wanted to get thoughts from the two others working most closely on the profiles and/or validator right now.

Excellent, looks like you have thought about this for a while. I remember when I completely redesigned bagit-java because it was hard to work with. Undoubtedly you will refine your ideas as you go, so my advice would be to iterate. Only after rewriting bagit-java like 4 times do I feel like it is probably the most optimized it will ever be while being simple to use.

I say go ahead and plan out some high level changes. Then make a branch or fork and make something we can see and play with; that will help us give more constructive feedback.

Thanks! I have indeed been thinking about it and have already done some work. Hopefully have something to share as a first iteration soon.

@tdilauro this is great! Really happy to see it 😄

We don't use BagIt or BagIt Profiles at my institution, and I'm only there ~15% of my time right now, so feel free to have at it.

@helrond please jump in if you have any thoughts.

Again, really happy to see folks actually using this in reality, rather than my hypothetical use cases at York.

I don't have anything to add, other than I think this sounds promising and would be happy to take a look at code as it develops.

@tdilauro do you have a pypi.org account? Just realized I should get you added as a maintainer 😄

I didn't, but now I've added one: iamtimmo