backup/backup

Looking for maintainers

tombruijn opened this issue · 73 comments

Hi Backup users,

Tom here, current maintainer of the Backup gem.

After two years of maintaining Backup, I don't have a lot of time or motivation anymore to keep track of all the issues and actively maintain the gem. And the original creator, @mrrooijen, is busy with other things.

So we're looking for people to help out and maintain the gem. I will still help out a little bit and answer any questions you have as best I can.

Comment on this issue if you're interested.

My employer is a Ruby on Rails shop, and we use backup extensively, so I'd be willing to do this.

Hi @stuartellis! Great to hear. I've given you contributor access. Let me know if you have any questions or need an introduction to the internals. You can contact me via the email address on my GitHub profile, on Gitter, or let me know what your favorite communication method would be.

Hi @tombruijn - I've accepted access and sent you an email. I'll now be on the Gitter room, as time allows (i.e. unfortunately I can't be there in UK working hours).

@tombruijn I'm a big backup fan and the work you and the community have done is amazing. I'd be happy to help maintain this library. I'd also really like to take a stab at the Dropbox API v2 before v1 is deprecated in June. 👍

Hi @shakycode, sorry for not responding sooner. As you can see I don't have a lot of time to spend on Backup at the moment.

I've given you contributor access. It would be great if you could take a look at the dropbox API as you suggested :)
Let me know if you have any questions or need an introduction to the internals. You can reach me via email and Gitter chat.

I can throw my hat into the ring as well. I'm a big fan, and I have some time here and there to at least comb through issues and PRs periodically.

Hi @oogali, great! Thanks for wanting to help out!
I invited you to the backup org. Let me know if you need any help getting familiar with the project. Also, feel free to join the Gitter chat rooms if you haven't already https://gitter.im/backup/backup and https://gitter.im/backup/backup-dev

I'm finally back from a hiatus and working on dropbox api v2 compatibility. I can also probably take the Digital Ocean spaces feature and work it in too since I'm very familiar with their API. Not sure on Google cloud platform, as it seems it's not a major request. cc: @tombruijn thoughts?

Hi @nynhex, I was just considering this weekend to take out the dropbox storage now that the v1 API is disabled. It is non-functional now, right? If you want to pick it up: great!

My reasoning for closing the two feature request issues is that Backup is now in maintenance mode and adding something based on one user request is not something I would want to do. Only if there's more demand for it we should consider it. If people send in PRs and are willing to maintain the feature after it's merged, I won't say no to that.

Hi. I'm using backup gem on some of my servers on a regular basis and experimenting with various storage providers regularly, so I think I'll be good at dogfooding.

I was a maintainer of this gem before, alas with a short (some months) window of real activity. So I don't think I'll be a good "main" maintainer. But I'd be happy to contribute!

Commits and merges: https://github.com/backup/backup/commits/master?author=tomash

hi again @tomash !
Any help is appreciated. No one does this full-time. I just look through some of the new issues every other weekend.

If you want I can add you as a member on the GitHub organization?
You're free to decide how much you want to be involved, responding to issues, maintaining integrations or both.

Hi @AdibMurshed ,

Thanks for showing interest in the Backup project!

I'm not quite sure what you mean with "I would like to claim this project". Can you clarify?

I encourage you and your team to contribute to this project and I can even add you as contributors to the GitHub organization, but it wasn't my intention to transfer ownership of this project at this time.

@tombruijn I've used Backup for awhile and would like to join the Maintainers group to help keep this project going.

Thanks!

@tombruijn @stuartellis I'm open to having more maintainers. Makes life easier cc @pwelch

Hi @pwelch, that's great news! I've sent an invite! Let me know if you need any information about the internals of the project.

@tombruijn Got it! Do most of the maintainer discussions happen in the project Gitter?

@AdibMurshed alright, not sure what the details would entail.

Backup is free under the MIT license so you can do whatever you want with it as long as you keep the copyright and license intact. And don't abuse the project or its name in any way. Backup, in this main repo, is a community project, by and for the community. Ruby gem publishing rights will remain with the Backup organization owners though.

Again, if you or your team want to contribute, you are free to do so to this repository. No need to claim an open source MIT project to do so. Also, I have considerable doubts about the strategy to try and "claim" any project that's looking for maintainers, as I see you've sent the same message to quite a few other projects.

@pwelch I guess we do use Gitter for most discussions. Currently it's a bit quiet as most people are busy with other things.

@stuartellis is currently working on setting up the integration test suite again with Docker: #857, #906, #907, #908.

And we have a PR for fixing Dropbox support in #901, which we still need to fully review.

We've put the project in maintenance mode for now as we try to figure out which services we still want to support, and take out those that we don't. I would still want the backup pipeline to be revised one day as it's currently not very efficient or flexible.

If you want to know more, feel free to ping me @tombruijn on Gitter in the #backup-dev channel.

I'm using the backup gem for a bunch of my customers and I hope to keep doing it. I'd be interested to help with maintaining and extending it (as you can see with my open stack support PR)

Hello @elthariel - we would certainly welcome the help! The previous comment from @tombruijn summarizes the current state of things. I check messages on the #backup-dev Gitter channel every day or so.

Thanks for the answer.
Not exactly a gitter user here, I'm kind of tired of the multiplication of the communication channels. But I'll try to drop by to discuss the code of my PR on my next free cycles :)

Hi, I'm using backup in production and I do have an interest to implement IPFS as a storage backend. So, you can sign me up to the list.

Looks like there are a few of us now. Thoughts on scheduling a meeting (possibly using the Gitter channel?) to discuss moving forward for triaging issues and working on the next release?

Great to see people are interested in helping out!

I have a couple thoughts about continuing backup rather than just fixing immediate bugs and adding new components.

In my opinion there are 2 major problems with backup that should be addressed before continuing work on backup.

The first and biggest issue is the backup pipeline. Currently it creates intermediate files that are used temporarily, which means that if you have a 500GB backup you'll need 500GB of free space while making the backup. This is inefficient and costly. Instead we should look at a real pipeline where streams of data are passed from one step to the other without writing to disk. It basically works like this:

# Example using pipes
# backing up a database dump
pg_dump ... | gzip ... | openssl ... > backup.sql.gz

# or archiving files
tar ... | gzip ... | openssl ... > backup.tar.gz

# sending it directly to AWS S3
tar ... | gzip ... | openssl ... | backup_s3_uploader

This is quite a massive change and would mean we need to touch everything in the pipeline and the components to get them to work in this new pipeline.
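To make the streaming idea concrete, here is a minimal Ruby sketch of one pipeline stage that compresses data chunk by chunk instead of writing an intermediate file. It is not how Backup works today; `stream_compress`, `CHUNK_SIZE` and the chunk size are names and values assumed purely for illustration.

```ruby
require "zlib"
require "stringio"

# Hypothetical chunk size for this sketch; a real pipeline would tune it.
CHUNK_SIZE = 64 * 1024

# Reads +source+ in fixed-size chunks and writes the gzip-compressed bytes
# straight to +destination+. Only one chunk is held in memory at a time and
# nothing is written to a temporary file.
def stream_compress(source, destination, chunk_size: CHUNK_SIZE)
  gzip = Zlib::GzipWriter.new(destination)
  # IO#read(n) returns nil at end of input, which ends the loop.
  while (chunk = source.read(chunk_size))
    gzip.write(chunk)
  end
  # finish writes the gzip trailer but leaves +destination+ open, so a
  # later stage (for example an uploader) can keep using it.
  gzip.finish
  destination
end
```

Any object that responds to `read` and `write` will do here, so the same stage could sit between a database dump process and an upload socket.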

The other issue is the list of components that are part of backup. Having support for many different databases, storage options and notifiers creates many dependencies on the gem, most of which users won't actually use.

We could solve it by splitting backup in several gems so users can pick and choose which gems they actually need. This allows the main backup gem to move forward quicker, but also creates the problem that other gems will need to play catch up with any core changes made in the main gem.

Alternatively we can choose to make it language agnostic, something I'd at least want to introduce in some way, calling any command as part of the pipeline.

When splitting the main backup component, the job manager, and the other components, as long as they work as a pipeline the components can be written in whatever language.

$ backup perform my_backup_job
Starts this pipeline:
backup_archiver | backup_compressor | backup_encryptor | backup_s3_uploader
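A rough sketch of how a job manager could chain such commands, using Ruby's standard `Open3.pipeline_rw`. The helper name `run_pipeline` is made up for this example, and the `backup_*` executables above do not exist yet; any command that reads stdin and writes stdout would fit.

```ruby
require "open3"

# Runs +commands+ as an OS-level pipeline: the stdout of each command is
# connected to the stdin of the next. +input+ is fed to the first command
# and the output of the last command is returned.
# Each command is an array such as ["gzip", "-c"], in any language.
def run_pipeline(input, *commands)
  output = nil
  Open3.pipeline_rw(*commands) do |first_stdin, last_stdout, wait_threads|
    # Write from a separate thread so that a large input cannot deadlock
    # against a pipe buffer that is full on the reading side.
    writer = Thread.new do
      first_stdin.write(input)
      first_stdin.close
    end
    output = last_stdout.read
    writer.join
    # Every step must exit successfully, otherwise the backup is suspect.
    statuses = wait_threads.map(&:value)
    raise "a pipeline step failed" unless statuses.all?(&:success?)
  end
  output
end

# Hypothetical usage once such executables exist:
#   run_pipeline(data, ["backup_compressor"], ["backup_encryptor"])
```

Because the steps only share stdin and stdout, the manager does not need to know which language each one is written in.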

Anyway, that's my thoughts on it. I've been playing around with setting up a new backup pipeline while learning to write Rust.
Let me know your thoughts on this and how you would like to proceed.

@tombruijn Sounds like a lot of work but it would be a big improvement.

Were you thinking of rewriting back in Rust or just moving some pieces to Rust and using a Rust/Ruby FFI integration?

Thoughts on starting a project call to start laying out some ideas? Might be hard to plan with timezones but sometimes voice calls are easier. We could also start an RFC issue to lay out a roadmap.

@pwelch It's a bit of work, yes. The original Backup author and I have started work on this two times, but we never quite finished it. (He also no longer seems interested in it.)

I've thought about rewriting it in Rust, as a binary, not a Ruby extension. But I don't think Backup, in its entirety, would have to be in Rust.

The advantage of Rust is not having to install the language dependency of Ruby just to get your backups set up. For the user it's (hopefully) a matter of downloading a package and creating a config file. The downside is that we, the project, need to compile, package and distribute [1] the Rust backup for all different kinds of architectures, e.g. 32- and 64-bit, libc linux, musl linux, *BSD, macOS, etc. (I have very good experiences with cross though.)

Keeping some optional parts in Ruby would allow us to leverage the many available gems for functionality not available in Rust packages or not stable enough yet. Allowing other components of Backup to be language agnostic allows users to write "extensions" in their language of choice, although I would recommend keeping official extensions limited to Rust and Ruby to prevent users from having to install more than one language interpreter on their systems.

I can try and write something up in the form of a specification about the new backup setup. There may be some documentation I can reuse from previous attempts at improving the gem.

[1]: Something like packagecloud's open source plan looks interesting though: https://packagecloud.io

@tombruijn Sounds like a good long-term plan. I enjoy the flexibility Ruby offers but having a single binary to deploy is nice.

Either long-term or short-term we could consider something like Omnibus. It can be used to build packages for different platforms. I know Chef and GitLab both use it for packaging.

@pwelch There is an option to use Omnibus to build a single package with all the dependencies for every platform we want to support. It is not that difficult, and I have experience with it.

@tombruijn @salsa-dev Thoughts on spiking an implementation in a new repo (backup/backup-omnibus)?

@pwelch @tombruijn laid out the second important issue:

The other issue is the list of components that are part of backup. Having support for many different databases, storage options and notifiers creates many dependencies on the gem, most of which users won't actually use.

We could solve it by splitting backup in several gems so users can pick and choose which gems they actually need. This allows the main backup gem to move forward quicker, but also creates the problem that other gems will need to play catch up with any core changes made in the main gem.

If backup is installed as an omnibus package the problem of dependencies is eliminated because the deps themselves are embedded and plugins could be installed during configuration into omnibus package directory like /opt/backup/embedded/...

So, I believe Omnibus can solve that problem. Ready to support development in backup-omnibus repository.

I'm interested in an omnibus package. If we can package something that doesn't require the user to think about the installation (e.g. installing Ruby etc) that would be amazing!

If backup is installed as an omnibus package the problem of dependencies is eliminated because the deps themselves are embedded and plugins could be installed during configuration into omnibus package directory like /opt/backup/embedded/...

Not entirely sure how that would work. Can you configure which plugins to install during installation? Such as "mysql" database support, "s3" storage and "mail" notifier? Or would everything still be installed all the time?

My problem with the latter is that it's also a drain on development to keep every supported feature up-to-date all the time. I would rather allow the main package to keep moving forward faster while users can use plugins on older versions as well.

@salsa-dev I've created a backup-omnibus repo and invited you as collaborator.

I've also set up a maintainer team to allow for easier permission management on my end, sorry if any of you got emails for that.

Treasure Data Agent (td-agent, Treasure Data's packaged distribution of fluentd) operates a model similar to what's being proposed.

The overall package is installed into /opt/td-agent, which consists of the 'blessed' versions of packages it needs rather than relying on the administrator/ops/end user to get the correct versions.

For example, /opt/td-agent/embedded/bin contains:

  • Bundler
  • OpenSSL
  • Ruby 2.1.10p492
  • xz

To handle the plugins (for backup, the equivalent would be the notifiers and other commands you listed in your example), Treasure Data extracts these as different gems/projects, which are installed by the end user.

https://docs.fluentd.org/v1.0/articles/plugin-management

The result is that development of new integrations (notifiers, extractors, etc) is decoupled from the core, and each of those plugins has its own maintainers rather than the core project assuming responsibility for each and every plugin.
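One way such decoupling could look on the Ruby side is a small registry in the core that separately installed plugin gems register themselves with. This is only an illustrative sketch: `BackupSketch`, `PluginRegistry` and `LocalStorage` are invented names, and it does not describe how fluentd or Backup actually load plugins.

```ruby
module BackupSketch
  # The core gem only knows about this registry, not about any plugin.
  class PluginRegistry
    def initialize
      # kind (e.g. :storage, :notifier) => { name => class }
      @plugins = Hash.new { |hash, kind| hash[kind] = {} }
    end

    # Called by a plugin gem when it is required.
    def register(kind, name, klass)
      @plugins[kind][name] = klass
    end

    # Called by the core when a job configuration names a plugin.
    def fetch(kind, name)
      @plugins[kind].fetch(name) do
        raise KeyError, "no #{kind} plugin named #{name.inspect}; is its gem installed?"
      end
    end
  end

  REGISTRY = PluginRegistry.new
end

# What a hypothetical plugin gem would contain:
class LocalStorage
  def store(path)
    "stored #{path}"
  end
end

BackupSketch::REGISTRY.register(:storage, :local, LocalStorage)
```

With this shape, a user who never installs the S3 plugin never pulls in its dependencies, and the core can change without shipping every integration.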

I'm not familiar with fluentd. Would it be an alternative to omnibus? or something we can use in the omnibus package?

I would love to see a small example app (of either) so I can see what's involved with it if anyone knows of any.

At first I think we should provide an omnibus-like package as an option before switching to it as a recommended approach when enough installations have proved successful. Interested to see what the installation process will look like.

I'm very familiar with fluentd as I use it at work in our replication pipeline. There is a way to use it in an omnibus package however it would require a bit of lifting.

Using fluentd as a backup streamer would be interesting, but there's going to be a good bit of implementation and tooling required.

Anyone following along about the pipeline changes I suggested, I said I would write something up for the new pipeline and this took way too long. It's finally here:
https://github.com/backup/backup/tree/rfc/pipeline

I've created a new branch that describes the RFC (Request For Change) on the Backup gem's pipeline. Any changes to it will be added as new commits. It also touches on a new configuration structure (and method) along with suggestions for implementation language of the tool. Please let me know your feedback on the Pull Request for this RFC: #925

Whoa, I am not advocating using fluentd in any part of backup, or bundling it with backup whatsoever, or equating it to Omnibus.

I'm simply mentioning that there is a pre-existing example of another open source project that has successfully bundled binaries/dependencies with the core, and decoupled the features (filters, parsers, notifiers, etc) in a way that they can be installed as separate plugins which each have their own individual maintainer.

My reply was largely in response to:

Not entirely sure how that would work. Can you configure which plugins to install during installation? Such as "mysql" database support, "s3" storage and "mail" notifier? Or would everything still be installed all the time?

My problem with the latter is that it's also a drain on development to keep every supported feature up-to-date all the time. I would rather allow the main package to keep moving forward faster while users can use plugins on older versions as well.

@oogali Gotcha, thanks for the clarification. We could introduce fluentd but I don't think we're doing any new integrations. Just keeping things maintained for now.

To add some more color (which I should have researched and done before my comment), Fluentd is using Omnibus:

https://github.com/treasure-data/omnibus-td-agent

So in my opinion, @salsa-dev is absolutely on the right track.

(Also, I think this issue should be closed as it's diverged from the original topic of looking for maintainers, into going down the path of specific research/RFCs)

Well I wouldn't close the issue as we're always looking for more people to help out.

But yes, let's discuss any particular issues in their specific issues. Feel free to mention me on any of those if you need feedback. I'm also available on Gitter: https://gitter.im/backup/backup-dev

@tombruijn can I get added to the omnibus repo as well? I can try to give @salsa-dev a hand if they need it.

@pwelch sorry for my lack of response. I was on holiday for a bit. You should already have write permission on the repo.

@tombruijn given the activity here - is it possible to take the maintenance mode notice out of the readme? Seems like things are thriving and the notice drives folks away from a very solid, full-featured tool.

flxwu commented

Dear @tombruijn and Backup Team,

in case you are still searching for maintainers, please consider adding the project on maintainerswanted.com 😄 It's a site that I built, inspired by a tweet by Sara Vieira, and I am about to launch it in the next few days. It'd be really cool to already have some projects onboard by then!

Thanks!

Best,
Felix

Any possibility of involvement? The project seems to have stalled for a while, and I'd like to get involved if possible.

It seems neither @mrrooijen nor @tombruijn has time to handle maintenance anymore, or to get involved in the discussion here.

I've decided to fork the project to a different organization and started to collect some of the waiting PRs here. Feel free to ping on backupii#1 if you want to get involved

GoBackup is a fullstack backup tool designed for web servers, similar to backup.

https://github.com/huacnlee/gobackup

I'm not willing to let this project go away, we use it all over and it is too damn awesome to let it go away. This is a long thread, what is the current status of need for maintainers?

@jmcdonagh I think the need is still there. It might be worth making a chat in the Project Gitter to formalize things and what to do next?

@pwelch i don't know what "make a chat" entails. is there some kind of GitHub slack I am unaware of?

@jmcdonagh Yep. Gitter is a chat program for GitHub repos. Go to the README page here and click on the Gitter badge. It might be a good place to start a chat about future tasks or a new issue maybe?

Backup is still looking for people to maintain it if anyone is up for it. I don't have the motivation or need for it, as I don't use Backup myself anymore. I've given people maintainer access before, but I don't think they're active anymore either? (I'm not up-to-date with the status of the project.)

@elthariel if you want to help out in this project (since I see you've already forked it) and merge https://github.com/backupii/backupii into it (for example) I'm also okay with that.

As before, I'm happy to give people access to pick up issues and work on the project. I am a little reluctant to hand over the project entirely as I am not even the person who owns it. That responsibility falls on @mrrooijen as the original author.

I'm open to adding new maintainers to the project.

I haven't used or maintained the project in a long time, so I'm completely out of the loop.

We could have a discussion about how to move forward.

@tombruijn I'd love to merge back backupII into backup, although I haven't been doing much interesting work there yet (merged a few nice PRs, did a few chores and a lot of useless forking related grunt work). Tbh, I was hoping for this to happen.

@mrrooijen, I'm sure we can find similar stories on other open source projects and there's probably a way to move toward a community-driven governance that can make you feel comfortable. One example could be to give collaborator access to anyone who successfully submits a large enough PR (to be defined, but something that is more than a typo fix), and require 2/3 approvals from collaborators to get a merge on master. As for the release cutting part, there's probably a nice way to automate this as well to make it community driven.

This is just a suggestion, ofc, we can have a look around to see what has been working for other projects.

Btw, if you don't want to spend a lot of time discussing this over GitHub, we can set up a phone call, as it seems we're all in the same timezone.

Sure, having it be community-driven in the way you describe sounds good to me. My primary concern would be to prevent malicious actors from getting access in some way. E.g. limit the number of people that have access to RubyGems.org, and make it mandatory to enable 2FA for those who do. The maintainers will filter out any potential malicious code that could be submitted through PRs.

As long as the above is enforced I don't really mind in which direction the community takes the project.
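For the 2FA part, RubyGems supports a `rubygems_mfa_required` metadata key that makes RubyGems.org require multi-factor authentication from owners for privileged actions such as pushing a release. A minimal sketch follows; the gem name, version, summary and author are placeholders and not the real Backup gemspec.

```ruby
require "rubygems"

# Illustrative gemspec only; every value except the metadata key is a
# placeholder chosen for this example.
spec = Gem::Specification.new do |s|
  s.name    = "backup-example"
  s.version = "0.0.1"
  s.summary = "Placeholder gemspec showing the MFA metadata key"
  s.authors = ["Example Maintainer"]
  s.files   = []

  # Ask RubyGems.org to require MFA for privileged operations on this gem.
  s.metadata["rubygems_mfa_required"] = "true"
end
```

This does not limit who is an owner; that still has to be managed on RubyGems.org itself.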

Regarding 2FA, I do have it enabled :) I really understand your concern about malicious code, I share it. Backup is a critical process.

If you want, I could draft some basic governance policy (I'll probably try to borrow something from another project) and submit it as markdown for the root of the project. This way we can agree about the foundation on which we'll move forward

Regarding 2FA, I do have it enabled :) I really understand your concern about malicious code, I share it. Backup is a critical process.

Yeah, having backups hijacked isn't exactly pleasant.

If you want, I could draft some basic governance policy (I'll probably try to borrow something from another project) and submit it as markdown for the root of the project. This way we can agree about the foundation on which we'll move forward

Sure, if you're up for it go ahead and we'll take it from there.

Although I'm no fan of Node.js, I think their model is a good starting point. It needs to be strengthened a bit on a few points.

Here's their doc: https://github.com/nodejs/TSC/blob/master/BasePolicies/CONTRIBUTING.md

What would be great would be to have @mrrooijen and @tombruijn in the TC in the unlikely event of a dispute, and have the TC handle release cutting

@AdibMurshed, feel free to recommend any other documents that you feel have good ideas about this type of governance model

@mrrooijen Here's the draft, based off node's community standards, with a few additions related to your suggestions
@AdibMurshed WDYT ?

The contribution guide looks fine to me. You can always amend it in the future if needed, but it seems like a good foundation. 👍

I'm not a big fan of the Code of Conduct. The fallout in the Opal community a few years back left me unimpressed with the author and associates. That being said, if the Backup community wants a/the Code of Conduct then I'm in no position to argue against it considering the lack of involvement with the project.

@mrrooijen. I was talking about the code of conduct with a friend of mine and he pointed me to the relevant Opal thread. I was pretty disappointed at the authors and regretted that I suggested using it here, so I'll happily remove it. I'm part of the LGBT community myself, and had very mixed feelings about this (I immediately thought of Thinkpol).

I'll remove it in favor of something simpler, like the traditional MINASWAN ("Matz is nice and so we are nice")

edit: pr updated

@mrrooijen. I was talking about the code of conduct with a friend of mine and he pointed me to the relevant Opal thread. I was pretty disappointed at the authors and regretted that I suggested using it here, so I'll happily remove it. I'm part of the LGBT community myself, and had very mixed feelings about this (I immediately thought of Thinkpol).

Yep. Someone actually told him that he had the "wrong opinion", so there you go.

I'll remove it in favor of something simpler, like the traditional MINASWAN ("Matz is nice and so we are nice")

Sure, that seems reasonable. Short and to the point.

@mrrooijen I've updated the PR and rebased. In your opinion, what should be the next steps to move forward on this ? :)

@AdibMurshed I personally don't think backup is a big enough project to justify/require the creation of a legal entity, which usually exists to handle issues related to scale, funding and/or intellectual property, and which imposes a decent amount of administrative cost and burden.

As for the Fedora governance model, I assume you're talking about the global governance summarized here ? If this is the case, I also don't think the backup community is active/big enough (yet) to have such a model work out properly (requires a legal entity, lots of voting, etc.)

But maybe you are talking about certain specific mechanics that you think would be beneficial and that would be a good inspiration ?

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Hey, are you still looking for maintainers?

@deanpcmad Yes, we are. Feel free to have a look at the CONTRIBUTING.md file about how to become a maintainer

Hi! I just saw your entry on the Maintainers Wanted list.

I just published a similar project that's now looking for postings. I'd greatly appreciate if you find some time to add your project!

https://seeking-maintainers.net

I would also like to become a maintainer of this project. I'm interested in modernizing this project to run on current versions of ruby and exploring the possibility of adding a GIT repository as a backup storage option. To get there, I plan to focus my initial efforts on the following areas that I see needing the most attention:

  1. Modernizing the vagrant/Vagrantfile to use a basic, fully reproducible Configuration as Code approach to setting up the box rather than depending upon a mystery box with an unknown configuration (PR ready to be submitted).
  2. Documenting how to run backup using vagrant.
  3. Replace the image used in the Dockerfile with one that exists and is defined in Infrastructure as Code, using either Ansible or Chef depending on other contributors' preferences (I will use Ansible if no preference is given).
  4. Review the GitHub actions and see if improvements similar to item 3 are needed and, if they are, make them.

I'm not stepping up to be a maintainer (again) due to time constraints, but I'll be happy to help with some things, like Dockerfile, fixing tests (currently failing on master branch, at least with Ruby 3.1.4) and other incidental things to keep it running.