tildeclub/tilde.club

BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS

Closed this issue · 40 comments

BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS
BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS
BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS
BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS
BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS
BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS BACKUPS

how can we help?

What do you think of a cron-job with rsync?

to where...

Thank you for asking! I don't know the easiest, shortest path to safe, testable, periodic backups for tilde.club or a unix box in general. Right now I do a full backup of /home /etc /var and mirror that to my laptop when I remember to do so. I can also script that to do it every night. But it's a meh solution.

People have recommended tarsnap and it seems great and worth paying for. Good use of donations.

Another option is just imaging a full disk nightly.

If someone could say "this seems to be the right way forward" and then update this github issue with an implementation plan that would be ideal.

Good things are: Automatic, cheap, periodic.

Better things are: Versioned, per-user preferences.

I think that rsyncing /home at 00:00Z would be a good way to do this: rsync is free, and it can be set up to keep versioned copies. Cron would take care of the automatic and periodic criteria. There’s no way to give users control over preferences, but here’s what I’d do if I were a user who wanted more frequent backups.


  1. Set up an sshfs mount. I’ve already done this with my totallynuclear.club account.
  2. Set up a cron or anacron job (again with rsync, some other backup system, or vanilla Git) to back things up whenever desired.

It’s not a perfect solution, but it’s pretty good.
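A minimal sketch of the cron-plus-rsync proposal above, in Python. The destination `backup@backup.example.org:/srv/tilde-backup/home/` and the script path are made-up placeholders, and the crontab line assumes the server clock runs in UTC so that `0 0 * * *` means 00:00Z. This only mirrors the current state of /home.

```python
import shlex
import subprocess

# Hypothetical destination; nothing in the thread names a real backup host.
DEST = "backup@backup.example.org:/srv/tilde-backup/home/"


def build_rsync_command(source="/home/", dest=DEST):
    """Return the rsync argument list for a one-way mirror of `source`."""
    return [
        "rsync",
        "--archive",  # keep permissions, owners, timestamps, symlinks
        "--delete",   # drop files from the mirror once they vanish at the source
        source,
        dest,
    ]


def crontab_line(script_path="/usr/local/bin/backup_home.py"):
    """Crontab entry that runs the script at 00:00 every day (server time)."""
    return "0 0 * * * " + shlex.quote(script_path)


def run_backup():
    """Run the mirror; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(), check=True)


if __name__ == "__main__":
    # Only print what would be run, so trying the sketch has no side effects.
    print(crontab_line())
    print(" ".join(build_rsync_command()))
```

Calling `run_backup()` from the script named in the crontab line would perform the actual copy.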

What kind of system do we want to end up with?

  • How long will we keep the backups? 1 Week? 1 Month? Do we want to keep some things longer than other things? e.g. /home gets kept a month and /var gets kept 1 week
  • Do we want self service, meaning the user can grab their own files out of a backup? (It's nice, but perhaps a bit much for a first system unless someone's done the work)
  • Other stuff?
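The retention question can be made concrete with a small pruning rule. The sketch below assumes dated snapshots and uses the example figures from the first bullet (a month for /home, a week for /var); the function name, the data layout, and treating "a month" as 30 days are all assumptions for illustration.

```python
from datetime import date, timedelta

# Retention periods from the example in the list above;
# "a month" is assumed to mean 30 days.
RETENTION_DAYS = {"/home": 30, "/var": 7}


def expired_snapshots(snapshots, today, retention=RETENTION_DAYS):
    """Return the (path, date) snapshots that are older than their retention.

    `snapshots` is an iterable of (path, snapshot_date) pairs.
    Paths without a configured retention are never reported as expired.
    """
    expired = []
    for path, taken in snapshots:
        days = retention.get(path)
        if days is not None and today - taken > timedelta(days=days):
            expired.append((path, taken))
    return expired


if __name__ == "__main__":
    today = date(2014, 10, 13)
    snapshots = [
        ("/home", date(2014, 9, 1)),   # 42 days old: past the 30-day limit
        ("/home", date(2014, 10, 1)),  # 12 days old: kept
        ("/var", date(2014, 10, 1)),   # 12 days old: past the 7-day limit
        ("/var", date(2014, 10, 10)),  # 3 days old: kept
    ]
    for path, taken in expired_snapshots(snapshots, today):
        print("delete", path, taken.isoformat())
```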

Tarsnap seems like a way to handle it, especially from a get-something-that-works-more-or-less-OK-happening-right-now standpoint (backing up to a laptop is a bit scary), but I think it's worth investigating something that we can show people how to do on their own.

I would also look at rsyncing it somewhere to an S3 disk and archiving to Glacier. It's cheap and local (to the server) and designed for this sort of stuff.

If we do go with rsync, I would say set it up to do versioning. Here's an example that I picked off the top of the googlepile but there are many others that may be better https://netfuture.ch/2013/08/simple-versioned-timemachine-like-backup-using-rsync/
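One common way to get versioning out of rsync is a dated directory per run, with `--link-dest` pointing at the previous run so that unchanged files are hard-linked rather than copied again. The sketch below only builds the command; the `/backup/home` layout and the date-based directory names are assumptions, and the linked article may well do it differently.

```python
from datetime import date


def snapshot_command(source, backup_root, today, previous=None):
    """Build an rsync command that writes a dated snapshot of `source`.

    `backup_root` should be an absolute path. If `previous` (the name of an
    earlier snapshot directory) is given, files unchanged since that snapshot
    are hard-linked into the new one instead of being copied again.
    """
    cmd = ["rsync", "--archive"]
    if previous is not None:
        cmd.append(f"--link-dest={backup_root}/{previous}")
    cmd += [source, f"{backup_root}/{today.isoformat()}/"]
    return cmd


def latest_snapshot(names):
    """Pick the most recent snapshot name; ISO dates sort correctly as strings."""
    return max(names) if names else None


if __name__ == "__main__":
    existing = ["2014-10-11", "2014-10-12"]  # hypothetical earlier runs
    cmd = snapshot_command(
        "/home/", "/backup/home", date(2014, 10, 13), latest_snapshot(existing)
    )
    print(" ".join(cmd))
```

Each dated directory then looks like a full copy, while only changed files take up new space.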

I'm not a fan of backups based on sshfs (or any mounted fs, for that matter), as they can be a bit brittle.

A bit tangential to this, but I think that on the education front we ought to be reminding people that they should be actively involved in making their own backups, and not solely relying on whatever backup solution we come up with here. The sysops absolutely need a backup solution in place, and individuals need to maintain their own separate backups.

I like @michaelcoyote's Glacier suggestion, but if we do backups to the cloud, it might behoove us to use a different provider than we use for hosting the service.

@pfhawkins Yeah, that's an excellent point but that may disqualify Tarsnap as they also use AWS. See http://www.tarsnap.com/about.html

I also concur that user education about how to protect your data is a good thing.

Versioning is what makes an rsync-based setup so attractive. I don’t know how much storage we have, but incremental, versioned copies take up much less space than a sequence of full images.

I wasn’t recommending sshfs as a primary backup method; it’s just a way of getting unscheduled backups as a user.

As for physical redundancy, maybe someone could run a mirror (or just hold the rsync files) on a Linode VPS?

We are using Amazon for storage in some way, right? There should be a way to capture a snapshot of the whole volume in EBS.

I'm making a copy of my home directory into a private github repo, checking it in from time to time, so that I don't clobber my index.html file accidentally without a backup.

hey @ke7ofi I'm KD8OQG. see http://tilde.club/~emv/hamradio.club

and thanks for the sshfs tutorial @ke7ofi that's mighty handy

@vielmetti I’m not active any more, but callsigns are never taken as usernames.

We are using Amazon for storage in some way, right? There should be a way to capture a snapshot of the whole volume in EBS.

@vielmetti This is also true, and it might be useful to use Glacier as longer-term storage for this. OTOH, it might be hard to split out things like /home, which could get huge, and /etc, which should stay small. At this point I'm betting it's not a problem.

I should point out that sshfs is probably a good solution for users to back up their own stuff and might be a method we want to encourage. The same with git repos.

I'm also thinking we should figure out if what we really want is disaster recovery and not actually backup. The point being that we are telling people "if the whole thing goes kaboom, we can pull the entire thing back from last Wednesday's or Saturday's backup", not "we can restore your my_diary file from last night".

Also, I've not forgotten @harperreed's most excellent point about different infrastructure. We should probably be thinking about that too. At the very least, any target should be in a different geo region.

Good point @michaelcoyote.

  • disaster recovery feels like it is respectful of basic social norms and
    expectations
  • backups are on the users
  • nightly is probably feasible whatever we do
  • like anything, communicating this in as transparent a way as possible is
    the difference between "I understand what this place is and set my
    expectations accordingly" and "WHY I AM LEAVING TILDE.CLUB"

Paul Ford // (646) 369-7128 // @ftrain

@ftrain so the goal is to keep the community functional in case of catastrophic failure rather than to back it up incrementally?

Incremental disaster recovery backups are good to have. If you only have one recent backup, if that backup is infected w/ a virus or some such, you might wish you had an earlier backup on hand that wasn't so infected.

I am not seeing an answer, but why wouldn't we just do EBS snapshotting? This way, when failure happens, we can just roll to a previous good snapshot and boot the box. It will boot - most data is there.

The users should back up their own stuff (for many other reasons).

I vote this path because it would require the least amount of work for all of us.

Least work, flexible, common knowledge, robust = win.
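As a rough sketch of the snapshot path described above, the function below asks EC2 for a snapshot of one EBS volume. It takes the client as a parameter, on the assumption that something like `boto3.client("ec2")` is passed in; the description text is made up, and none of this is confirmed as how tilde.club is actually set up.

```python
from datetime import datetime, timezone


def snapshot_volume(ec2_client, volume_id, now=None):
    """Request an EBS snapshot of `volume_id` and return its snapshot ID.

    `ec2_client` is expected to behave like boto3's EC2 client, whose
    create_snapshot call accepts VolumeId and Description keyword arguments
    and returns a dict containing "SnapshotId".
    """
    now = now or datetime.now(timezone.utc)
    description = "nightly snapshot " + now.strftime("%Y-%m-%d")
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]
```

With boto3 installed and credentials configured, a nightly cron job could call `snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0")`, where the volume ID is a placeholder.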

I agree with @harperreed: let's do a snapshot and let the user handle the way they wanna save their home directory.

@pfhawkins maybe rsync with a month of previous stuff and then snapshots for the latest?

@harperreed's reasoning is sound.

@ke7ofi we could start with the EBS snapshots, and then work on an rsync-based thing later, perhaps, if we still see a need.

Inclined toward @harperreed's solution if I can automate it; sounds like I can.
Will leave this open for a few days for other people to comment then close.

That sounds good. It’s closed by my standards.

why wouldn't we just do EBS snapshotting? This way, when failure happens, we can just roll to a previous good snapshot and boot the box. It will boot - most data is there.

I think this is overall a pretty solid solution. I'm ok with revisiting this in the future, but done is better than perfect.

I say drop a line in the FAQ stating "we can only recover the whole server" and "there are a number of strategies for securing your own files, please choose and use one". It would also be good to add these strategies to the wiki.

I totally agree on the principle: there are duties towards users ("we can recover the whole server"), but users also have duties toward their own interests ("you should back up"). After all, up until now it's a free service, and I don't think the aim of ~club is to replace the internet or all webhosts (however funny that may be to discuss).

I would just use a more straightforward phrase than "there are a number of strategies for securing your own files, please choose and use one", like "For the love of all that you have most precious to your heart, BACK UP YOUR FILES!!!", or something explicit like that.

Perhaps forgotten today, but there used to be that great disclaimer of James Turinsky's: "This is FREE software, and..." https://gist.github.com/citizenk/2397031

@citizenk

up until now it's a free service

Has something important changed?

Nope. There are no plans to charge at the door. I think he was just remarking that these are free shell accounts, and did not mean to indicate that that would change.

Hi everyone. Saying it should be in the FAQ is great. Writing it up and
issuing a pull request is greatest.

First draft of a "user backups" document is in the wiki.

https://github.com/tildeclub/tilde.club/wiki/User-backups

Edits welcomed.

Yes, what pfhawkins said: sorry if I was unclear... I meant that the personal responsibility of each user should also be engaged, even more so due to the nature of tilde.club

@vielmetti I added links to my and ~jeffbonhag’s sshfs tutorials.

Thanks @ke7ofi ! Tutorials are good. I just successfully set up sshfs on my Mac using "brew install sshfs" and following directions.

@vielmetti you should thank @bonhag rather than me.

thanks @bonhag !

Edward Vielmetti +1 734 330 2465
edward.vielmetti@gmail.com

Hi everyone. Saying it should be in the FAQ is great. Writing it up and issuing a pull request is greatest.

Yeah, I know. I got to clear my backlog first :(

I will look around and see about making the recommendations on the wiki a bit more How To.

Hey, you are very welcome! 😄

Updated this git wiki to include a procedure for user "backup" (checkin and push) of their tilde.club ~/public_html/ to a GitHub repo.
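A hedged sketch of what such a check-in-and-push could look like when scripted. It assumes `~/public_html` is already a Git repository with a remote named `origin`; the commit message format is made up, and this is not necessarily the wiki's exact procedure.

```python
import subprocess
from datetime import date
from pathlib import Path


def backup_commands(today):
    """Return the git commands for one check-in-and-push, in order."""
    message = "backup " + today.isoformat()
    return [
        ["git", "add", "--all"],            # stage new, changed, and deleted files
        ["git", "commit", "-m", message],   # record them as one commit
        ["git", "push", "origin", "HEAD"],  # push the current branch to origin
    ]


def run_backup(repo=None):
    """Run the commands inside `repo`, stopping at the first failure.

    `repo` defaults to ~/public_html. Note that `git commit` exits non-zero
    when there is nothing to commit, so a run with no changes raises
    CalledProcessError at that step and does not push.
    """
    repo = Path(repo) if repo is not None else Path.home() / "public_html"
    for cmd in backup_commands(date.today()):
        subprocess.run(cmd, cwd=repo, check=True)
```

Running `run_backup()` from a user's own cron job would give the "from time to time" check-in without having to remember it.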

I have started making periodic (every day or two) disk images. Closing.