================================================================================
ec2-backup
================================================================================

Authors

Nicholas Bevacqua, Alexander Beal, and Nicholas Smith

Implementation

We used Boto, the AWS SDK for Python, to implement ec2-backup. The goal was a
program that runs on linux-lab.cs.stevens.edu and backs up a directory via dd
or rsync.

First, we import our service modules to use their functions. The end user must
supply AWS access keys so that we can connect to their AWS environment. Because
AWS requires that an EBS volume be in the same Availability Zone (AZ) as the
EC2 instance it is attached to, and because AMIs are tied to a region, we
specify an AMI ID for every region we can. If the user tries to back up to an
unknown region, for instance if AWS opens a new region, ec2-backup fails
gracefully and informs the user that the region is unsupported.

The program produces no output except the volume ID. Where appropriate, logging
messages are labeled to indicate their severity.

Challenges

>> A keypair and security group must be present

Because boto makes it simple to interact with the various AWS APIs, we chose
the no-nonsense solution: the program creates a security group and a keypair
for the user. The keypair is temporary and is deleted when the program ends,
but we decided the security group could stick around. Because we use a new
auto-generated keypair each time, the user should NOT specify an ssh key in
EC2_BACKUP_FLAGS_SSH, or it could conflict with the keypair we try to use.

>> If a volume is specified, the instance we launch must be in the same region
>> and availability zone or else we won't be able to mount it.

This means that we first have to figure out which region the volume is in, then
get its AZ and launch our instance there. If no volume was specified, we assume
some reasonable defaults.

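The lookup described above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code in ec2-backup itself: the
function names, the AMI table contents (placeholder IDs, not real images), and
the default zone are all hypothetical. The sketch takes a mapping of region
name to connection object, so it runs without AWS credentials; it only assumes
each connection offers get_all_volumes(volume_ids=[...]) and that the returned
volumes carry a zone attribute, as boto 2's EC2Connection and Volume do.

```python
# Placeholder AMI IDs (assumption): these are NOT real images. Substitute the
# AMI you actually want to boot in each supported region.
AMI_BY_REGION = {
    "us-east-1": "ami-00000000",
    "us-west-2": "ami-11111111",
}


class UnsupportedRegionError(Exception):
    """Raised when no AMI is defined for the requested region."""


def region_of_zone(zone):
    """Derive the region from a standard AZ name, e.g. us-east-1a -> us-east-1."""
    if len(zone) < 2 or not zone[-1].isalpha() or not zone[-2].isdigit():
        raise ValueError("unexpected availability zone name: %r" % (zone,))
    return zone[:-1]


def find_volume_zone(volume_id, connections, not_found_errors=(Exception,)):
    """Return the AZ that holds volume_id, or None if no region has it.

    connections maps a region name to an object with
    get_all_volumes(volume_ids=[...]). With boto 2, not_found_errors would
    more precisely be (boto.exception.EC2ResponseError,).
    """
    for name in sorted(connections):
        try:
            volumes = connections[name].get_all_volumes(volume_ids=[volume_id])
        except not_found_errors:
            continue  # the volume does not live in this region; try the next
        if volumes:
            return volumes[0].zone
    return None


def plan_launch(volume_id, connections, default_zone="us-east-1a"):
    """Pick (region, zone, ami) for the backup instance.

    default_zone is an assumed default used only when no volume is given.
    """
    zone = default_zone
    if volume_id is not None:
        zone = find_volume_zone(volume_id, connections)
        if zone is None:
            raise LookupError("volume %s not found in any region" % volume_id)
    region = region_of_zone(zone)
    if region not in AMI_BY_REGION:
        raise UnsupportedRegionError(
            "region %s is unsupported: no AMI defined" % region)
    return region, zone, AMI_BY_REGION[region]
```

With boto 2 the connections mapping could plausibly be built by calling
boto.ec2.connect_to_region(name) for each supported region name. Deriving the
region by dropping the trailing zone letter holds for standard AZ names only.
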
>> Waiting for the instance to be fully ready before connecting via ssh can
>> take a long time

This is more an inconvenience than a problem. We must wait until the instance's
status checks are complete before we connect to it, because a connection
attempt made too early can fail. The result is a long (~60 sec) delay before we
establish a connection.

We do not support the EC2_BACKUP_FLAGS_AWS environment variable. Since the
script creates keys and security groups automatically, we did not see the need
to keep this environment variable around. If the user wishes to use a specific
instance type, they can use the AWS_BACKUP_INSTANCE_TYPE flag. It accepts the
values t1.micro through m1.large and throws an error if the type is either
invalid or outside of that range.

Scalability

The code is not organized nicely: everything is stuck into one file, so
extending it with extra functionality will be error-prone. As for use, the
backup functionality is not very smart, so it would not be an ideal consumer
solution, but it does its job. Also, because AMIs are linked to a given region,
we were forced to define an AMI to use for each region, so a region is
unsupported if we do not have an AMI defined for it.

Issues

The script was tested with boto v2.2.2, which is now over 2 years old, so it is
possible that changes in boto could break our code. Additionally, our size
calculations are a bit cautious. We calculate the total size of the directory
being backed up, convert that to gigabytes, floor it, and then add 2 gigabytes.

Testing

Testing covered all of the base cases. We backed up a local directory on the
linux lab with both rsync and dd without any volume specified. We then made
changes to that directory and backed it up again to the same volumes created in
the first attempt. Finally, a new instance was created with both new volumes
attached to it.
We mounted the rsync volume and checked that the changes were accurate. The
same was done for the tarball on the block device. Additional testing was done
with invalid backup methods, invalid directories, and volumes that were too
small.

Linting

To be sure we didn't miss anything, especially syntax errors, undefined
variables, or anything else that could cause the script to fail, we checked the
code with pylint <https://pypi.python.org/pypi/pylint> and corrected warnings
and errors as necessary. Unfortunately, there are still many warnings about
very long lines (>80 chars). This is a problem, but fixing it would require far
too much refactoring to be worthwhile.

Distribution of Effort

Alexander Beal took the first pass at the program. Nick Smith and Nick Bevacqua
took turns modifying the code. Nick S. implemented the argument parsing. Nick
B. updated rsync and dd. Nick S. worked on EBS issues, including availability
zone discrepancies. Nick B. led the testing effort and identified many
problems, in addition to solving many of them. All made last-minute changes.