INTRODUCTION
------------

This is a small Python application that attempts to automate the
deployment of a large number of (temporary) virtual machines. Its main
emphasis is setting up a large number of (OpenVPN) client computers for
performance testing purposes. However, the codebase is general-purpose
enough to be used for anything requiring a large number of temporary
VMs.

Currently the Amazon EC2 provider is supported. Support for a "local"
provider has been implemented, but is broken at the moment due to
architectural changes; fixing it should be trivial.

To fetch the latest source code, go to

  <https://github.com/mattock/perftest>

Enhancement requests, bug reports, patches and such can be handled from
there as well.

ARCHITECTURE
------------

There are currently several components:

1) Provider/poller/queuer thread(s)

This type of thread may create the temporary VMs. It may also poll for
VMs that have reached a state where they can be configured. Whenever a
VM becomes ready, it is placed into a queue, from which the configurer
threads can pick it up. There is usually only one of these threads.

2) Configurer threads

These threads take the IPs of active VMs from the queue filled by the
provider thread. They then launch one Fabric process per VM, and Fabric
takes care of configuring that VM. When it exits, the configurer thread
marks the IP as done and removes it from the queue. There are usually
many of these threads, as old versions of Fabric (pre-1.3) did not
support multithreaded/multiprocess operation. A minimal sketch of this
producer/consumer pattern follows this component list.

3) Main program

The main program (start.py) is responsible for reading the
configuration files, parsing command-line arguments and launching the
appropriate number of threads.

4) Fabfile

This program utilizes a standard fabfile understood by Fabric's
command-line tool "fab". Use of a recent Fabric version (1.2.2+) is
strongly recommended.

5) Configuration files

Some aspects of this tool are controlled by configuration files in the
"config" directory:

- ec2.conf: various (sensitive) Amazon EC2-specific settings
- cron.conf: cron-specific settings, used to generate a valid crontab
- tests.conf: test control data (not the actual test scripts)
- ssh.conf: ssh configuration details (saves typing)

6) Resources

The "resources" directory contains all the files Fabric uploads to the
clients. These files can be static (e.g. OpenVPN certificates) or
dynamically generated (e.g. the crontab file used for timing the
tests).

7) Analysis scripts

The "analyze.sh" script drives the awk scripts which, at the moment,
process iperf and dstat logs and generate the required output
(e.g. Trac/MediaWiki tables, CSV).
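
The interaction between the provider thread and the configurer threads
boils down to a plain producer/consumer queue. The snippet below is a
minimal sketch of that pattern only, not the actual start.py code:
poll_ready_instances() is a hypothetical stand-in for the real
Boto-based poller, "deploy" is a hypothetical fab task name, and
Python 3's queue module is used for brevity even though the tool itself
targets the Python 2-era libraries listed under DEPENDENCIES:

    import queue
    import subprocess
    import threading

    READY = queue.Queue()
    SENTINEL = None

    def poll_ready_instances():
        # Hypothetical stand-in for the provider/poller logic, which in
        # the real tool creates EC2 instances and polls them until they
        # can be configured.
        return ["203.0.113.%d" % i for i in range(1, 5)]

    def provider():
        # Producer: queue the IP of every VM that becomes ready, then
        # signal the configurers that no more VMs are coming.
        for ip in poll_ready_instances():
            READY.put(ip)
        READY.put(SENTINEL)

    def configurer():
        # Consumer: take one IP at a time off the queue and run a
        # separate "fab" process against it.
        while True:
            ip = READY.get()
            if ip is SENTINEL:
                READY.put(SENTINEL)  # let the other configurers stop too
                break
            subprocess.call(["fab", "-H", ip, "deploy"])

    threads = [threading.Thread(target=provider)]
    threads += [threading.Thread(target=configurer) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Running one external "fab" process per VM is what makes the many
configurer threads necessary; with Fabric 1.3+ the same fan-out could
presumably be handled by Fabric's own multiprocessing support (see
TODO).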

TEST PROCESS
------------

At the moment the tests require quite a lot of manual work, despite all
the automation that is in place:

1) Launch and configure a server instance (start.py or manually)
2) Launch and configure a number of client instances (start.py)
3) Launch "iperf -s" on the server
4) Launch "dstat.sh" on the server
5) Wait for the tests to run
6) Fetch the log files from the server and the clients (start.py/scp)
7) Move the log files to a new subdirectory in logs/
8) Open dstat.csv in a spreadsheet, creating a new spreadsheet for each
   test segment, so that each spreadsheet contains only those time
   periods where the server was under load (*). Make sure that the
   language for all cells is set to English (USA), so that "." is used
   as the decimal separator.
9) Run analyze.sh with the correct arguments
10) Paste the analysis files (logs/logname/analysis-*) to wherever you
    wish

(*) This is a crude and time-consuming method, but detecting the test
case borders algorithmically is error-prone and easily skews the
results.

DEPENDENCIES
------------

This application depends on the following external libraries and tools:

Boto (http://code.google.com/p/boto)

  Python interface to Amazon Web Services. Not necessary if you are not
  using the Amazon EC2 provider/poller/queuer. Tested with version 1.9
  on Ubuntu 11.04.

Fabric (http://fabfile.org)

  Fabric is a Python (2.5 or higher) library and command-line tool for
  streamlining the use of SSH for application deployment or systems
  administration tasks. Tested with version 0.9.3 on Ubuntu 11.04.

LIMITATIONS
-----------

At the moment the configurer threads choke on interactive input
requests (as standard Fabric would). Also, Fabric's output is even more
confusing than normal, given that multiple threads write to the same
terminal at the same time. There is no clean way to fix this, except by
removing non-fatal logging and/or writing to logfiles instead of
stdout.

USAGE
-----

Edit the configuration files in the "config" directory to your liking.

The "tests.conf" file requires some discussion. Each section header
must be accompanied by a test script with the same name in "resources".
For example, if a test is called "test1", you must create a test script
called "resources/test1" or Fabric will bail out. The "time" variable
defines how many minutes after the launch of start.py the test will be
triggered (using cron). The "server" variable determines the server
against which the test is performed. See the example tests.conf at the
end of this file.

You almost certainly want to edit fabfile.py, which drives the client
deployment, as well as write custom test scripts. In addition, you need
to use the appropriate command-line flags; to see them, type

  $ python start.py -h

TODO
----

- Reimplement a full "local" provider
- Modify for new Fabric versions (1.3+) with multiprocessing support
- Make log analysis less painful
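
EXAMPLE
-------

For reference, here is a hypothetical tests.conf entry illustrating the
variables described under USAGE. The values are placeholders, not
defaults shipped with the tool; the section name "test1" must be
matched by a test script at "resources/test1":

    [test1]
    # minutes after the launch of start.py before cron triggers the test
    time = 10
    # the server against which the test is performed
    server = 192.0.2.10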