This README contains information about both the BES timing tests and the memory profiling tests. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Timing Code Utilizing a stopwatch/timer class, with a use pattern that guarantees that the stop method for the timer will be called, we instrumented the C++ components of Hyrax to better understand where time was being spent in the server. When the debugging is activated for the BES with either the keywords "timing" or "all" the BES will produces timing output that shows the order of execution and the elapsed times of the calling method. More about the stop watch class can be found here: http://docs.opendap.org/index.php/BES_Timing We have instrumented the BES in the following locations: - In the BES dispatch code that processes client commands. We know the total time it takes to handle each XML command in a BES XML request document sent from the client. - In the hdf4_handler, hdf5_handler, ncml_module, netcdf_handler, ugrid_functions, and w10n_handler response production methods. We know the time taken taken to produce the DDS, DODS, DDS, and DMR responses. - Additional timing instrumentation may be added to as we come to understand more clearly where time is being spent. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Memory profiling To see how much memory the BES was using for a particular request, we used valgrind and its 'massif' tool. This provides both detailed and overview information about the heap allocation patterns made by the server. The script 'heap_profiler' can be used to simplify replicating the same set of options when calling valgrind. We call heap_profler with a bescmd file as its argument. The result is a 'massif' output file with the same basename as that bescmd file. Then use ms_print to look at the data in the *.massif file. Examples: The command ./heap_profiler small_aggregations/TRMM_3A11_hdf4_aggregation_small.ncml.dods-bescmd.xml will run the given bescmd file and write three files: small_aggregations/TRMM_3A11_hdf4_aggregation_small.ncml.dods-bescmd.xml.massif small_aggregations/TRMM_3A11_hdf4_aggregation_small.ncml.dods-bescmd.xml.massif.log small_aggregations/TRMM_3A11_hdf4_aggregation_small.ncml.dods-bescmd.xml.massif.out Use ms_print to look at the heap; the .log and .out files are the output of valgrind (which shows file open/close operations with massif) and the timing information (largely moot since heap profiling slows the code). Here's how we looked at the heap profiler data: ms_print --x=60 small_aggregations/AIRX3_hdf4_aggregation_small_2.dods-bescmd.xml.massif | more Note that the --x=60 limits the ASCII graph to 60 characters width which can be pasted into a Google doc without looking too ugly. Piping through more makes the long profile output more manageable. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Memory tests run on the aggregation files We looked at the heap profiles for four different aggregations: Two for the AIRX 'small' aggregation made using 10 files and two for the TRMM_3A11 aggregation. The bescmd files were (all inside bes/timing/ small_aggregations): File What _______________________________________________________________________ AIRX3_hdf4_aggregation_small.dods-bescmd.xml 14 vars AIRX3_hdf4_aggregation_small_2.dods-bescmd.xml 43 vars TRMM_3A11_hdf4_aggregation_small.ncml.dods-bescmd.xml 1 var TRMM_3A11_hdf4_aggregation_small_2.ncml.dods-bescmd.xml Whole dataset For each of these, we ran the tests on a 'baseline' BES that used libdap 3.14.1 and also an 'improved' libdap (3.14.2) that is more efficient in the way it uses its memory. We did this using two VMs (from the same snapshot), but building from two versions of libdap will work as well. To get the 'baseline' version of libdap, get the code tagged 3.14.1; the improved code is now the master branch. To see our results, look at: https://docs.google.com/a/opendap.org/document/d/1qOWmuhSWEthW2dCMlRcWzbjQsvobMSO1hJAp3RkoqB0/edit?usp=sharing - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - BES Timing Tests - Test Commands - Beneath this directory you will find two subdirectories, 'test' and four directories named for the four aggregations used in the timing and profiling tests. In each there is a collection of files containing BES commands, each one as would be issued by the OLFS to an attached BES listener process. The bes command files are named according to the dataset, the subset, and the response type that is embodied in the request. - Shell Scripts - Also in this directory you will find shell scripts that will run the timing tests and extract a brief summary of the results. -- mkAggTests -- This shell script runs the 4 different sets of aggregation timing test commands in the agg_perf_test directory. The timing summaries are recorded in files named in keeping with each aggregation's command set. NB: This command might work a bit differently given changes to timeTest. It's best to read it over before using. The key thing is to not inadvertently erase any logs you've worked hard to make with subsequent runs of the timing code. -- timeTest -- This script takes as its command line a list of file names that are assumed to contain BES XML requests. - timeTest assumes that the environment variable 'prefix' has been set, and will utilize $prefix to locate the BES configuration file at $prefix/etc/bes/bes.conf - The BES response will be placed in a file named by adding the suffix '.result' to the command file name. - The debug output will be placed in a file named by adding the suffix '.log' to the command file name. - timeTest launches besstandalone like this: besstandalone -c $prefix/etc/bes/bes.conf -d "cerr,timing" -i $command > $resultFile 2>>$logFile - There is a shell variable, 'reps', set at the top of the timeTest script that controls the number of times timeTest will run the command. Depending on the speed of the test execution you may want to adjust that value. NB: I added a command line option (-r) so that this can be set without hacking the script. jhrg - Also added: two more options (in addition to -r) to control the debug command passed to besstandalone: -d <value> adds <value> to the debug symbols and -D which turns off all debugging, including the 'timing' information. Use this to determine the amount of overhead the basic timing code adds. Once timeTest has completed all of the reps, it analyzes the log file and computes the average time it takes for every DDS, DAS, DODS, DMR, and DAP response and prints that summary to stdout. USAGE: Run the fnoc1*.xml commands 10 times (the default reps) and collect data from the 'timing' symbol only (the default debug setting): ./timeTest tests/fnoc1*.xml Run the tests two times: ./timeTest -r 2 tests/fnoc1*.xml: Run them twice Run the tests two time but collect data from 'timing' and 'ncml': ./timeTest -r 2 -d ncml tests/fnoc1*.xml -- timingIndent -- The timingIndent script reads from stdin and writes to stdout. It will look through its input stream for timing log lines and indent them according to how many timing START vs timing STOP lines it encounters. USAGE cat $logFile | ./timingIndent