Contains tools for analyzing time-series data.
The analyze_series command is meant to be used on the command line in typical *nix fashion, expecting:
- one number per line
- input on standard input or as a file argument to analyze_series
analyze_series analyzes the points in the provided series and prints the following descriptive statistics:
- median
- mean, sample standard deviation
- upper and lower control limits
- points and associated values falling outside of control limits
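The statistics listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's implementation: the function name `describe` and the `sigmas` parameter are hypothetical, and the 0-based point indices are an assumption. The limits are computed as the mean plus or minus three sample standard deviations, which is consistent with the example output shown in this README.

```python
import statistics


def describe(series, sigmas=3.0):
    """Return summary statistics and the points outside the control limits.

    Hypothetical helper for illustration; not part of analyze_series.
    """
    mean = statistics.mean(series)
    std_dev = statistics.stdev(series)  # sample standard deviation (n - 1)
    lcl = mean - sigmas * std_dev       # lower control limit
    ucl = mean + sigmas * std_dev       # upper control limit
    return {
        "median": statistics.median(series),
        "mean": mean,
        "std dev": std_dev,
        "lower control limit": lcl,
        "upper control limit": ucl,
        # (index, value) pairs; 0-based indexing is an assumption here.
        "points outside of lcl": [(i, x) for i, x in enumerate(series) if x < lcl],
        "points outside of ucl": [(i, x) for i, x in enumerate(series) if x > ucl],
    }
```

Calling `describe([float(line) for line in open(path)])` on a one-number-per-line file would yield a dictionary keyed by the same labels the tool prints.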
analyze_series can help implement a run-by-run control program using control charts. It currently supports 'Rule 1' testing, which flags points more than 3 sigma above or below the series mean. Support for detecting suspicious runs above/below the mean is planned (issue #3).
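Run detection is planned rather than implemented, so the following is only a hypothetical sketch of what such a check might look like. The function name `find_runs` and the default run length of 8 are assumptions chosen for illustration; the project may settle on a different rule.

```python
from itertools import groupby
import statistics


def find_runs(series, min_length=8):
    """Find runs of consecutive points strictly above or below the mean.

    Returns (start_index, length, side) tuples. Hypothetical helper;
    min_length=8 is an assumed threshold, not a project setting.
    """
    mean = statistics.mean(series)
    runs = []
    index = 0
    # Key is +1 for points above the mean, -1 below, 0 exactly on it.
    for side, group in groupby(series, key=lambda x: (x > mean) - (x < mean)):
        length = len(list(group))
        if side != 0 and length >= min_length:
            runs.append((index, length, "above" if side > 0 else "below"))
        index += length
    return runs
```

Points that fall exactly on the mean break a run in this sketch; whether they should is a design choice left open here.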
Run analyze_series -h for full usage details.
Excerpt:
    usage: analyze_series [-h] [-v] [--assert-last-point-in-control] [input_file]

    positional arguments:
      input_file            input file to process, defaults to standard input

    optional arguments:
      -h, --help            show this help message and exit
      -v, --verbose         print additional debug information
      --assert-last-point-in-control
                            assert the last point in the series is in-control and
                            exit with status code '2' if last point is not in
                            control
Analyze a file on the filesystem:
    $ ./analyze_series resources/examples/multiple_control_limit_violation.txt
    median: 0.0
    mean: 0.0357142857143
    std dev: 1.41375432756
    lower control limit: -4.20554869697
    upper control limit: 4.2769772684
    points outside of lcl: [(3, -5.0)]
    points outside of ucl: [(28, 7.0)]
Analyze data arriving on standard input:
    $ cat resources/examples/under_control.txt | ./analyze_series
    median: 10.05
    mean: 10.05
    std dev: 0.873689494805
    lower control limit: 7.42893151558
    upper control limit: 12.6710684844
    points outside of lcl: None
    points outside of ucl: None
Analyze data and exit with an error status if the last point is not in control:
    $ cat resources/examples/upper_control_limit_violation.txt | ./analyze_series --assert-last-point-in-control; echo "exit status: $?"
    median: 0.0
    mean: 0.142857142857
    std dev: 1.11269728053
    lower control limit: -3.19523469873
    upper control limit: 3.48094898444
    points outside of lcl: None
    points outside of ucl: [(27, 4.0)]
    stderr: last point (index=27, value=4.0) is out of control
    exit status: 1
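In a run-by-run control program, the exit status can act as a gate before the next run. The sketch below is one possible wrapper, assuming analyze_series is reachable at `./analyze_series`; the function name `last_point_in_control` and its `command` parameter are hypothetical. It treats any nonzero exit status as "not in control" rather than relying on one specific code.

```python
import subprocess


def last_point_in_control(input_file, command=("./analyze_series",)):
    """Run the analysis and report whether the last point is in control.

    Returns (in_control, stderr_text). Hypothetical wrapper; the default
    command path is an assumption about where the script lives.
    """
    result = subprocess.run(
        [*command, "--assert-last-point-in-control", input_file],
        capture_output=True,  # keep the tool's output out of the caller's console
        text=True,
    )
    # Any nonzero status is treated as a failed assertion in this sketch.
    return result.returncode == 0, result.stderr.strip()
```

A caller might halt the process or raise an alert when the first element of the returned tuple is false, and log the second element for diagnosis.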
The time-series-analysis project builds upon and is made easy by the wonderful work of:
- Python
- NumPy - the fundamental package for scientific computing with Python
- pandas - provides high-performance, easy-to-use data structures and data analysis tools for Python
Thank you!