sample-stream

sample-stream is a Python package that provides the sample command, which filters lines from standard input with a given probability, with an optional delay between output lines, and for a limited duration.

Installation

You can install sample-stream as follows:

$ python -m pip install sample-stream

This installs an executable named sample, typically in ~/.local/bin. You probably want to either (1) add this directory to your PATH, (2) create a symlink to the executable in a directory that is already on your PATH, or (3) use an alias.

Example

The following command keeps each line with a probability of 0.01 (-r 1%), waits 1000 milliseconds between output lines (-d 1000), and stops after 5 seconds (-s 5).

$ time seq -f "Line %g" 1000000 | sample -r 1% -d 1000 -s 5
Line 71
Line 250
Line 305
Line 333
Line 405
Line 427
seq -f "Line %g" 1000000  0.01s user 0.00s system 0% cpu 5.092 total
sample -r 1% -d 1000 -s 5  0.06s user 0.02s system 1% cpu 5.091 total
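
The behaviour shown in this example can be approximated in a few lines of Python. This is a minimal sketch, not the package's actual implementation: the function name sample_lines and its parameters are hypothetical, and it assumes that each line is kept independently with probability rate, that the delay is slept after every emitted line, and that the duration is measured from the moment iteration starts.

```python
import random
import time
from typing import Iterable, Iterator, Optional


def sample_lines(
    lines: Iterable[str],
    rate: float = 1.0,
    delay_ms: float = 0.0,
    duration_s: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield each line with probability `rate` (hypothetical helper).

    Assumptions: the delay is slept after every emitted line, and the
    duration is measured from the start of iteration.
    """
    rng = rng or random.Random()
    start = time.monotonic()
    for line in lines:
        # Stop once the time budget has been used up.
        if duration_s is not None and time.monotonic() - start >= duration_s:
            break
        # Keep the line with probability `rate`.
        if rng.random() < rate:
            yield line
            if delay_ms > 0:
                time.sleep(delay_ms / 1000.0)
```

Under these assumptions, iterating over sample_lines(sys.stdin, rate=0.01, delay_ms=1000, duration_s=5) and writing each result to standard output would mirror the flags used above. The exact number of lines emitted may differ from the real tool, since the timing details here are guesses.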

Help

$ sample --help
usage: sample-stream [-h] [-W WEEKS] [-D DAYS] [-H HOURS] [-m MINUTES]
                     [-s SECONDS] [-t MILLISECONDS] [-u MICROSECONDS]
                     [-r RATE] [-d DELAY]
                     [FILE]

Filter lines from standard input according to some probability, with a
given delay, and for a certain duration.

positional arguments:
  FILE                  File

optional arguments:
  -h, --help            show this help message and exit
  -W WEEKS, --weeks WEEKS
                        Duration of sampling in weeks
  -D DAYS, --days DAYS  Duration of sampling in days
  -H HOURS, --hours HOURS
                        Duration of sampling in hours
  -m MINUTES, --minutes MINUTES
                        Duration of sampling in minutes
  -s SECONDS, --seconds SECONDS
                        Duration of sampling in seconds
  -t MILLISECONDS, --milliseconds MILLISECONDS
                        Duration of sampling in milliseconds
  -u MICROSECONDS, --microseconds MICROSECONDS
                        Duration of sampling in microseconds
  -r RATE, --rate RATE  Rate between 0 and 1 using either 0.33, 33%,
                        1/3 notation.
  -d DELAY, --delay DELAY
                        Time in milliseconds between each line of
                        output
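
The help text says that RATE accepts 0.33, 33%, or 1/3 notation, and it lists several duration flags. A rough sketch of how such values could be interpreted follows. The helpers parse_rate and total_duration are hypothetical names, and the idea that multiple duration flags add up to a single total is an assumption, not something the help text confirms.

```python
from datetime import timedelta
from fractions import Fraction


def parse_rate(text: str) -> float:
    """Convert '0.33', '33%', or '1/3' into a float between 0 and 1.

    Hypothetical helper; the real tool may parse rates differently.
    """
    text = text.strip()
    if text.endswith("%"):
        value = float(text[:-1]) / 100.0  # percentage notation
    elif "/" in text:
        value = float(Fraction(text))  # fraction notation
    else:
        value = float(text)  # plain decimal notation
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"rate must be between 0 and 1, got {text!r}")
    return value


def total_duration(weeks=0, days=0, hours=0, minutes=0,
                   seconds=0, milliseconds=0, microseconds=0) -> float:
    """Sum all duration components into seconds (assumed behaviour)."""
    return timedelta(
        weeks=weeks, days=days, hours=hours, minutes=minutes,
        seconds=seconds, milliseconds=milliseconds,
        microseconds=microseconds,
    ).total_seconds()
```

For instance, parse_rate("1%") gives the probability used in the example above, and total_duration(seconds=5) corresponds to -s 5.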

License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

Citation

@software{sample-stream,
  author = {Jeroen H.M. Janssens},
  title = {{sample-stream} -- Sample lines from a stream},
  year = {2021},
  url = {https://github.com/jeroenjanssens/sample-stream},
  version = {0.2.5}
}

Credits

This project was generated with python-package-template.