Dummer

Dummer is a set of tools for generating dummy log data. I made it for benchmarking Fluentd.

This gem includes three executable commands:

dummer
dummer_simple
dummer_yes

Installation

Add this line to your application's Gemfile:

gem 'dummer'

And then execute:

$ bundle

Or install it yourself as:

$ gem install dummer

Run as

$ dummer -c dummer.conf
$ dummer_simple [options]
$ dummer_yes [options]

dummer

dummer allows you to

specify how many messages to generate per second,
define the log format, and
generate log values randomly

Usage (1) - Write to a file

Create a configuration file. A sample configuration is as follows:

# dummer.conf
configure 'sample' do
  output "dummy.log"
  rate 500
  delimiter "\t"
  labeled true
  field :id, type: :integer, countup: true, format: "%04d"
  field :time, type: :datetime, format: "[%Y-%m-%d %H:%M:%S]", random: false
  field :level, type: :string, any: %w[DEBUG INFO WARN ERROR]
  field :method, type: :string, any: %w[GET POST PUT]
  field :uri, type: :string, any: %w[/api/v1/people /api/v1/textdata /api/v1/messages]
  field :reqtime, type: :float, range: 0.1..5.0
  field :foobar, type: :string, length: 8
end

Running

$ dummer -c dummer.conf

This writes lines like the following to dummy.log (the file named by the output parameter):

id:0422  time:[2013-11-19 02:34:58]  level:INFO  method:POST uri:/api/v1/textdata  reqtime:3.9726677258569842  foobar:LFK6XV1N
id:0423  time:[2013-11-19 02:34:58]  level:DEBUG method:GET  uri:/api/v1/people    reqtime:0.49912949125272277 foobar:DcOYrONH
id:0424  time:[2013-11-19 02:34:58]  level:WARN  method:POST uri:/api/v1/textdata  reqtime:2.930590441869852   foobar:XEZ5bQsh
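
To check the generated file, each line can be split back into a hash. The following is a minimal sketch, assuming the tab delimiter and the ":" label delimiter from the configuration above. `parse_labeled_line` is a hypothetical helper for illustration, not part of dummer.

```ruby
# Hypothetical helper (not part of dummer): parse one labeled line into a Hash.
# Assumes fields are separated by `delimiter` and each field looks like
# "label<label_delimiter>value".
def parse_labeled_line(line, delimiter: "\t", label_delimiter: ":")
  line.chomp.split(delimiter).each_with_object({}) do |field, record|
    # Split only on the first label delimiter, because values such as
    # "[2013-11-19 02:34:58]" contain ":" themselves.
    label, value = field.split(label_delimiter, 2)
    record[label] = value
  end
end

line = "id:0422\ttime:[2013-11-19 02:34:58]\tlevel:INFO\tmethod:POST\t" \
       "uri:/api/v1/textdata\treqtime:3.9726677258569842\tfoobar:LFK6XV1N\n"
record = parse_labeled_line(line)

puts record["time"]          # the value keeps its inner colons
puts record["reqtime"].to_f  # values are strings until converted
```

Splitting with a limit of 2 is the important detail here: without it, the datetime value would be cut at its first colon.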

Usage (2) - Post to Fluentd process

(experimental)

Create a configuration file. Assume that a fluentd process is running on localhost:24224. A sample configuration is as follows:

# dummer.conf
configure 'sample' do
  host "localhost" # define `host` and `port` instead of `output`
  port 24224
  rate 500
  tag type: :string, any: %w[raw.syslog raw.message raw.nginx] # configure tag
  field :id, type: :integer, countup: true, format: "%04d"
  field :level, type: :string, any: %w[DEBUG INFO WARN ERROR]
  field :method, type: :string, any: %w[GET POST PUT]
  field :uri, type: :string, any: %w[/api/v1/people /api/v1/textdata /api/v1/messages]
  field :reqtime, type: :float, range: 0.1..5.0
  field :foobar, type: :string, length: 8
end

Running

$ dummer -c dummer.conf

Data is posted to the fluentd process as follows (this is the fluentd log generated by out_stdout):

2014-01-31 00:55:32 +0900 raw.message: {"id":"1377","level":"INFO","method":"POST","uri":"/api/v1/people","reqtime":1.678867810409548,"foobar":"paOIWxhQ"}
2014-01-31 00:55:32 +0900 raw.syslog: {"id":"1378","level":"INFO","method":"GET","uri":"/api/v1/people","reqtime":4.8412816521873445,"foobar":"kUvnC0MK"}
2014-01-31 00:55:32 +0900 raw.message: {"id":"1379","level":"WARN","method":"GET","uri":"/api/v1/people","reqtime":3.584494903998221,"foobar":"KD78mpjX"}
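
When benchmarking, it can be handy to count how many records arrived per tag. The following is a rough sketch that assumes the "time tag: json" layout of the out_stdout lines shown above. `parse_out_stdout` and the regular expression are hypothetical helpers, not part of dummer or Fluentd.

```ruby
require "json"

# Assumed layout of one out_stdout line: "<time> <tag>: <json record>".
OUT_STDOUT_LINE = /\A(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) (?<tag>\S+): (?<json>\{.*\})\s*\z/

# Hypothetical helper: returns nil for lines that do not match the layout.
def parse_out_stdout(line)
  m = OUT_STDOUT_LINE.match(line)
  return nil unless m
  { time: m[:time], tag: m[:tag], record: JSON.parse(m[:json]) }
end

lines = [
  '2014-01-31 00:55:32 +0900 raw.message: {"id":"1377","level":"INFO","method":"POST","uri":"/api/v1/people","reqtime":1.678867810409548,"foobar":"paOIWxhQ"}',
  '2014-01-31 00:55:32 +0900 raw.syslog: {"id":"1378","level":"INFO","method":"GET","uri":"/api/v1/people","reqtime":4.8412816521873445,"foobar":"kUvnC0MK"}',
  '2014-01-31 00:55:32 +0900 raw.message: {"id":"1379","level":"WARN","method":"GET","uri":"/api/v1/people","reqtime":3.584494903998221,"foobar":"KD78mpjX"}'
]

# Count records per tag, skipping lines that could not be parsed.
counts = Hash.new(0)
lines.each do |line|
  entry = parse_out_stdout(line)
  counts[entry[:tag]] += 1 if entry
end

p counts
```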

CLI Options

You can specify some configuration parameters on the command line instead of writing them in a configuration file.

$ dummer help start
Usage:
  dummer start

Options:
  -c, [--config=CONFIG]            # Config file
                                   # Default: dummer.conf
  -r, [--rate=N]                   # Number of generating messages per second
  -o, [--output=OUTPUT]            # Output file
  -h, [--host=HOST]                # Host of fluentd process
  -p, [--port=N]                   # Port of fluentd process
  -m, [--message=MESSAGE]          # Output message
  -d, [--daemonize]                # Daemonize. Stop with `dummer stop`
  -w, [--workers=N]                # Number of parallels
      [--worker-type=WORKER_TYPE]
                                   # Default: process
  -p, [--pid-path=PID_PATH]
                                   # Default: dummer.pid

Configuration Parameters

Following parameters in the configuration file are available:

output

Specify an output filename or an IO object (STDOUT, STDERR).

host

Post data to a fluentd process on the specified host. Either output or host can be specified.

port

Post data to a fluentd process on the specified port. Default is 24224.

rate

Specify how many messages to generate per second. Default: 500 msgs / sec

workers

Specify the number of processes for parallel processing.

delimiter

Specify the delimiter between fields. Default: "\t" (Tab)

labeled

Whether to add the field name as a label. Default: true

label_delimiter

Specify the delimiter between the label and the value. Default: ":" (colon)

tag

Define the tag to generate. This takes effect only when posting data to a fluentd process with host and port.

field

Random field generator mode. Define the data fields to generate. The message and input options are ignored. See also the Field Data Types section below.

message

Specific message generation mode. See message.conf for an example. This mode is fast because it does not need to generate values randomly.

input

Input file mode. Use this to write messages by reading the lines of an input file in rotation. The message option is ignored. See input.conf for an example. This mode is also fast.
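
To make the effect of delimiter, labeled, and label_delimiter concrete, here is a minimal sketch of how one record could be rendered as a line. `format_record` is a hypothetical function written for illustration; dummer's own formatter may be implemented differently.

```ruby
# Hypothetical sketch (not dummer's actual code): render one record as a log
# line following the delimiter, labeled, and label_delimiter descriptions.
def format_record(record, delimiter: "\t", labeled: true, label_delimiter: ":")
  fields = record.map do |name, value|
    labeled ? "#{name}#{label_delimiter}#{value}" : value.to_s
  end
  fields.join(delimiter)
end

record = { id: "0422", level: "INFO", method: "POST" }

puts format_record(record)                                        # labeled, tab-separated
puts format_record(record, labeled: false, delimiter: ",")        # 0422,INFO,POST
puts format_record(record, label_delimiter: "=", delimiter: " ")  # id=0422 level=INFO method=POST
```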

Field Data Types

You can specify the following data types for your tag and field parameters:

:datetime
- :format
  
  You can specify the datetime format, such as %Y-%m-%d %H:%M:%S. See Time#strftime for details.
- :random
  
  Generate datetime randomly. Default: false (Time.now)
- :value
  
  You can specify a fixed Time object.
:string
- :any
  
  You can specify an array of strings, then the generator picks one from them randomly
- :length
  
  You can specify the length of string to generate randomly
- :value
  
  You can specify a fixed string
:integer
- :format
  
  You can specify a format string such as %03d.
- :range
  
  You can specify a range of integers, then the generator picks one in the range (uniform) randomly
- :countup
  
  Generate sequentially increasing (count-up) values. Default: false
- :value
  
  You can specify a fixed integer
:float
- :format
  
  You can specify a format string such as %03.1f.
- :range
  
  You can specify a range of float numbers, then the generator picks one in the range (uniform) randomly
- :value
  
  You can specify a fixed float number
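
For reference, the options above roughly correspond to the following plain Ruby expressions. This is a sketch for intuition only, not dummer's actual generator code. In particular, the alphanumeric character set used for random strings is an assumption based on the sample output.

```ruby
# Hypothetical sketch: plain-Ruby equivalents of the field options.
levels = %w[DEBUG INFO WARN ERROR]
# Assumed character set for random strings (the samples show only alphanumerics).
chars = [*("a".."z"), *("A".."Z"), *("0".."9")]

id      = format("%04d", 422)                       # :integer with format: "%04d"
time    = Time.now.strftime("[%Y-%m-%d %H:%M:%S]")  # :datetime with :format, random: false
level   = levels.sample                             # :string with :any
reqtime = rand(0.1..5.0)                            # :float with range: 0.1..5.0
dice    = rand(1..6)                                # :integer with range: 1..6
foobar  = Array.new(8) { chars.sample }.join        # :string with length: 8
rounded = format("%03.1f", reqtime)                 # :float with format: "%03.1f"

puts [id, time, level, reqtime, dice, foobar, rounded].join("\t")
```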

dummer_simple

I created a simpler version of dummer because dummer's rich features kept it from reaching the system's maximum I/O throughput. This simple version, dummer_simple, reached the system I/O limit in my environment.

Note that this simple script cannot post data to a fluentd process; it only supports writing to a file.

Usage

$ dummer_simple [options]

Options

Usage:
  dummer_simple

Options:
      [--sync]             # Set `IO#sync=true`
  -s, [--second=N]         # Duration of running in second
                           # Default: 1
  -p, [--parallel=N]       # Number of processes to run in parallel
                           # Default: 1
  -o, [--output=OUTPUT]    # Output file
                           # Default: dummy.log
  -i, [--input=INPUT]      # Input file (Output messages by reading lines of the file in rotation)
  -m, [--message=MESSAGE]  # Output message
                           # Default: time:2013-11-20 23:39:42 +0900    level:ERROR     method:POST     uri:/api/v1/people      reqtime:3.1983877060667103
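
The phrase "reading lines of the file in rotation" for the input option can be pictured with the sketch below: once the last line has been written, output starts again from the first line. `rotate_lines` is a hypothetical helper for illustration, not dummer_simple's actual implementation.

```ruby
require "tempfile"

# Hypothetical sketch: return `count` messages by cycling through the lines
# of the input file, wrapping around to the first line after the last one.
def rotate_lines(input_path, count)
  lines = File.readlines(input_path, chomp: true)
  lines.cycle.first(count)
end

# Demonstrate with a temporary three-line input file.
Tempfile.create("input") do |f|
  f.write("first\nsecond\nthird\n")
  f.flush
  puts rotate_lines(f.path, 5)
end
```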

dummer_yes

I created dummer_yes, a wrapper around the yes command, to confirm that dummer_simple achieves the maximum system I/O throughput.

I no longer use the dummer_yes command because I have verified that dummer_simple reaches the I/O limit, but I am keeping it so that users can run their own verification experiments.

Usage

$ dummer_yes [options]

Options

Usage:
  dummer_yes

Options:
  -s, [--second=N]         # Duration of running in second
                           # Default: 1
  -p, [--parallel=N]       # Number of processes to run in parallel
                           # Default: 1
  -o, [--output=OUTPUT]    # Output file
                           # Default: dummy.log
  -m, [--message=MESSAGE]  # Output message
                           # Default: time:2013-11-20 23:39:42 +0900  level:ERROR method:POST uri:/api/v1/people  reqtime:3.1983877060667103

Relatives

There is also fluent-plugin-dummydata-producer, but I wanted to output dummy data to a log file, and I wanted a separate, standalone tool for benchmarking.

Related Articles

dummer (formerly dummy_log_generator), a tool for Fluentd benchmark tests (article in Japanese)

ToDO

write tests

Contributing

Fork it
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

Licenses

See LICENSE.txt
