/pgcluu

PostgreSQL Cluster performances monitoring and auditing tool

Primary LanguagePerlPostgreSQL LicensePostgreSQL

NAME
    pgCluu - PostgreSQL Cluster utilization

DESCRIPTION
    pgCluu is a PostgreSQL performance monitoring and auditing tool.

    It is a Perl program used to perform a full audit of a PostgreSQL
    Cluster and System performance. It is divided in two parts:

    - A collector used to grab statistics on the PostgreSQL cluster using
    the psql command line utility and sar from the sysstat package.

    - A pure Perl grapher that will generate all HTML and charts output
    without any requirements.

    If you don't need system utilization reports or don't want to install
    the sysstat package, you can disable it at command line. You will only
    have reports about your PostgreSQL Cluster. If you are running pgCluu
    from a central server using option -h to monitor remotely a PostgreSQL
    Cluster, the call to sar is automatically disabled.

    If you just want to have system utilization reports or generate graphs
    from a sar data file, it's also possible.

SYNOPSIS
    PostgreSQL and System metrics collector.

            pgcluu_collectd [options] output_dir

    Report generator.

            pgcluu [options] -o report_dir input_dir

REQUIREMENT
    pgCluu comes with two Perl scripts. You need a modern Perl distribution,
    the psql client and the sar command line utility (sysstat). The sysstat
    package is optional, you can still use pgCluu to generate reports about
    your PostgreSQL Cluster without it.

    Charts are rendered using a Javascript library so you don't need
    anything else. Your browser will do all the work.

INSTALLATION
  Installation from package
    Installation of pgCluu can be done through the PostgreSQL Global
    Development Group (PGDG) repositories. See how to install the PGDG
    repositories at the following URLs for Debian and Ubuntu:

            https://wiki.postgresql.org/wiki/Apt

    and Redhat, CentOs, Fedora, Scientific Linux and Oracle Enterprise
    Linux:

            https://yum.postgresql.org/

    Once it is done you can simply install pgCluu with commands:

            sudo apt install pgcluu
            sudo yum install pgcluu

    Look at the package information to know where files are specifically
    installed. See next two chapters to see which default installation paths
    are used.

  Installation from sources
    Download the tarball from GitHub and unpack the archive:

            tar xzf pgcluu-3.x.tar.gz
            cd pgcluu-3.x/
            perl Makefile.PL
            make && sudo make install

    This will copy the Perl scripts pgcluu_collectd and pgcluu into
    /usr/local/bin directory and the man page to
    /usr/local/share/man/man1/pgcluu.1p.gz. Those are the default
    installation directories for 'site' install on some well-known
    distribution but the path could change.

    If you want to install all under /usr location, use INSTALLDIRS='vendor'
    as an argument of Makefile.PL. The script will be installed into
    /usr/bin/pgcluu and the manpage into /usr/share/man/man1/pgcluu.1p.gz.

    For example, to install everything just like Debian does, proceed as
    follows:

            perl Makefile.PL INSTALLDIRS=vendor

    By default INSTALLDIRS is set to site.

    The directory where statistics will be saved is /var/lib/pgcluu/data and
    the default directory for reports is /var/lib/pgcluu/report.

    The CGI script is installed into /var/lib/cgi-bin/pgcluu.cgi and the
    Apache configuration file into /etc/apache/conf-available/pgcluu.conf.
    This file allow acces to resources files (CSS and JS files) from
    installation directory /usr/local/share/pgcluu/rsc/. Access is granted
    to local user only by default.

    All scripts (pgcluu_collectd, pgcluu and pgcluu.cgi) are reading
    configuration file from /usr/local/etc/pgcluu.conf. This file is mainly
    use by the CGI script but some directives are dedicated to
    pgcluu_collectd and pgcluu script to define the retention days for
    example.

  Custom installation
    The installation of pgCluu can be fully customized through environment
    variables (RSCDIR,CGIDIR,CONFDIR,PIDDIR,STATDIR,APACHECONF,MANDIR,
    DOCDIR,SYSTEMDDIR,RETENTION). These variables are passed to Makefile.PL
    as command line arguments or can be exported as environment variables
    before running "perl Makefile.PL".

    The default values for these variables are:

        DESTDIR     => /usr/local
        INSTALLDIRS => site
        CONFDIR     => DESTDIR/etc
        PIDDIR      => /var/run/postgres
        STATDIR     => /var/lib/pgcluu/data
        REPORTDIR   => /var/lib/pgcluu/data
        RSCDIR      => DESTDIR/share/pgcluu
        CGIDIR      => /usr/lib/cgi-bin
        APACHECONF  => /etc/apache2/conf-available
        MANDIR      => DESTDIR/share/man
        DOCDIR      => DESTDIR/share/doc
        SYSTEMDIR   => DESTDIR/lib/systemd/system
        RETENTION   => 0

    If INSTALLDIRS is set to 'vendor':

        CONFDIR     => /etc
        DESTDIR     => /usr

    The configuration file is auto-generated by the Makefile.PL script and
    saved into CONFDIR/pgcluu.conf. If the destination file exists it is not
    overridden. The file is also saved as example in the directory
    DOCDIR/pgcluu/examples/pgcluu.conf.dist

    The directory where pgcluu_collectd will store statistics is defined
    with STATDIR which is by default /var/lib/pgcluu/data. The directory
    where pgcluu will generate static html reports is defined by REPORTDIR.
    The default is /var/lib/pgcluu/report. Both directory must be owned by
    the postgres user.

    The man page is saved as DESTDIR/share/man/pgcluu.1p.gz and a symbolic
    link pgcluu_collectd.1p.gz is created to this file. The documentation,
    README, changelog.gz, LICENSE files are saved under DESTDIR/share/doc/.

    For the CGI mode, the resources (css and js files from the cgi-bin/rsc)
    are saved under the DESTDIR/share/pgcluu/rsc directory. The CGI script
    is saved under /usr/lib/cgi-bin/pgcluu.cgi. The Apache configuration
    file under /etc/apache2/conf-available/pgcluu.conf with a symbolic link
    /etc/apache2/conf-enabled/pgcluu.conf created to this file. Its content:

        Alias /pgcluu RSCDIR/
        <Directory RSCDIR/>
            Options FollowSymLinks MultiViews
            AllowOverride None
            Require local
            #Require ip 192.1.168.0/24
        </Directory>

    The systemd files (pgcluu_collectd.service,pgcluu.service,pgcluu.timer)
    are saved as examples into DOCDIR/pgcluu/examples/ and into the systemd
    directory SYSTEMDDIR/

    The right path to the configuration file is set into all scripts pgcluu,
    pgcluu_collectd and pgcluu.cgi. The path where the pid file must be
    saved is replaced into pgcluu_collectd with the value of PIDFILE
    variable.

    Paths to scripts, pid directory, statistics and reports are replaced in
    all systemd service files following the values of the environment
    variable explain here.

USAGE
   Manually
    See next two chapters for a complete description of the command line
    options. For the impatient, here some simple commands that could be run
    as postgres user:

            mkdir /tmp/stat_db1/
            pgcluu_collectd -D -i 60 /tmp/stat_db1/
            LOG: Detach from terminal with pid: 11323

    or with more options

            pgcluu_collectd -D -i 60 /tmp/stat_db1/ -h 10.10.1.1 -U postgres -d mydb
            LOG: Detach from terminal with pid: 14671

    wait some time and activity on your PostgreSQL Cluster... Then stop the
    pgcluu_collectd daemon and generate the report:

            pgcluu_collectd -k
            LOG: Received terminating signal.
            mkdir /tmp/report_db1/
            pgcluu -o /tmp/report_db1/ /tmp/stat_db1/

    You should obtain something like example at
    http://pgcluu.darold.net/example/

    By default all javascript, css and the webfont fontawesome are
    automatically generated into the output directory if those files does
    not already exits.

   Using systemd unit files
    pgcluu comes with systemd service files:

        - pgcluu_collectd.service: execute pgcluu_collectd as a daemon to collect statistics.
        - pgcluu.service: execute pgcluu to generate reports.
        - pgcluu.timer: run periodically pgcluu.service

    These files are installed automatically by the install script or the
    package into /lib/systemd/system/. To activate these services proceed as
    follow:

            systemctl daemon-reload
            systemctl enable pgcluu_collectd.service
            systemctl enable pgcluu.service
            systemctl enable pgcluu.timer
            systemctl start pgcluu_collectd.service
            systemctl start pgcluu.timer

    Be warn that storing indefinitely statistics with pgcluu_collectd can
    fill you disk space in the short or medium term. You have to remove
    obsolete statistics manually using a cron job or using the embedded
    retention feature by adding option '--retention ndays' to
    pgcluu_collectd call or by changing the value of
    STATS_COLLECTD_RETENTION configuration directive in file
    /usr/local/etc/pgcluu.conf or /etc/pgcluu.conf file following your
    installation. You can also set STATS_REPORT_RETENTION to limit the
    retention of reports directories or use option '--retention ndays' with
    pgcluu. Default is to preserve collected data for the past 30 days.

    If you want to use the CGI mode you also have activate pgcluu caching by
    enabling value STATS_REPORT_CACHING in the configuration file and using
    pgcluu.service+pgcluu.timer. If you don"t want to use systemd you can
    simply run pgcluu script with the -C option and execute it periodically
    through a cron job.

    Note that the systemd service files set the privilege to the data and
    report directory to user postgres and group www-data. By this way
    reports are readable through an httpd server for static reports and data
    are readable through the CGI if enabled. Command executed by the
    pgcluu_collectd.service file (daemon part) are:

            /bin/mkdir -p $STATDIR
            /bin/chown postgres:www-data $STATDIR
            /bin/chmod u=rwX,g=rsX,o= $STATDIR

    The pgcluu.service file (client part) execute commands:

            /bin/mkdir -p $REPORTDIR
            /bin/chown postgres:www-data $REPORTDIR
            /bin/chmod u=rwX,g=rsX,o= $REPORTDIR

    In case you don't want to use systemd service files you will have to
    execute these commands manually before tunning pgCluu.

    For pgcluu reports, the output directory can be defined in the
    configuration file using the STATS_REPORT_OUTDIR directive.

COLLECTING STATISTICS
    To generate reports about your PostgreSQL Cluster Utilization you must
    collect statistics before. pgcluu_collectd is here for that. It can be
    run in a daemon mode (option -D) or in interactive mode for debugging
    purpose. All you need is to provide a directory where data will be
    stored. Statistics will be pooled at a default interval of 60 seconds,
    using option -i you can customize it. See below for a complete list of
    command line options.

  pgcluu_collectd usage
    usage: pgcluu_collectd [options] output_dir

            output_dir: full path to directory where pgcluu_collectd will
                        store statistics.

    options:

      -B, --enable-buffercache enable buffercache statistics if pg_buffercache
                               extension is installed.
      -c, --capture            create a snapshot of the PostgreSQL installation
                               into tmp/pgcluu_capture.tar.gz.
      -C, --end-counter=NUM    terminate program after NUM reports.
      -d, --dbname=DATABASE    database name to connect to. Default to current user.
      -D, --daemonize          detach from console and enter in daemon mode.
      -E, --end-after=NUM      self terminate the program after a given number of
                               seconds. Can be written: 7200 or 120M or 2H, for
                               days use 7D for example to stop collecting data
                               after seven days.
      -f, --pid-file=FILE      path to pid file. Default: /var/run/postgresql/pgcluu_collectd.pid.
      -h, --host=HOSTNAME      database server host or socket directory
      -i, --interval=NUM       time to wait between runs
      -k, --kill               stop current pgcluu_collectd running daemon.
      -m, --metric=METRIC      set a coma separated list of metrics to perform.
      -M, --max-size=SIZE      self terminate program when the size of the output
                               directory exceed a given size. Can be written: 2GB
                               or 2000MB.
      -p, --port=PORT          database port(s) to connect to. Defaults to 5432.
      -P, --psql=BIN           path to the psql command. Default: psql.
      -Q, --no-statement       do not collect statistics from pg_stat_statements.
      -r, --rotate-daily       force daily rotation of data files.
      -R, --rotate-hourly      force hourly rotation of data files.
      -s, --sar=BIN            path to sar sysstat command. Default: sar.
      -S, --disable-sar        disable collect of system statistics with sar.
      -t, --lock-timeout=NUM   terminate metric SQL query after N second in case it
                               wait too much time because of a lock. Default: 3
      -T, --no-tablespace      disable lookup at tablespace when the connect user
                               is not superuser to avoid printing an error message.
      -U, --dbuser=USERNAME    database user to connect as. Default to current user.
      -v, --verbose            Print out debug informations.
      -V, --version            Show pgcluu_collectd version and exit.
      -w, --no-waitevent       don't collect wait event stats from pg_wait_sampling.
      -W, --password=pass      database password.
      -z, --compress           force compression of rotated data files.
      --included-db=DATABASE   collect statistics only for those databases present
                               in a comma separated list of database names.
      --list-metric            list available metrics actions that can be performed.
      --sysinfo                get operating system infos and exit (sysinfo.txt).
      --no-sysinfo             do not collect operating system information at all.
      --no-database            do not collect database statistics at all.
      --pgbouncer-args=OPTIONS Option to used to connect to the pgbouncer system
                               database. Ex: -p 6432 -U postgres -h 192.168.1.100
                               You must at least give one parameter to enable
                               pgbouncer monitoring.
      --sar-file=FILE          path to sar output data file for sysstat stats
                               Default to output_dir/sar_stats.dat.
      --stat-type all|user     Set stats tables to read. Values: 'all' or 'user' to
                               look at pg_stat_(all|user) tables. Default: user.
      --pgversion X.Y          force the PostgreSQL version to the given value.
      --pgservice NAME         Name of service inside of the pg_service.conf file.
      --exclude-time RANGE     exclude a laps of time by giving the start and end
                               hours.
      --cron-user=USERNAME     collect crontab settings for the given username (in
                               this case pgcluu_collectd need to be run as root).
                               Default is to use USERNAME environment variable or
                               postgres when it is not defined.
      --package-list=CMD       command to list PostgreSQL packages. Default is to
                               autodetect package type and using command 'rpm -qa'
                               or 'dpkg -l'. If you have an other system you can
                               set a custom command. A filter on keyword 'postgres'
                               is appended to the command: ' | grep postgres'.
      --retention NDAYS        number of rolling days to keep in data directory in
                               incremental mode. Default is to store indefinitely.
      --disable-pidstat        do not collect metrics from pidstat command.
      --help                   print usage

    Use those options to execute sar on the remote host defined by the -h
    option, otherwise it will be executed locally:

      --enable-ssh             activate the use of ssh to run sysstat remotely.
      --ssh-program ssh        path to the ssh program to use. Default: ssh.
      --ssh-user username      connection login name. Default to running user.
      --ssh-identity file      path to the identity file to use.
      --ssh-timeout second     timeout to ssh connection failure. Default 10.
      --ssh-options  options   list of -o options to use for the ssh connection.
                               Options always used:
                                 -o ConnectTimeout=$ssh_timeout
                                 -o PreferredAuthentications=hostbased,publickey

    For example, as postgres user to monitor locally a full PostgreSQL
    cluster:

      mkdir /tmp/stat_db1/
      pgcluu_collectd -D -i 60 /tmp/stat_db1/

    to collect statistics from pgbouncer too, and limit database statistics
    to a single database:

      pgcluu_collectd -D -i 60 /tmp/stat_db1/ -h 10.10.1.1 -U postgres -d mydb 
                    --pgbouncer-args='-p 5342'

    to disable statistics collect between 22:30 and 06:30 the next day:

      pgcluu_collectd -D -i 60 /tmp/stat_db1/ --exclude-time "22:30-06:30"

    to collect statistics from a remote server:

        pgcluu_collectd -D -i 60 /tmp/statdb1/ -h 10.0.0.1 -U postgres --disable-sar

    the same but with collecting system statistics using remote sar calls:

      pgcluu_collectd -D -i 60 /tmp/statdb1/ -h 10.0.0.1 -U postgres --enable-ssh 
            --ssh-user postgres --ssh-identity /var/lib/postgresql/.ssh/id_rsa.pub

    You may need a .pgpass and be able to establish passwordless ssh
    connections to be able to collect statistics from remote hosts.

    Then after some time and activities on the database, stop the daemon as
    follow:

      pgcluu_collectd -k

    or by sending sigterm to the pgcluu_collectd's pid.

    You can run the collector in incremental mode using a daily or a hourly
    statistics rotation:

      pgcluu_collectd -D -i 60 /tmp/stat_db1/ --rotate-daily

    On a server with huge activity you may want to use --rotate-hourly and
    compression mode with --compress option. If you have limited disk space
    you can restrict the retention time of statistics files using option
    --retention with the storage day limit.

  Statistics files
    The output directory with all statistics collected should look likes:

            /tmp/stat_db1/
            |-- commit_memory.csv
            |-- end-pg_statio_user_indexes.csv
            |-- end-pg_statio_user_sequences.csv
            |-- end-pg_statio_user_tables.csv
            |-- end-pg_stat_user_functions.csv
            |-- end-pg_stat_user_indexes.csv
            |-- end-pg_stat_user_tables.csv
            |-- end-pg_stat_xact_user_functions.csv
            |-- end-pg_stat_xact_user_tables.csv
            |-- fs_stat_use.csv
            |-- pg_class_size.csv
            |-- pg_database_size.csv
            |-- pg_db_role_setting.csv
            |-- pg_hba.conf
            |-- pg_ident.conf
            |-- pg_nondefault_settings.csv
            |-- pg_prepared_xact.csv
            |-- pg_settings.csv
            |-- pg_stat_archiver.csv
            |-- pg_stat_bgwriter.csv
            |-- pg_stat_connections.csv
            |-- pg_stat_count_indexes.csv
            |-- pg_stat_database_conflicts.csv
            |-- pg_stat_database.csv
            |-- pg_stat_hash_indexes.csv
            |-- pg_stat_invalid_indexes.csv
            |-- pg_statio_user_indexes.csv
            |-- pg_statio_user_sequences.csv
            |-- pg_statio_user_tables.csv
            |-- pg_stat_locks.csv
            |-- pg_stat_missing_fkindexes.csv
            |-- pg_stat_redundant_indexes.csv
            |-- pg_stat_replication.csv
            |-- pg_stat_unlogged.csv
            |-- pg_stat_unused_indexes.csv
            |-- pg_stat_user_functions.csv
            |-- pg_stat_user_indexes.csv
            |-- pg_stat_user_tables.csv
            |-- pg_stat_xact_user_functions.csv
            |-- pg_stat_xact_user_tables.csv
            |-- pg_tablespace_size.csv
            |-- pg_xlog_stat.csv
            |-- postgresql.auto.conf
            |-- postgresql.conf
            |-- sar_stats.dat
            |-- sysinfo.txt

    Then now you can proceed with pgcluu to generate reports.

  Additional statistics from extensions
    To be able to report top statements, wait events and subxact statistics
    you need to create the related extensions into a single database and use
    this database for primary connection to pgcluu_collectd using the -d
    option.

    The related extension are pg_stat_statements, pg_wait_sampling and
    pg_subxact_counters.

  Rotation and compression
    When used the --rotate-daily or --rotate-hourly commands line option
    will force pgcluu_collectd to rotate daily or hourly all statistic's
    files. In this case, statistics files will be created in a subdirectory
    based on rotation frequency, output_dir/year/month/day[/hour]. This is
    called the incremental mode.

    To save filesystem space it is possible to enable compression of all
    rotated files during the rotation process. Just activate the -z or
    --compress command line option. You can also use the --retention option
    to set the storage time limit in days.

  Incremental mode
    This mode is enabled when --rotate-daily or --rotate-hourly command line
    options are used. It allow pgcluu to build reports incrementally by days
    or hours. In this mode you don't have to build reports per day or hour
    pgcluu will do the work automatically, just give it the top statistics
    directory.

            pgcluu -o /var/www/pgcluu/reports/ /var/lib/pgcluu/data/

    pgcluu will detect that --rotate-daily or --rotate-hourly have been used
    to collect data and generate reports for each day or hours stored.
    Previous directories already processed will not be processed again
    unless this was the the last one.

    The daily or hourly basis statistic storage also allow the use of the
    CGI script pgcluu.cgi to have dynamic reports and temporal search. See
    "Using dynamic mode" chapter.

    When used the --rotate-daily or --rotate-hourly commands line option
    will force pgcluu_collectd to rotate daily or hourly all statistic's
    files. In this case, statistics files will be created in a subdirectory
    based on rotation frequency, output_dir/year/month/day[/hour]. This is
    called the incremental mode.

    To save filesystem space it is possible to enable compression of all
    rotated files during the rotation process. Just activate the -z or
    --compress command line option. You can also use the --retention option
    to set the storage time limit in days.

  Capture mode
    The goal of this mode is to be able to obtain a simple report about the
    PostgreSQL installation without collected metrics others than database
    and tablespace size. This report can be use by ITs to better understand
    the configuration and things that need to be tuned.

    To enable this mode, just run pgcluu_collectd with the single option -c
    or --capture. Other command line options will not be taken in account.
    pgcluu_collectd will create e temporary directory /tmp/pgcluu_capture to
    store temporary data and will removed if after building a compressed tar
    archive: /tmp/pgcluu_capture.tar.gz. This is this archive that can be
    used with pgcluu to build a snapshot report of the instance. pgcluu will
    automatically detect this mode.

GENERATING REPORTS
  Static HTML reports
    To generate a pgCluu report about a PostgreSQL Cluster you must, at
    least, have a directory that contains all data files generated by
    pgcluu_collectd or pgstats. In this directory, if you have a file named
    sar_stats.dat or sadc_stats.dat for binary sadc data file, it will be
    taken to build report about system utilization. If you just want to make
    a report from a sar file use the -i or -I options.

    usage: pgcluu [options] [-i sar_file | -I sadc_file] [input_dir]

            input_dir: directory where pgcluu_collectd or pgstats and sar data
                       files are stored.

    options: -b, --begin datetime start date/time for the data to be parsed
    (time of current timezone). -C, --cache generate cache files only
    (.bin), no html output. -d, --db-only dbname only report for the whole
    cluster and the given database name. You can use it multiple time or
    give a comma separated list of database name. -D, --device-only dev only
    report I/O stats for a particular device You can use it multiple time or
    give a comma separated list of device name, ex: sda,sdc. -e, --end
    datetime end date/time for the data to be parsed (time of current
    timezone). -i, --sar-file=FILE path to the sar text data file to read to
    generate system reports. Default to input_dir/sar_stats.dat. -I,
    --sadc-file=FILE sadc binary data file to read to generate system
    reports. Default to input_dir/sadc_stats.dat. -n, --top-number Top
    number of tables or indexes I/O stats to show. Default is set to top 10.
    Set it to 0 to show all. -N, --network-only iface only report stats for
    a particular network interface. You can use it multiple times or give a
    comma separated list of network interfaces, ex: eth0,eth1. -o,
    --output=DIR output directory -r, --reverse-date By default pgcluu look
    at mm/dd/yy format in sar file. When enabled pgcluu will look at
    dd/mm/yy format. -s, --sadf=BIN path to the sadf sysstat command used to
    read the sadc binary data file. Default: /usr/bin/sadf. -S,
    --disable-sar disable collect of system statistics with sar. -t,
    --with-table table Only report for the whole tables and the given table
    name. You can use it multiple time or give a comma separated list of
    database name. -T, --no-table Do not report statistics related to
    tables. -v, --verbose Print out debug informations. -V, --version Show
    pgcluu version and exit. -x, --external-menu Save menu in menu.html and
    load it into each report using w3-include-html attribut from w3.js. This
    will only work if acces to HTML reports is through a Web server, not
    using the file:// protocol. -z, --timezone +/-XX Set the number of
    hour(s) from GMT of the timezone, Eg: -z +02 Usually autodetected. Use
    this to adjust date/time from the sar output, pgcluu use GMT time to
    draw charts. -Z, --stats-timezone +/-XX Set the number of hour(s) from
    GMT of the timezone. Eg: -Z +02 Usually autodetected. Use this to adjust
    date/time from the cluster and system stats output, pgcluu use GMT time.
    --from-sa-file instruct pgcluu that file specified by the -i option uses
    the standard system activity daily data file. --charset used to set the
    HTML charset to be used. Default: utf-8. --retention NDAYS number of
    rolling days to keep in report directory. Default is to store
    indefinitely. --help print usage

    For example, you can generate all HTML reports from data files stored
    into /tmp/stat_db1/ with the following commands:

            mkdir /tmp/report_db1/
            pgcluu -o /tmp/report_db1/ /tmp/stat_db1/

    If you just want reports of some databases, use the following:

            pgcluu -o /tmp/report_db1/ /tmp/stat_db1/ --db-only "db1,db2,db3"

    If you just want to create a report from a sar output file:

            sar -p -A 10 60 > /root/my_sar_file.txt
            pgcluu -o /tmp/report_sar/ -i /root/my_sar_file.txt

    or from a daily sa file:

            sar -p -A -f /var/log/sa/sa18 > /root/my_sar_file.txt
            pgcluu -o /tmp/report_sar/ -i /root/my_sar_file.txt --from-sa-file

    and from a sa binary file:

            pgcluu -o /tmp/report_sar/ -i /var/log/sysstat/sa22

    or the sa text file if you don't have the same version of sysstat:

            pgcluu -o /tmp/report_sar/ -i /var/log/sysstat/sar23 --from-sa-file

    If pgcluu_collectd have been run in incremental mode you can limit the
    number of retention days used for the reports:

            pgcluu -o /tmp/report_sar/ /tmp/stat_db1/ --retention 30

    A static report will be built for each day or hour following the
    rotation used with pgcluu_collectd.

  Dynamic reports (CGI)
    Dynamics reports are build by a CGI script named pgcluu.cgi that can be
    found in the cgi-bin repository of the source code. It allow you to
    select the time period to build reports and to look at all differents
    reports for this period just as with static HTML reports. After
    installation from sources or binary packages the CGI might be found in
    /usr/lib/cgi-bin/pgcluu.cgi or /var/www/cgi-bin/pgcluu.cgi following
    your distribution.

    To use pgCluu in CGI mode, you need a Web server (here we use Apache)
    and a cron task to build the cache periodically.

    First enable CGI mode.

            sudo a2enmod cgi
            sudo service apache2 restart

    Then the CGI need to find the statistics directory where pgcluu_collectd
    mostly /var/lib/pgcluu/data/. The content of this repository must be
    readable by the Apache user (www-data).

    Take care to restrict access to your server and the CGI as information
    about your database and server are exposed in the reports.

    The CGI script, pgcluu.cgi, use resources files (CSS and javascript).
    They are stored in the cgi-bin/rsc/ directory of the source
    distribution. Install this repository onto the DocumentRoot of your Web
    server, for example:

            sudo mkdir /var/www/pgcluu/
            sudo cp -rf cgi-bin/rsc /var/www/pgcluu/

    Then edit /etc/pgcluu.conf, copy it from sources cgi-bin/pgcluu.conf if
    it doesn't exists. Change the RSC_BASE and INPUT_DIR configuration
    directive to match your installation. Here in our example:

            RSC_BASE        /pgcluu/rsc/
            INPUT_DIR       /var/lib/pgcluu/data

    Now we can start the data collection. For the moment pgcluu.conf is only
    read by the CGI pgcluu.cgi but this might change in the future.

    To let pgCluu start collecting data, you can use the following command
    (you can change it to however you like). We will need postgres user for
    this.

            sudo su - postgres
            /usr/local/bin/pgcluu_collectd -D -i 60 --rotate-daily /var/lib/pgcluu/data

    This will rotate you data daily. This is also the default in the systemd
    service file.

    Plugging the CGI directly to the CSV statistics files will result in
    very slow generation reports. To improve speed caching must be used, you
    must execute periodically pgcluu in cache mode. Run it manually the
    first time

            /usr/local/bin/pgcluu --cache /var/lib/pgcluu/data

    then add a cron task to execute the command each five or ten minutes:

            */5 * * * * /usr/local/bin/pgcluu --cache /var/lib/pgcluu/data

    If you are using systemd pgcluu.timer service then caching can be
    enabled using STATS_REPORT_CACHING directive. Enabling this mode
    generate cache files (*.bin) in the statistics directory and disable
    static HTML reports generation.

    To see your reports, go to the following URL:

            http://localhost/cgi-bin/pgcluu.cgi

    Or change localhost by your fqdn server name.

    Note that here pgcluu_collectd and pgcluu scripts are found in
    /usr/local/bin/ which is the default for an installation from sources
    but with an installation from binary package you might find them into
    /usr/bin/.

    The CGI mode allow you to select the period of time used to generate the
    statistics reports. Using default pgcluu_collectd 60 seconds interval, a
    daily graph will have 1440 points. Having too much timeseries to render
    is not a good point for performances so we are limiting to the first
    seven days of the selected period. This mean a maximum of 10080 points
    for one week, this seems a safe limit. Feel free to increase or reduce
    the limit following the server and client resources. This is controlled
    by

            MAX_RENDERED_DAYS       7

    configuration directive in pgcluu.conf

LICENSE
    Copyright (c) 2012-2023, Gilles Darold

    pgCluu is licenced under the PostgreSQL Licence a liberal Open Source
    license, similar to the BSD or MIT licenses. That mean that all parts of
    the program are open source and free of charge.

            Permission to use, copy, modify, and distribute this software and its
            documentation for any purpose, without fee, and without a written agreement
            is hereby granted, provided that the above copyright notice and this
            paragraph and the following two paragraphs appear in all copies.

            IN NO EVENT SHALL Dalibo BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT,
            SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS,
            ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF
            Dalibo HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

            Gilles Darold SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT
            LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
            PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS,
            AND Gilles Darold HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT,
            UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

    This is the case for both, pgcluu_collectd and the grapher pgcluu
    programs.

AUTHORS
    pgCluu is an original development of Gilles Darold.

    Some parts of the collector are taken from pgstats a C program writen by
    Guillaume Lelarge and especially the SQL queries including the
    compatibility with all PostgreSQL versions. See
    https://github.com/gleu/pgstats

    Btw pgCluu grapher is compatible with files generated by pgstats, sar
    and sadc so you can use it independantly to graph those data. Some part
    of the sar output parser are taken from SysUsage. See
    http://sysusage.darold.net/