NFSping

command line tool to check NFS server response times

$ nfsping -c 5 filer1
filer1 : [0], 0.52 ms (0.52 avg, 0% loss)
filer1 : [1], 0.35 ms (0.43 avg, 0% loss)
filer1 : [2], 0.36 ms (0.41 avg, 0% loss)
filer1 : [3], 1.18 ms (0.60 avg, 0% loss)
filer1 : [4], 0.33 ms (0.55 avg, 0% loss)

filer1 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.33/0.55/1.18


NFSping is a command line utility for measuring the response time of an NFS server. Its interface is modeled on fping, but it doesn't share any code with that project.

On modern NFS servers, the network stack and the filesystem often run on separate cores or even separate hardware components. In practice this means that a fast ICMP ping response doesn't indicate how quickly the NFS filesystem is responding. This tool tests the responsiveness of the NFS component of the server's operating system more directly.

NFSping checks whether the server is responding by sending an NFS NULL RPC and waiting for a response. The NULL procedure is a no-op that is designed for testing. NFSping supports versions 2 and 3 of NFS. By default it doesn't use the RPC portmapper for NFS pings and instead checks port 2049 over the UDP transport.
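To make the NULL RPC concrete, here is a minimal Python sketch of the call message as it would appear on the wire over UDP. It is not taken from nfsping (which is written in C) and is only an illustration of the ONC RPC call layout; the function name build_null_call is hypothetical. It assumes AUTH_NONE credentials and the standard NFS program number 100003.

```python
import struct

NFS_PROGRAM = 100003   # ONC RPC program number assigned to NFS
RPC_VERSION = 2        # ONC RPC protocol version
CALL = 0               # msg_type value for a call
AUTH_NONE = 0          # "no authentication" flavor


def build_null_call(xid, nfs_version=3):
    """Pack an ONC RPC call for procedure 0 (NULL) with AUTH_NONE.

    Hypothetical helper for illustration; not part of nfsping.
    """
    return struct.pack(
        ">10I",          # ten big-endian unsigned 32-bit integers
        xid,             # transaction id, echoed back in the reply
        CALL,
        RPC_VERSION,
        NFS_PROGRAM,
        nfs_version,
        0,               # procedure 0 = NULL
        AUTH_NONE, 0,    # credentials: flavor, body length
        AUTH_NONE, 0,    # verifier: flavor, body length
    )
```

The NULL procedure takes no arguments, so the whole request is 40 bytes. Over TCP the same message would additionally need a 4-byte record marking header in front of it.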

NFSping can also check the mount protocol response using the -n option. Because the mount protocol doesn't use a standard port, NFSping looks the port up through the portmapper in this mode, unless a specific port is given with the -P option.

To build the nfsping executable:

$ git clone git://github.com/mprovost/NFSping.git
$ cd NFSping && make

The program can then be copied to a location like /usr/local/bin.

In its most basic form it simply reports whether the server is responding:

$ nfsping filer1
filer1 is alive

and exits with a return status of 0, or:

$ nfsping filer1
filer1 : nfsproc3_null_3: RPC: Unable to receive; errno = Connection refused
filer1 is dead

and exits with a status of 1. This simple form of the command can be built into scripts that only need to check whether the server is up, without being concerned about a particular response time.
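As one possible way to use the exit status from a script, the following Python sketch wraps the basic form of the command. The function name nfs_is_alive and its run parameter are hypothetical (the parameter exists only so the function can be exercised without a real server), and the sketch assumes nfsping is installed somewhere on the PATH.

```python
import subprocess


def nfs_is_alive(host, run=subprocess.run):
    """Return True if `nfsping host` exits with status 0.

    Hypothetical helper: `run` defaults to subprocess.run and can be
    replaced with a stand-in for testing.
    """
    result = run(
        ["nfsping", host],
        stdout=subprocess.DEVNULL,   # discard "is alive" / "is dead"
        stderr=subprocess.DEVNULL,   # discard RPC error messages
    )
    return result.returncode == 0
```

A caller would then write something like `if not nfs_is_alive("filer1"): ...`. If the nfsping binary can't be found, subprocess.run raises FileNotFoundError rather than returning a status.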

To measure response time, pass the number of requests to send as an argument to the -c (count) option:

$ nfsping -c 5 filer1
filer1 : [0], 0.09 ms (0.09 avg, 0% loss)
filer1 : [1], 0.16 ms (0.12 avg, 0% loss)
filer1 : [2], 0.15 ms (0.13 avg, 0% loss)
filer1 : [3], 0.16 ms (0.14 avg, 0% loss)
filer1 : [4], 0.12 ms (0.14 avg, 0% loss)

filer1 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.09/0.14/0.16
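If the final summary line needs to be consumed by a program, it can be picked apart with a regular expression. The sketch below is an assumption based only on the format shown in the example output above; the names SUMMARY_RE and parse_summary are hypothetical, and other summary layouts (if any exist) are not handled.

```python
import re

# Matches e.g. "filer1 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.09/0.14/0.16"
SUMMARY_RE = re.compile(
    r"^(?P<target>\S+)\s+:\s+xmt/rcv/%loss = "
    r"(?P<xmt>\d+)/(?P<rcv>\d+)/(?P<loss>\d+)%, "
    r"min/avg/max = (?P<min>[\d.]+)/(?P<avg>[\d.]+)/(?P<max>[\d.]+)$"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line doesn't match."""
    m = SUMMARY_RE.match(line.strip())
    if m is None:
        return None
    return {
        "target": m.group("target"),
        "sent": int(m.group("xmt")),
        "received": int(m.group("rcv")),
        "loss_pct": int(m.group("loss")),
        "min_ms": float(m.group("min")),
        "avg_ms": float(m.group("avg")),
        "max_ms": float(m.group("max")),
    }
```

Per-ping lines such as "filer1 : [0], 0.09 ms (0.09 avg, 0% loss)" don't match the pattern, so the function returns None for them.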

To send a continuous sequence of pings, like the traditional ICMP ping command, use the -l (loop) option:

$ nfsping -l filer1

To exit early, press Ctrl-C.

The -C option works like -c but ends with a summary line that is easier to parse:

$ nfsping -C 5 filer1
filer1 : [0], 1.96 ms (1.96 avg, 0% loss)
filer1 : [1], 0.11 ms (1.04 avg, 0% loss)
filer1 : [2], 0.12 ms (0.73 avg, 0% loss)
filer1 : [3], 0.16 ms (0.59 avg, 0% loss)
filer1 : [4], 0.18 ms (0.51 avg, 0% loss)

filer1 : 1.96 0.11 0.12 0.16 0.18

Missed responses are indicated with a dash (-) in the summary output. This form uses more memory since it stores all of the results. In all forms memory is allocated during startup so there should be no increase in memory consumption once running.
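A minimal Python sketch of reading that summary line, assuming the "target : v1 v2 ..." layout shown above with a dash for each missed response. The function name parse_parseable_summary is hypothetical, and missed responses are mapped to None.

```python
def parse_parseable_summary(line):
    """Split a -C summary line into (target, list of times in ms).

    Missed responses, shown as "-", become None in the list.
    """
    target, _, values = line.partition(" : ")
    times = [None if v == "-" else float(v) for v in values.split()]
    return target.strip(), times
```

This expects only the final summary line. The per-ping lines also contain " : " but are not numeric after it, so they should be filtered out first (for example by running with -q).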

To only show the summary line, use the -q (quiet) option.

NFSping can also output stats in a variety of formats for putting into time series databases. Currently only Graphite and StatsD are supported:

$ nfsping -c 5 -oG filer1
nfs.filer1.ping.usec 401 1370501562
nfs.filer1.ping.usec 416 1370501563
nfs.filer1.ping.usec 403 1370501564
nfs.filer1.ping.usec 410 1370501565
nfs.filer1.ping.usec 399 1370501566

This is the Graphite plaintext protocol, in which each line has the form <path> <value> <timestamp>. To avoid floating point numbers, nfsping reports the response time in microseconds.
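For anything that post-processes this output before it reaches Carbon, each line can be split into its three fields. This is a small sketch that assumes the whitespace-separated layout shown above; the function name parse_graphite_line is hypothetical.

```python
def parse_graphite_line(line):
    """Split one Graphite plaintext line into (path, value, timestamp).

    In the nfsping output above the value is a response time in
    microseconds and the timestamp is Unix time in seconds.
    """
    path, value, timestamp = line.split()
    return path, int(value), int(timestamp)
```

Dividing the value by 1000 gives the response time in milliseconds, so the first sample line corresponds to 0.401 ms.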

This output can be easily redirected to a Carbon server using nc (netcat):

$ nfsping -l -t1000 -oG filer1 filer2 | nc carbon1 2003

This will send a result every second. Because nfsping is single-threaded, it's best to set a timeout when specifying multiple targets so that unresponsive NFS servers don't affect the results for the others. Lost requests are reported under a separate path, nfs.$target.ping.lost, with a value of 1.

nc will exit if the TCP connection is reset (for example, if the Carbon server is restarted), which will also cause nfsping to exit with a broken pipe. To recover automatically, run the pipeline under a supervisor such as daemontools or runit, which will restart it if it exits for any reason.

Similarly, the StatsD output will produce plaintext output suitable for sending to StatsD with netcat:

$ nfsping -c 5 -oS filer1
nfsping.filer1.ping:0.15|ms
nfsping.filer1.ping:0.18|ms
nfsping.filer1.ping:3.27|ms
nfsping.filer1.ping:11.61|ms
nfsping.filer1.ping:78.07|ms

Note that this output uses floating point values, because the StatsD protocol only supports milliseconds as its time unit. While floating point values are supposedly supported by the protocol, some implementations of StatsD may not accept them. This output has been tested as working with statsite: https://github.com/armon/statsite
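To check whether a given consumer can cope with the floating point values, it can help to look at how one of these lines breaks down. The sketch below assumes the name:value|type layout shown above, where the "ms" type marks a StatsD timer; the function name parse_statsd_line is hypothetical.

```python
def parse_statsd_line(line):
    """Split one StatsD line into (name, value, metric_type)."""
    name, _, rest = line.strip().partition(":")
    value, _, metric_type = rest.partition("|")
    return name, float(value), metric_type
```

An implementation that only accepts integer timers would need the value rounded before sending, which loses the sub-millisecond detail that fast servers produce.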


This is a list of the available command-line options:

Usage: nfsping [options] [targets...]
    -A         show IP addresses
    -c n       count of pings to send to target
    -C n       same as -c, output parseable format
    -d         reverse DNS lookups for targets
    -D         print timestamp (unix time) before each line
    -g string  prefix for graphite metric names (default "nfs")
    -i n       interval between targets (in ms, default 25)
    -l         loop forever
    -m         use multiple target IP addresses if found
    -M         use the portmapper (default: NFS no, mount yes)
    -n         check the mount protocol (default NFS)
    -o         output format ([G]raphite, [S]tatsd, Open[T]sdb, default human readable)
    -p n       pause between pings to target (in ms, default 1000)
    -P n       specify port (default: NFS 2049)
    -q         quiet, only print summary
    -r n       reconnect to server every n pings
    -S addr    set source address
    -t n       timeout (in ms, default 2500)
    -T         use TCP (default UDP)
    -V n       specify NFS version (2 or 3, default 3)

NFSping will handle multiple targets and iterate through them on each round. If there are multiple DNS responses for a target, only the first is used. All of them can be checked by using the -m option. The sleep interval between targets can be controlled with the -i option. If any of the targets fail to respond, the command will exit with a status code of 1.