kastenhq/kubestr

Can't output results of FIO test for rook-ceph-block storage

vkamlov opened this issue · 21 comments

Info:
kubestr version - any
Workstation running kubestr - macOS Big Sur 11.2.3 / Debian 10 (buster)
Shell - bash/zsh

Scenario:

$ ./kubestr

Kubernetes Version Check:
Valid kubernetes version (v1.16.15) - OK

RBAC Check:
Kubernetes RBAC is enabled - OK

Aggregated Layer Check:
The Kubernetes Aggregated Layer is enabled - OK

Available Storage Provisioners:

rook-ceph.rbd.csi.ceph.com:
Cluster is not CSI snapshot capable. Requires VolumeSnapshotDataSource feature gate.
This is a CSI driver!
(The following info may not be up to date. Please check with the provider for more information.)
Provider: Ceph RBD
Website: https://github.com/ceph/ceph-csi
Description: A Container Storage Interface (CSI) Driver for Ceph RBD
Additional Features: Raw Block, Snapshot, Expansion, Topology, Cloning

Storage Classes:
  * rook-ceph-block

To perform a FIO test, run-
  ./kubestr fio -s <storage class>

This provisioner supports snapshots, however no Volume Snaphsot Classes were found.

$ ./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-xvbxj
Pod created kubestr-fio-pod-vkzb8
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
Elapsed time- 50.430796167s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json.: unexpected end of JSON input - Error

Hi @vkamlov,
It's unfortunate that you ran into this issue. Can you help me debug it?
You will have to:

  • clone the repo
  • git checkout debug-74
  • go build
  • ./kubestr fio -s rook-ceph-block

I added some debugging that should print the original string. We can figure out why it's failing to parse it.
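
For reference, here is a minimal sketch (hypothetical names, not the actual debug-74 branch code) of what printing the raw fio output around the JSON parse could look like, so an empty result becomes immediately visible:

package main

import (
	"encoding/json"
	"fmt"
)

// FioResult is a stand-in for the subset of fio's JSON output that matters here;
// the real kubestr type has many more fields.
type FioResult struct {
	FioVersion string `json:"fio version"`
}

// parseFioOutput prints the raw string it received and then attempts to decode it,
// wrapping the decode error with the original text so empty output is obvious.
func parseFioOutput(raw string) (*FioResult, error) {
	fmt.Printf("raw fio output: (%s)\n", raw)
	var res FioResult
	if err := json.Unmarshal([]byte(raw), &res); err != nil {
		return nil, fmt.Errorf("unable to parse fio output into json (%s): %w", raw, err)
	}
	return &res, nil
}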

The output is the same. There is no debug information in this build, except for the empty parentheses in the error message:

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-wmrwr
Pod created kubestr-fio-pod-mzdft
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
Elapsed time- 50.379957197s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json. (): unexpected end of JSON input - Error

Can you try again? I added additional debugging. It looks like the FIO results are coming up empty, which is why parsing failed.

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-ljvhz
Pod created kubestr-fio-pod-z4f8h
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
\ panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x20941c3]

goroutine 54 [running]:
github.com/kastenhq/kubestr/pkg/fio.(*fioStepper).runFIOCommand.func1(0xc00022cba0, 0x23eeb42, 0x7, 0xc000043170, 0x15, 0x23f2990, 0xb, 0xc0000d22d0, 0x5, 0x5, ...)
/tmp/kubestr/pkg/fio/fio.go:315 +0x143
created by github.com/kastenhq/kubestr/pkg/fio.(*fioStepper).runFIOCommand
/tmp/kubestr/pkg/fio/fio.go:313 +0x325

Heh, that made things worse. Can we try again?

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-c59ts
Pod created kubestr-fio-pod-8zngt
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
\ stdout-
stderr-
Elapsed time- 50.323057437s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json. (): unexpected end of JSON input - Error

Any updates?

Hey @vkamlov, sorry for the delay.
Would you be free to jump on a call?

If not, can you try deploying the following Pod and PVC?

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubestr-pvc
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: kubestr-pvc
  containers:
    - name: kubestr
      image: ghcr.io/kastenhq/kubestr:latest
      command: [ "/bin/sh" ]
      args: [ "-c", "tail -f /dev/null" ]
      volumeMounts:
        - mountPath: "/data"
          name: data

After doing that, we can run kubectl exec -it task-pv-pod -- sh
This will let us enter the pod, where we can try running FIO commands. For some reason we are not seeing the output of the commands (they are coming up empty, with no error either).

I deployed the container and ran some tests that I found in the official fio documentation.
Here is the output I got; everything looks good.

$ kubectl exec -it task-pv-pod -- sh
/ # 
/ # 
/ # fio --name=global --rw=randread --size=128m --name=job1 --name=job2
job1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
job2: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.20
Starting 2 processes
job1: Laying out IO file (1 file / 128MiB)
job2: Laying out IO file (1 file / 128MiB)
Jobs: 2 (f=2): [r(2)][100.0%][r=20.0MiB/s][r=5367 IOPS][eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=16: Wed Jun 30 10:02:02 2021
  read: IOPS=2095, BW=8383KiB/s (8584kB/s)(128MiB/15635msec)
    clat (usec): min=137, max=30867, avg=439.46, stdev=487.62
     lat (usec): min=138, max=30877, avg=447.87, stdev=488.75
    clat percentiles (usec):
     |  1.00th=[  167],  5.00th=[  196], 10.00th=[  221], 20.00th=[  253],
     | 30.00th=[  281], 40.00th=[  306], 50.00th=[  330], 60.00th=[  363],
     | 70.00th=[  412], 80.00th=[  515], 90.00th=[  734], 95.00th=[ 1004],
     | 99.00th=[ 1909], 99.50th=[ 2606], 99.90th=[ 5407], 99.95th=[ 7963],
     | 99.99th=[20055]
   bw (  KiB/s): min= 6027, max=12167, per=50.74%, avg=8410.81, stdev=1609.35, samples=31
   iops        : min= 1506, max= 3041, avg=2102.42, stdev=402.32, samples=31
  lat (usec)   : 250=18.79%, 500=60.23%, 750=11.42%, 1000=4.55%
  lat (msec)   : 2=4.09%, 4=0.73%, 10=0.17%, 20=0.01%, 50=0.01%
  cpu          : usr=6.06%, sys=22.69%, ctx=33198, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
job2: (groupid=0, jobs=1): err= 0: pid=17: Wed Jun 30 10:02:02 2021
  read: IOPS=2072, BW=8289KiB/s (8488kB/s)(128MiB/15813msec)
    clat (usec): min=133, max=60526, avg=444.96, stdev=599.91
     lat (usec): min=134, max=60530, avg=453.34, stdev=600.95
    clat percentiles (usec):
     |  1.00th=[  167],  5.00th=[  198], 10.00th=[  221], 20.00th=[  255],
     | 30.00th=[  281], 40.00th=[  306], 50.00th=[  334], 60.00th=[  367],
     | 70.00th=[  416], 80.00th=[  519], 90.00th=[  742], 95.00th=[ 1012],
     | 99.00th=[ 1893], 99.50th=[ 2638], 99.90th=[ 5669], 99.95th=[ 8455],
     | 99.99th=[20317]
   bw (  KiB/s): min= 6083, max=12000, per=50.31%, avg=8339.29, stdev=1634.54, samples=31
   iops        : min= 1520, max= 3000, avg=2084.48, stdev=408.59, samples=31
  lat (usec)   : 250=18.48%, 500=60.25%, 750=11.43%, 1000=4.68%
  lat (msec)   : 2=4.26%, 4=0.71%, 10=0.16%, 20=0.02%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=5.70%, sys=22.72%, ctx=33196, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=16.2MiB/s (16.0MB/s), 8289KiB/s-8383KiB/s (8488kB/s-8584kB/s), io=256MiB (268MB), run=15635-15813msec
/ # 
/ # 
/ # fio --name=random-writers --ioengine=libaio --iodepth=4 --rw=randwrite --bs=32k --direct=0 --size=64m --numjobs=4
random-writers: (g=0): rw=randwrite, bs=(R) 32.0KiB-32.0KiB, (W) 32.0KiB-32.0KiB, (T) 32.0KiB-32.0KiB, ioengine=libaio, iodepth=4
...
fio-3.20
Starting 4 processes
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)

random-writers: (groupid=0, jobs=1): err= 0: pid=20: Wed Jun 30 10:02:21 2021
  write: IOPS=8031, BW=251MiB/s (263MB/s)(64.0MiB/255msec); 0 zone resets
    slat (usec): min=36, max=238, avg=77.43, stdev=26.44
    clat (usec): min=18, max=1042, avg=385.44, stdev=112.34
     lat (usec): min=78, max=1186, avg=470.35, stdev=134.32
    clat percentiles (usec):
     |  1.00th=[  212],  5.00th=[  221], 10.00th=[  225], 20.00th=[  245],
     | 30.00th=[  338], 40.00th=[  371], 50.00th=[  412], 60.00th=[  424],
     | 70.00th=[  437], 80.00th=[  457], 90.00th=[  515], 95.00th=[  570],
     | 99.00th=[  660], 99.50th=[  717], 99.90th=[  906], 99.95th=[  988],
     | 99.99th=[ 1045]
  lat (usec)   : 20=0.05%, 250=20.46%, 500=67.24%, 750=11.91%, 1000=0.29%
  lat (msec)   : 2=0.05%
  cpu          : usr=27.56%, sys=70.47%, ctx=8, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=21: Wed Jun 30 10:02:21 2021
  write: IOPS=7876, BW=246MiB/s (258MB/s)(64.0MiB/260msec); 0 zone resets
    slat (usec): min=33, max=489, avg=77.95, stdev=32.08
    clat (usec): min=23, max=1769, avg=394.50, stdev=151.30
     lat (usec): min=92, max=1889, avg=480.46, stdev=177.47
    clat percentiles (usec):
     |  1.00th=[  204],  5.00th=[  210], 10.00th=[  215], 20.00th=[  227],
     | 30.00th=[  285], 40.00th=[  400], 50.00th=[  412], 60.00th=[  420],
     | 70.00th=[  433], 80.00th=[  486], 90.00th=[  570], 95.00th=[  635],
     | 99.00th=[  742], 99.50th=[  832], 99.90th=[ 1680], 99.95th=[ 1713],
     | 99.99th=[ 1778]
  lat (usec)   : 50=0.05%, 250=24.71%, 500=56.69%, 750=17.63%, 1000=0.49%
  lat (msec)   : 2=0.44%
  cpu          : usr=41.31%, sys=57.14%, ctx=9, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=22: Wed Jun 30 10:02:21 2021
  write: IOPS=7846, BW=245MiB/s (257MB/s)(64.0MiB/261msec); 0 zone resets
    slat (usec): min=32, max=637, avg=79.19, stdev=36.53
    clat (usec): min=17, max=2185, avg=394.94, stdev=151.92
     lat (usec): min=63, max=2368, avg=481.59, stdev=178.84
    clat percentiles (usec):
     |  1.00th=[  206],  5.00th=[  212], 10.00th=[  217], 20.00th=[  225],
     | 30.00th=[  293], 40.00th=[  408], 50.00th=[  416], 60.00th=[  424],
     | 70.00th=[  437], 80.00th=[  469], 90.00th=[  562], 95.00th=[  652],
     | 99.00th=[  783], 99.50th=[  832], 99.90th=[ 2057], 99.95th=[ 2114],
     | 99.99th=[ 2180]
  lat (usec)   : 20=0.05%, 100=0.05%, 250=27.05%, 500=57.91%, 750=13.53%
  lat (usec)   : 1000=1.27%
  lat (msec)   : 4=0.15%
  cpu          : usr=27.31%, sys=71.15%, ctx=25, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=23: Wed Jun 30 10:02:21 2021
  write: IOPS=8000, BW=250MiB/s (262MB/s)(64.0MiB/256msec); 0 zone resets
    slat (usec): min=32, max=505, avg=77.38, stdev=31.58
    clat (usec): min=17, max=975, avg=386.85, stdev=128.47
     lat (usec): min=70, max=1132, avg=471.73, stdev=155.17
    clat percentiles (usec):
     |  1.00th=[  204],  5.00th=[  212], 10.00th=[  217], 20.00th=[  225],
     | 30.00th=[  330], 40.00th=[  396], 50.00th=[  412], 60.00th=[  420],
     | 70.00th=[  429], 80.00th=[  461], 90.00th=[  553], 95.00th=[  619],
     | 99.00th=[  717], 99.50th=[  766], 99.90th=[  873], 99.95th=[  922],
     | 99.99th=[  979]
  lat (usec)   : 20=0.05%, 100=0.05%, 250=24.61%, 500=59.72%, 750=14.89%
  lat (usec)   : 1000=0.68%
  cpu          : usr=29.02%, sys=70.59%, ctx=7, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: bw=981MiB/s (1028MB/s), 245MiB/s-251MiB/s (257MB/s-263MB/s), io=256MiB (268MB), run=255-261msec

So you have to run it against the mounted directory:
fio --directory /data .....

/ # fio --directory /data --name=global --rw=randread --size=128m --name=job1 --name=job2
job1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
job2: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.20
Starting 2 processes
job1: Laying out IO file (1 file / 128MiB)
job2: Laying out IO file (1 file / 128MiB)
Jobs: 1 (f=1): [r(1),_(1)][100.0%][r=652KiB/s][r=163 IOPS][eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=15: Wed Jun 30 22:19:46 2021
  read: IOPS=64, BW=259KiB/s (266kB/s)(128MiB/505352msec)
    clat (usec): min=1536, max=424419, avg=15373.01, stdev=18983.77
     lat (usec): min=1540, max=424434, avg=15383.14, stdev=18984.11
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    5], 40.00th=[    5], 50.00th=[    6], 60.00th=[   10],
     | 70.00th=[   18], 80.00th=[   26], 90.00th=[   39], 95.00th=[   55],
     | 99.00th=[   85], 99.50th=[   96], 99.90th=[  157], 99.95th=[  197],
     | 99.99th=[  259]
   bw (  KiB/s): min=   32, max= 1016, per=49.85%, avg=258.23, stdev=153.63, samples=983
   iops        : min=    8, max=  254, avg=64.38, stdev=38.42, samples=983
  lat (msec)   : 2=0.27%, 4=22.05%, 10=38.07%, 20=13.59%, 50=19.86%
  lat (msec)   : 100=5.75%, 250=0.40%, 500=0.02%
  cpu          : usr=0.28%, sys=0.85%, ctx=33292, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
job2: (groupid=0, jobs=1): err= 0: pid=16: Wed Jun 30 22:19:46 2021
  read: IOPS=74, BW=300KiB/s (307kB/s)(128MiB/437554msec)
    clat (usec): min=1412, max=409529, avg=13305.23, stdev=17784.49
     lat (usec): min=1416, max=409542, avg=13315.49, stdev=17784.84
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    5], 40.00th=[    5], 50.00th=[    6], 60.00th=[    7],
     | 70.00th=[   12], 80.00th=[   22], 90.00th=[   36], 95.00th=[   52],
     | 99.00th=[   80], 99.50th=[   93], 99.90th=[  140], 99.95th=[  169],
     | 99.99th=[  268]
   bw (  KiB/s): min=   40, max= 1085, per=57.82%, avg=299.49, stdev=187.15, samples=873
   iops        : min=   10, max=  271, avg=74.71, stdev=46.79, samples=873
  lat (msec)   : 2=0.53%, 4=26.84%, 10=41.32%, 20=10.21%, 50=15.90%
  lat (msec)   : 100=4.86%, 250=0.32%, 500=0.02%
  cpu          : usr=0.32%, sys=0.95%, ctx=33219, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=519KiB/s (531kB/s), 259KiB/s-300KiB/s (266kB/s-307kB/s), io=256MiB (268MB), run=437554-505352msec

Disk stats (read/write):
  rbd3: ios=65563/448, merge=0/25, ticks=927008/10082, in_queue=935584, util=99.62%

I also ran all the tests from the source file fio_jobs.go. The results look good, like this last one:

/ # fio --directory /data --name=rand_readwrite --bs=4K --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --time_based --ramp_time=2s --runtime=15s
rand_readwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
fio-3.20
Starting 1 process
Jobs: 1 (f=1): [m(1)][0.0%][eta 30d:08h:10m:20s]                           
rand_readwrite: (groupid=0, jobs=1): err= 0: pid=62: Wed Jun 30 22:42:34 2021
  read: IOPS=0, BW=1264B/s (1264B/s)(24.0KiB/19430msec)
    clat (nsec): min=27778, max=59981, avg=44527.67, stdev=15340.15
     lat (nsec): min=35095, max=69363, avg=52516.33, stdev=15882.41
    clat percentiles (nsec):
     |  1.00th=[27776],  5.00th=[27776], 10.00th=[27776], 20.00th=[30848],
     | 30.00th=[30848], 40.00th=[33536], 50.00th=[33536], 60.00th=[57088],
     | 70.00th=[58112], 80.00th=[58112], 90.00th=[60160], 95.00th=[60160],
     | 99.00th=[60160], 99.50th=[60160], 99.90th=[60160], 99.95th=[60160],
     | 99.99th=[60160]
   bw (  KiB/s): min=    7, max=   24, per=100.00%, avg=15.67, stdev= 8.50, samples=3
   iops        : min=    1, max=    6, avg= 3.67, stdev= 2.52, samples=3
  write: IOPS=0, BW=632B/s (632B/s)(12.0KiB/19430msec); 0 zone resets
    clat (msec): min=1430, max=14528, avg=6475.82, stdev=7048.14
     lat (msec): min=1430, max=14528, avg=6475.82, stdev=7048.14
    clat percentiles (msec):
     |  1.00th=[ 1435],  5.00th=[ 1435], 10.00th=[ 1435], 20.00th=[ 1435],
     | 30.00th=[ 1435], 40.00th=[ 3473], 50.00th=[ 3473], 60.00th=[ 3473],
     | 70.00th=[14563], 80.00th=[14563], 90.00th=[14563], 95.00th=[14563],
     | 99.00th=[14563], 99.50th=[14563], 99.90th=[14563], 99.95th=[14563],
     | 99.99th=[14563]
   bw (  KiB/s): min=    7, max=    8, per=100.00%, avg= 7.50, stdev= 0.71, samples=2
   iops        : min=    1, max=    2, avg= 1.50, stdev= 0.71, samples=2
  lat (usec)   : 50=33.33%, 100=33.33%
  lat (msec)   : 2000=11.11%, >=2000=22.22%
  cpu          : usr=0.00%, sys=0.04%, ctx=106, majf=0, minf=58
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6,3,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=1264B/s (1264B/s), 1264B/s-1264B/s (1264B/s-1264B/s), io=24.0KiB (24.6kB), run=19430-19430msec
  WRITE: bw=632B/s (632B/s), 632B/s-632B/s (632B/s-632B/s), io=12.0KiB (12.3kB), run=19430-19430msec

Disk stats (read/write):
  rbd3: ios=0/2254, merge=0/26, ticks=0/52128, in_queue=52648, util=99.74%

Only this line looks strange, but I don't know if it matters:
Jobs: 1 (f=1): [m(1)][0.0%][eta 30d:08h:10m:20s]

Hmm, the last thing that's missing is the --output-format=json option. Otherwise this issue is very bizarre. The command that runs FIO is stdout, stderr, err = s.kubeExecutor.exec(namespace, podName, containerName, command), and stdout, stderr, and err are all coming up empty.
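
For context, a minimal sketch (assumed helper and package choices, not the exact kubestr implementation) of how such an exec call can capture stdout/stderr through client-go's remotecommand package; if both buffers stay empty while err is nil, the caller ends up passing an empty string to the JSON parser, which is exactly the symptom here:

package main

import (
	"bytes"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/remotecommand"
)

// execInPod runs a command in the given pod/container and returns whatever
// arrives on stdout and stderr.
func execInPod(cfg *rest.Config, cs kubernetes.Interface, namespace, pod, container string, command []string) (string, string, error) {
	req := cs.CoreV1().RESTClient().Post().
		Resource("pods").
		Name(pod).
		Namespace(namespace).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Container: container,
			Command:   command,
			Stdout:    true,
			Stderr:    true,
		}, scheme.ParameterCodec)

	executor, err := remotecommand.NewSPDYExecutor(cfg, "POST", req.URL())
	if err != nil {
		return "", "", err
	}
	var stdout, stderr bytes.Buffer
	err = executor.Stream(remotecommand.StreamOptions{Stdout: &stdout, Stderr: &stderr})
	return stdout.String(), stderr.String(), err
}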
Have you tried this on other clusters or other storage classes? I'm unable to reproduce this.

I tried on clusters created with kubeadm (k8s v1.15) and Rancher (k8s v1.15-1.21).
I tried rook-ceph-block and Portworx.
This cluster is temporary; I am constantly redeploying it for tests.
Everything looks good with the AWS gp2 storage class.

Which is why this is perplexing. From within the pod it should be the same image with a volume attached at /data.

Earlier you offered a call. How can we arrange one?

@bathina2 & @vkamlov -- Interestingly, I've run into the same problem with v0.4.18 using ceph-block storage with Rook. The FIO tests run and results display successfully when I use Kubestr v0.4.17.

I am running kubestr from a CentOS 7 VM against a Kubernetes v1.21.3 cluster. I have Rook v1.7 deployed with block storage.

Alright, I spoke too soon. I started changing some parameters and ran into issues once I adjusted the runtime parameter on a seq-read fio test.

The FIO test runs without issues with this setting:

[global]
name=fio-seq-reads
filename=fio-seq-reads
rw=read
bs=256K
direct=1
numjobs=1
time_based
runtime=120

[file1]
size=10G
ioengine=libaio
iodepth=16

However, the following causes the error condition when the job finishes:

[global]
name=fio-seq-reads
filename=fio-seq-reads
rw=read
bs=256K
direct=1
numjobs=1
time_based
runtime=240

[file1]
size=10G
ioengine=libaio
iodepth=16

Result:

Running FIO test (fio-seq-read.fio) on StorageClass (ceph-block) with a PVC of Size (100Gi)
Elapsed time- 5m0.034013586s
FIO test results:
  Failed while running FIO test.: Unable to parse fio output into json.: unexpected end of JSON input  -  Error