powturbo/TurboPFor-Integer-Compression

Benchmark: Lossless/Lossy Floating Point Compression. TurboPFor vs zfp & blosc

powturbo opened this issue · 0 comments

  • 2D/3D datasets:
    +Float Compression dataset : description
    +Datasets for Benchmarking Floating-Point Compressors

  • Lossless floating point compression:
    icapp sq1024x1024x4.f32 -R0 -Ff -I15 -J15 -e105,143,80,102 -Ezstd,15
    option -R0 = determine the dimensions automatically from the file name

  • Lossy compression with pointwise relative error bound:
    icapp sq1024x1024x4.f32 -R0 -Ff -I15 -J15 -e105 -g.001 -Ezstd,15

  • Lossy compression with zfp:
    icapp sq1024x1024x4.f32 -R0 -Ff -I15 -J15 -e142 -z.001
    option -z: for compressors with a built-in lossy error bound (e.g. zfp)
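
The file names above encode the array shape (for example `sq1024x1024x4.f32`), which is what `-R0` relies on. The following is only a sketch of that idea in Python, not icapp's actual parser: the regular expression, the helper names `dims_from_name` and `ratio_percent`, and the assumption of 4 bytes per element are all illustrative. It also shows how the `ratio` column in the tables relates to the `size` column.

```python
import re
from math import prod

def dims_from_name(filename):
    """Extract dimensions such as (1024, 1024, 4) from 'sq1024x1024x4.f32'.

    Hypothetical helper: looks for a run of integers separated by 'x'
    directly before the file extension. Returns None if there is none.
    """
    m = re.search(r"(\d+(?:x\d+)+)\.[^.]+$", filename)
    if m is None:
        return None
    return tuple(int(d) for d in m.group(1).split("x"))

def ratio_percent(compressed_size, dims, elem_bytes=4):
    """Compressed size as a percentage of the raw array size in bytes."""
    raw_size = prod(dims) * elem_bytes
    return 100.0 * compressed_size / raw_size

dims = dims_from_name("sq1024x1024x4.f32")
# 2327663 is the compressed size reported for function 105 on this file.
print(dims, round(ratio_percent(2327663, dims), 2))  # (1024, 1024, 4) 13.87
```

Files without dimensions in the name, such as `msg_sppm.f32`, yield `None` in this sketch.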

E MB/s     size   ratio   D MB/s  function floating point size=32 bits (lz=zstd,15) unsorted -1
   60    2327663  13.87%     553  105:LztpzD3byte      3D TransposeZ+zstd,15    sq1024x1024x4.f32
  230   16014760  95.46%     237  143:zfp              zfp 3D                   sq1024x1024x4.f32 
   58    2213925  13.20%     585  105:LztpzD3byte      3D TransposeZ+zstd,15    sq1024x1024x4.f32 -g.001
  225   14089880  83.98%     258  142:zfp              zfp 3D                   sq1024x1024x4.f32 -z.001
 
   16   14290026  21.29%     230  105:LztpzD3byte      3D TransposeZ+zstd,15    sq2048x2048x4.f32
  209   68256528 101.71%     209  142:zfp              zfp 3D                   sq2048x2048x4.f32
   18    7277397  10.84%     225  104:LztpxD3byte      3D TransposeX+zstd,15    sq2048x2048x4.f32 -g.001
   14    7550698  11.25%     222  105:LztpzD3byte      3D TransposeZ+zstd,15    sq2048x2048x4.f32 -g.001
  246   47138168  70.24%     293  142:zfp              zfp 3D                   sq2048x2048x4.f32 -z.001 
  
   13     672884  18.08%    2159   80:Lz               zstd,15                  float232630x4.f32
   10    2100247  56.43%    1037  102:LztpzD2byte      2D TransposeZ+zstd,15    float232630x4.f32
  214    3774768 101.42%     151  141:zfp              zfp 2D                   float232630x4.f32
   10     414119  11.13%    2506   80:Lz               zstd,15                  float232630x4.f32 -g.001
    8    1287500  34.59%    1245  102:LztpzD2byte      2D TransposeZ+zstd,15    float232630x4.f32 -g.001
  377    1686848  45.32%     303  141:zfp              zfp 2D                   float232630x4.f32 -z.001
  
   12    5821713  50.90%    1358  102:LztpzD2byte      2D TransposeZ+zstd,15    float953134x3.f32
  124   14742232 128.89%     127  141:zfp              zfp 2D                   float953134x3.f32
   10    4517811  39.50%    1205  102:LztpzD2byte      2D TransposeZ+zstd,15    float953134x3.f32 -g.001
  195    9544168  83.45%     119  140:zfp              zfp                      float953134x3.f32 -z.001
  223   12528088 109.53%     150  141:zfp              zfp 2D                   float953134x3.f32 -z.001

  331    5041816   1.88%    5087  90:lztprle           Transpose +rle+zstd,15   astro_mhd128x512x1024.f32
   90    5855745   2.18%     182  103:LztpD3byte       3D Transpose +zstd,15    astro_mhd128x512x1024.f32
  557   37367816  13.92%     829  141:zfp              zfp 2D                   astro_mhd128x512x1024.f32
  469   81566104  30.39%     598  142:zfp              zfp 3D                   astro_mhd128x512x1024.f32   
  379    2550304   0.95%    5217   90:lztprle          Transpose +rle+zstd,15   astro_mhd128x512x1024.f32 -g.001
   98    3011165   1.12%     154  103:LztpD3byte       3D Transpose +zstd,15    astro_mhd128x512x1024.f32 -g.001
  424   35967416  13.40%     810  141:zfp              zfp 2D                   astro_mhd128x512x1024.f32 -z.001
  320   53603088  19.97%     717  142:zfp              zfp 3D                   astro_mhd128x512x1024.f32 -z.001

   17  137024507  40.84%    2453   86:Lztp4z Nibble    TransposeZ +zstd,15      astro_pt512x256x640.f32
   14  140254161  41.80%     182  105:LztpzD3byte      3D TransposeZ+zstd,15    astro_pt512x256x640.f32
    9   34393724  10.25%     181  105:LztpzD3byte      3D TransposeZ+zstd,15    astro_pt512x256x640.f32 -g.001
   16   40208355  11.98%    2487   86:Lztp4z Nibble    TransposeZ +zstd,15      astro_pt512x256x640.f32 -g.001
  317  148181952  44.16%     405  142:zfp              zfp 3D                   astro_pt512x256x640.f32 -z.001
  211  322757072  96.19%     224  142:zfp              zfp 3D                   astro_pt512x256x640.f32 -z.001

   19   36640070  38.86%     248  102:LztpzD2byte      2D TransposeZ+zstd,15    rsim2048x11509.f32
  284   71839424  76.20%     214  141:zfp              zfp 2D                   rsim2048x11509.f32
   10    8529441   9.05%     243  102:LztpzD2byte      2D TransposeZ+zstd,15    rsim2048x11509.f32 -g.001
  945    8888536   9.43%    1074  141:zfp              zfp 2D                   rsim2048x11509.f32 -z.001

   11  141026747  26.27%     213 105:LztpzD3byte       3D TransposeZ+zstd,15    wave512x512x512.f32
  236  167737848  31.24%     476 142:zfp               zfp 3D                   wave512x512x512.f32
    1   12370337   2.30%     113 105:LztpzD3byte       3D TransposeZ+zstd,22    wave512x512x512.f32 -g.001
  578   13156920   2.45%    1717 142:zfp               zfp 3D                   wave512x512x512.f32 -z.001
   13   28570801   5.32%     224 105:LztpzD3byte       3D TransposeZ+zstd,15    wave512x512x512.f32 -g.001

TurboPFor floating-point compression reaches 31.6% of the original size vs. 49.3% for blosc2 using byte transpose/shuffle + delta.
Blosc with the new bytedelta filter reaches 35.7%, which is still worse. Additionally, TurboPFor decompression is ~35% faster.
Using lz4 instead of zstd, TurboPFor reaches 36.1%, which is a similar compression ratio to the new blosc bytedelta, but it is a lot faster than blosc2 with zstd,15.
Lossy compression with a pointwise relative error bound of 0.001 brings this down to 9.7%; Blosc has nothing similar.
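
To illustrate why a byte transpose (shuffle) in front of a general-purpose compressor helps on floating-point data, here is a minimal Python sketch. It is not TurboPFor's TransposeZ or Blosc's filter: it performs a plain byte transpose only, the helper names are hypothetical, the input is synthetic, and `zlib` from the standard library stands in for zstd/lz4.

```python
import math
import struct
import zlib

def byte_transpose(buf, elem_size=4):
    """Group byte 0 of every element, then byte 1, and so on."""
    assert len(buf) % elem_size == 0
    return b"".join(buf[i::elem_size] for i in range(elem_size))

def byte_untranspose(buf, elem_size=4):
    """Inverse of byte_transpose."""
    assert len(buf) % elem_size == 0
    n = len(buf) // elem_size
    out = bytearray(len(buf))
    for i in range(elem_size):
        out[i::elem_size] = buf[i * n:(i + 1) * n]
    return bytes(out)

# Synthetic smooth signal stored as little-endian float32.
values = [math.sin(i * 0.001) for i in range(100_000)]
raw = struct.pack(f"<{len(values)}f", *values)

plain = zlib.compress(raw, 9)
shuffled = zlib.compress(byte_transpose(raw), 9)

# The transform is lossless: the original bytes are recovered exactly.
assert byte_untranspose(byte_transpose(raw)) == raw
print(len(raw), len(plain), len(shuffled))
```

On smooth data the sign/exponent bytes end up in long, repetitive runs after the transpose, so the transposed stream usually compresses noticeably better than the raw one; the exact sizes depend on the data and the compressor.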

icapp *.f32 -I16 -J16 -Ezstd,15 -e83 -b4m
E MB/s     size   ratio   D MB/s  function integer size=32 bits (lz=zstd,15) 
  113    7071465   2.63%    7012  83:Lztpz  Byte   TransposeZ +zstd,15    astro_mhd128x512x1024.f32
   23  146977334  43.80%    3510  83:Lztpz  Byte   TransposeZ +zstd,15    astro_pt512x256x640.f32
   14    1636009  43.95%    1401  83:Lztpz  Byte   TransposeZ +zstd,15    float232630x4.f32
   20    8037995  70.28%    2989  83:Lztpz  Byte   TransposeZ +zstd,15    float953134x3.f32
   70   12834243   9.20%    5102  83:Lztpz  Byte   TransposeZ +zstd,15    msg_sppm.f32
   26    6321188  10.06%    6057  83:Lztpz  Byte   TransposeZ +zstd,15    msg_sweep3d.f32
   30   47691738  50.58%    3379  83:Lztpz  Byte   TransposeZ +zstd,15    rsim2048x11509.f32
   14    3780571  22.53%    2273  83:Lztpz  Byte   TransposeZ +zstd,15    sq1024x1024x4.f32
   15   23721299  35.35%    2280  83:Lztpz  Byte   TransposeZ +zstd,15    sq2048x2048x4.f32
   24  228029425  42.47%    3498  83:Lztpz  Byte   TransposeZ +zstd,15    wave512x512x512.f32
Total  486,101,266  31.6%

icapp *.f32 -I16 -J16 -Elz4,9 -b4m -e83
   67    8705415   3.24%    8265  83:Lztpz  Byte  TransposeZ +lz4,9       astro_mhd128x512x1024.f32
   42  164889677  49.14%    5321  83:Lztpz  Byte  TransposeZ +lz4,9       astro_pt512x256x640.f32
   28    1908166  51.27%    1856  83:Lztpz  Byte  TransposeZ +lz4,9       float232630x4.f32
   51    8428180  73.69%    2856  83:Lztpz  Byte  TransposeZ +lz4,9       float953134x3.f32
  195   15746064  11.29%    5813  83:Lztpz  Byte  TransposeZ +lz4,9       msg_sppm.f32
   66   19587682  31.16%    6285  83:Lztpz  Byte  TransposeZ +lz4,9       msg_sweep3d.f32
   83   50008850  53.04%    5955  83:Lztpz  Byte  TransposeZ +lz4,9       rsim2048x11509.f32
   23    4651669  27.73%    4283  83:Lztpz  Byte  TransposeZ +lz4,9       sq1024x1024x4.f32
   36   25760593  38.39%    4096  83:Lztpz  Byte  TransposeZ +lz4,9       sq2048x2048x4.f32
   50  254984059  47.49%    5137  83:Lztpz  Byte  TransposeZ +lz4,9       wave512x512x512.f32
Total  554,670,354  36.10%

icapp *.f32 -I16 -J16 -Ezstd,15 -e147
  266   18371070   6.84%    5689 147:blosc  shuffle delta+zstd,15         astro_mhd128x512x1024.f32
   92  231861071  69.10%    3860 147:blosc  shuffle delta+zstd,15         astro_pt512x256x640.f32
   34    2475736  66.51%     950 147:blosc  shuffle delta+zstd,15         float232630x4.f32
   51    8531654  74.59%    2854 147:blosc  shuffle delta+zstd,15         float953134x3.f32
  295   14362282  10.30%    4440 147:blosc  shuffle delta+zstd,15         msg_sppm.f32
   60   33739276  53.67%    4097 147:blosc  shuffle delta+zstd,15         msg_sweep3d.f32
   81   50291192  53.34%    4443 147:blosc  shuffle delta+zstd,15         rsim2048x11509.f32
   24    3278693  19.54%    1826 147:blosc  shuffle delta+zstd,15         sq1024x1024x4.f32
   71   45029315  67.10%    1418 147:blosc  shuffle delta+zstd,15         sq2048x2048x4.f32
   69  349817736  65.16%    3976 147:blosc  shuffle delta+zstd,15         wave512x512x512.f32
Total  757,758,024  49.32%

icapp -e148 \x\c\fp\*.f32 -I16 -J16 -Ezstd,15
  390    7221944   2.69%    5262 148:blosc  shuffle+bytedelta+zstd,15     astro_mhd128x512x1024.f32
   35  167134975  49.81%    2515 148:blosc  shuffle+bytedelta+zstd,15     astro_pt512x256x640.f32
   26    1688155  45.36%    1165 148:blosc  shuffle+bytedelta+zstd,15     float232630x4.f32
   54    7720069  67.50%    2424 148:blosc  shuffle+bytedelta+zstd,15     float953134x3.f32
  276   14293184  10.25%    4165 148:blosc  shuffle+bytedelta+zstd,15     msg_sppm.f32
   51   16267728  25.88%    3972 148:blosc  shuffle+bytedelta+zstd,15     msg_sweep3d.f32
   70   48424868  51.36%    2701 148:blosc  shuffle+bytedelta+zstd,15     rsim2048x11509.f32
   26    4018441  23.95%    1964 148:blosc  shuffle+bytedelta+zstd,15     sq1024x1024x4.f32
   25   22167842  33.03%    2178 148:blosc  shuffle+bytedelta+zstd,15     sq2048x2048x4.f32
   38  259533898  48.34%    2859 148:blosc  shuffle+bytedelta+zstd,15     wave512x512x512.f32
Total  548,471,103  35.70%

Lossy compression with pointwise relative error PWE = 0.001

icapp *.f32 -Ezstd,15 -I0 -J15 -e83 -b4m -Ff -v0 -g.001
  113    4500243   1.68%    6406  83:Lztpz  Byte      TransposeZ +zstd,15  astro_mhd128x512x1024.f32
   18   38733567  11.54%    2645  83:Lztpz  Byte      TransposeZ +zstd,15  astro_pt512x256x640.f32
    9     988371  26.55%    1516  83:Lztpz  Byte      TransposeZ +zstd,15  float232630x4.f32
   17    6663777  58.26%    1853  83:Lztpz  Byte      TransposeZ +zstd,15  float953134x3.f32
   50    7769060   5.57%    4192  83:Lztpz  Byte      TransposeZ +zstd,15  msg_sppm.f32
   17    1693751   2.69%    5780  83:Lztpz  Byte      TransposeZ +zstd,15  msg_sweep3d.f32
   22   20582797  21.83%    2124  83:Lztpz  Byte      TransposeZ +zstd,15  rsim2048x11509.f32
   13    3474495  20.71%    2158  83:Lztpz  Byte      TransposeZ +zstd,15  sq1024x1024x4.f32
   16   15005005  22.36%    2022  83:Lztpz  Byte      TransposeZ +zstd,15  sq2048x2048x4.f32
   16   50267079   9.36%    2777  83:Lztpz  Byte      TransposeZ +zstd,15  wave512x512x512.f32
Total  149,678,144   9.74%
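
A pointwise relative error bound of 0.001 means that every reconstructed value x' satisfies |x - x'| / |x| <= 0.001. The Python sketch below shows one simple way to meet such a bound for float32 and how to verify it. It is an assumption for illustration, not the quantizer used by `-g.001`: it zeroes the 13 lowest mantissa bits, which keeps the relative error below 2^-10 (about 0.00098) for normal, finite values. The function names are hypothetical.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def truncate_mantissa(x, drop_bits=13):
    """Zero the lowest `drop_bits` mantissa bits of a float32 value.

    With 23 mantissa bits, dropping 13 gives a relative error below
    2**-10 for normal finite values (subnormals, NaN and infinity are
    not handled by this sketch).
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= 0xFFFFFFFF & ~((1 << drop_bits) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def max_pointwise_rel_error(orig, approx):
    """Largest |a - b| / |a| over all nonzero original values."""
    worst = 0.0
    for a, b in zip(orig, approx):
        if a != 0.0:
            worst = max(worst, abs(a - b) / abs(a))
    return worst

orig = [to_f32(v) for v in (1.0, 3.14159, -2.71828, 1e-3, 123456.789, 0.0)]
approx = [truncate_mantissa(v) for v in orig]
assert max_pointwise_rel_error(orig, approx) <= 0.001
```

The zeroed low bits make the low-order byte planes far more repetitive, which is why a lossy step like this combines well with a byte transpose and an LZ-style compressor.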