tbarbette/npf

Crash in the ZLT experimental design

Closed this issue · 3 comments

While performing a zero-loss throughput (ZLT) search, NPF crashes when it has no values left to try. This happens either because the minimum value still causes loss, or because the next target value computed during the search falls outside the range provided by the user.

As an example, take a rate defined as RATE=[1-100#1], an experimental design defined with --exp-design "zlt(RATE,RX-GOODPUT-GBPS-PKTGEN)", and the following results for a specific test case:

NDESC:256,NTHREADS:4,RATE:100,SIZE:1024,TABLE:0={RX-RATE-PPS:6970137.0,6966369.0,7057956.0},{RX-RATE-MBPS:58214.1460744,58182.6912224,58947.6194112},{LOSE-RATE:0.40676482083724,0.40717707911177,0.3993470714069},{AVG-LAT:0.0006643464506172799,0.0006664189814814801,0.0006661439043209901},{RX-GOODPUT-GBPS-PKTGEN:59.0,59.0,60.0}
NDESC:256,NTHREADS:4,RATE:59,SIZE:1024,TABLE:0={RX-RATE-PPS:6836607.0,6863517.0,6870717.0},{RX-RATE-MBPS:57098.6279176,57323.3810496,57383.4996384},{LOSE-RATE:0.010669608963823,0.0069656616030801,0.0058126685257428},{AVG-LAT:0.00011578858024691,0.00014424035493827,0.00012892361111111},{RX-GOODPUT-GBPS-PKTGEN:59.0,58.0,58.0}
NDESC:256,NTHREADS:4,RATE:57,SIZE:1024,TABLE:0={RX-RATE-PPS:6615077.0,6455461.0,6664933.0},{RX-RATE-MBPS:55248.3990632,53915.3320736,55664.7847008},{LOSE-RATE:0.0085888474008111,0.032754681330904,0.0014417885279341},{AVG-LAT:8.163117283950599e-05,0.00015808757716049,5.8790509259258994e-05},{RX-GOODPUT-GBPS-PKTGEN:57.0,55.0,57.0}
NDESC:256,NTHREADS:4,RATE:55,SIZE:1024,TABLE:0={RX-RATE-MBPS:7.76e-05,52443.03136,53690.6564448},{LOSE-RATE:0.999999984462,0.024391036299443,0.0014013166576273},{AVG-LAT:0.0,0.00016096643518518998,5.4939814814815e-05},{RX-GOODPUT-GBPS-PKTGEN:0.0,53.0,55.0},{RX-RATE-PPS:0,6279182.0,6428567.0}
NDESC:256,NTHREADS:4,RATE:16,SIZE:1024,TABLE:0={RX-RATE-PPS:1867116.0,0,1865846.0},{RX-RATE-MBPS:15593.3998472,0,15582.8079272},{LOSE-RATE:0.0012343911120845,1.0,0.0023091008216815},{AVG-LAT:4.462808641975299e-05,0.0,4.6456404320988e-05},{RX-GOODPUT-GBPS-PKTGEN:15.0,0.0,15.0}
NDESC:256,NTHREADS:4,RATE:3,SIZE:1024,TABLE:0={RX-RATE-PPS:350057.0,350135.0,350179.0},{RX-RATE-MBPS:2922.960672,2923.612128,2923.9650536},{LOSE-RATE:0.0015578834120213,0.0013718684820204,0.001360875180768},{AVG-LAT:0.0025012793209877,4.5064429012345994e-05,4.6386574074073996e-05},{RX-GOODPUT-GBPS-PKTGEN:2.0,2.0,2.0}

NPF crashes with the following traceback:

Traceback (most recent call last):
  File "(removed path to npf)/npf/npf-compare.py", line 72, in <module>
    main()
  File "(removed path to npf)/npf/npf-compare.py", line 60, in main
    series, time_series = comparator.run(test_name=args.test_files,
  File "(removed path to npf)/npf/npf/test_driver.py", line 30, in run
    build, data_dataset, time_dataset = regressor.regress_all_tests(
  File "(removed path to npf)/npf/npf/regression.py", line 198, in regress_all_tests
    all_results,time_results, init_done = test.execute_all(
  File "(removed path to npf)/npf/npf/test.py", line 1198, in execute_all
    for root_variables in all_variables:
  File "(removed path to npf)/npf/npf/expdesign/zltexp.py", line 102, in __next__
    next_val = max(filter(lambda x : x < target,left_to_try))
ValueError: max() arg is an empty sequence

I bet this is because RATE=3 leads to a GOODPUT of 2, i.e. a drop of 1, so the search looks for a rate below 2-1=1, which does not exist because 1 is the lowest possible rate.
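
The failure described above can be reproduced in isolation. The sketch below is not the NPF code and not the fix that was pushed. It only replays the expression from the traceback with a target of 1, assuming RATE=[1-100#1] expands to the integers 1 to 100, and shows one possible guard. The helper name next_lower_candidate is hypothetical.

```python
def next_lower_candidate(left_to_try, target):
    """Return the largest untried value strictly below target.

    Returns None when no such value exists, instead of raising ValueError.
    Hypothetical helper for illustration, not part of NPF.
    """
    return max((x for x in left_to_try if x < target), default=None)


# Assumption: RATE=[1-100#1] expands to the integers 1..100.
left_to_try = list(range(1, 101))

# Same expression as in zltexp.py line 102, with a target of 1:
# nothing in the range is below 1, so max() receives an empty iterable.
try:
    max(filter(lambda x: x < 1, left_to_try))
    crashed = False
except ValueError:
    crashed = True

print(crashed)
print(next_lower_candidate(left_to_try, 1))
print(next_lower_candidate(left_to_try, 59))
```

With a guard like this, the caller can treat None as "the search is exhausted" and end the iteration for that test case rather than crash.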

I have a fix; I'll make a pull request when I have time.

I'll push a fix, but the actual problem here is that your system is always dropping.