Paper Link: Optimizing non-decomposable measures with deep networks
To run DUPLE, DAME, DENIM, and Struct-ANN, go into the `deep_non_decomp_src` folder to see the code.
I apologize in advance for this code being inconsistent in several ways; I have edited it over a long period of time with significant breaks in between, which I blame for the inconsistency.
The following addresses are relative to the `deep_non_decomp_src` folder. All the data is in the `datasets` folder and is read through the wrapper in `datasets/dataRead.py`.
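For orientation, a minimal sketch of what a reader like the `datasets/dataRead.py` wrapper might look like. The function name, file layout, and `.npz` keys below are assumptions for illustration, not the wrapper's actual interface:

```python
import numpy as np

# Hypothetical stand-in for the dataRead wrapper; the real function
# name and archive keys in datasets/dataRead.py may differ.
def load_dataset(name, data_dir="datasets"):
    """Return (features, labels) for a dataset stored as an .npz archive."""
    data = np.load(f"{data_dir}/{name}.npz")
    return data["X"], data["y"]
```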
- Ensure that the variable `dual_class` in Line 15 is set to one of the classes in `DeeSpade.dual_step`.
- Ensure that the variable `model` in Line 22 is set to `Spade`.
- Then run `python train_batch_opt.py [dataset]`.
- The score is accumulated in Lines 72 and 73.
- Use Lines 96 and 97 to save it to file.
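The save step amounts to dumping the accumulated per-epoch scores to disk as an `.npz` archive. A minimal sketch; the array and file names here are illustrative, not the ones used in `train_batch_opt.py`:

```python
import numpy as np

# Illustrative only: scores accumulated across epochs (cf. the
# accumulation lines in train_batch_opt.py).
prec_scores = [0.71, 0.74, 0.78]
rec_scores = [0.65, 0.69, 0.70]

# Persist the score trajectories for later plotting (cf. the save lines).
np.savez("mydataset_spade_scores.npz",
         prec=np.array(prec_scores),
         rec=np.array(rec_scores))
```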
- The variable `dual_class` is inconsequential.
- Ensure that the variable `model` in Line 22 is set to `BenchANN`.
- Then run `python train_batch_opt.py [dataset]`.
- All the scores are accumulated in `minC` in Line 71.
- Save them through Line 98.
- You will have to make a trivial change in the BenchANN file to get rid of the p-sensitive cost function and obtain the true cost: comment Line 40 in `DeeSpade/bench.py` and uncomment Line 42.
- The variable `dual_class` is inconsequential.
- Ensure that the variable `model` in Line 22 is set to `BenchANN`.
- Then run `python train_batch_opt.py [dataset]`.
- All the scores are accumulated in `minC` in Line 71.
- Comment Lines 72 and 73.
- Extract the different scores from Lines 76 to 80.
- Save them through Lines 99 to 102.
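The saved `.npz` score files can be inspected afterwards with NumPy. The key name below (`minC`) matches the accumulating variable, but the keys actually stored depend on how the save lines were written, so listing `scores.files` first is the safe move:

```python
import numpy as np

# Create a toy score file so this snippet is self-contained;
# in practice this is the archive written by train_batch_opt.py.
np.savez("toy_scores.npz", minC=np.array([0.42, 0.39, 0.37]))

scores = np.load("toy_scores.npz")
print(scores.files)        # list the arrays stored in the archive
print(scores["minC"][-1])  # e.g. the final-epoch score
```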
- The F-beta score is the only score we see here. The code for that is in `DAMP.ANNAMP/FbetaANN`.
- Run `python ANNAMPTrain.py [dataset]`.
- The scores are stored in `[dataset]ANNAMAP_FMeas_new.npz`.
- The code is in `DAMP.AMP.FbetaThresh`.
- Run `python AMPTrain.py [dataset]`.
- The scores are stored in `[dataset]AMP_PG.npz`.
- Here we only look at NegKLD. The code is in `DAMP.AMP.FbetaThresh` and the primal and dual steps are in `demesis.concave_fn.KLD`.
- Run `python train_denembis_kld.py [dataset]`.
- The score is stored in `[dataset]_kld_rew.npz`.
- Some files also calculate BAKLD, but those values can be ignored.
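For reference, the KLD quantification measure compares the true class prevalences against the estimated ones. A minimal pure-Python version; the `eps` smoothing constant is an assumption, and the repo's KLD implementation may handle smoothing differently:

```python
import math

def kld(p_true, p_hat, eps=1e-8):
    """KL divergence sum_i p_i * log(p_i / q_i) between two discrete
    distributions (here: true vs. estimated class prevalences).
    eps guards against zero estimates."""
    return sum(p * math.log(p / max(q, eps))
               for p, q in zip(p_true, p_hat) if p > 0)

# Identical distributions give 0; NegKLD is simply -kld(...).
print(kld([0.3, 0.7], [0.3, 0.7]))  # 0.0
```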
- The MVC code is present in `all_struct/c_code/mvc.c` and the shared library is already compiled in the folder as `libmvc.so`.
- This is then used by the network definition and training algorithm in `all_struct/struct_ann.py`; the final training wrapper is `train_batch_struct.py`.
- Ignoring the details, to train run the command `python train_batch_struct.py [dataset] [loss_fn]`, where the `[dataset]` variable is as usual and the `[loss_fn]` variable is defined in `all_struct/loss_functions.py`. We only use `minTPRTNR` and `fone` among those.
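For orientation, the two measures used here correspond to min(TPR, TNR) and the F1 score. A sketch of the raw measures computed from confusion-matrix counts; the entries in `all_struct/loss_functions.py` are structured surrogate losses built around these, not these plain functions:

```python
def min_tpr_tnr(tp, fp, tn, fn):
    """Worst of true-positive rate and true-negative rate."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return min(tpr, tnr)

def fone(tp, fp, tn, fn):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)

print(min_tpr_tnr(tp=40, fp=10, tn=30, fn=20))  # min(40/60, 30/40)
print(fone(tp=40, fp=10, tn=30, fn=20))         # 80/110
```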
- Run the necessary training files to obtain the score files.
- Then run the corresponding plot file, i.e. one of `plot_[Fmeas, KLD, MinTPRTNR, QMean].py [x_axis_length]`.
The following addresses are relative to the `seq2seq-attn` folder.

```
th train1.lua -data_file data/twit/twit-train.hdf5 -val_data_file data/twit/twit-val.hdf5 -savefile twit-model
th evaluate1.lua -model twit-model_final.t7 -src_file data/twit/src-val.txt -output_file pred.txt -src_dict data/twit/twit.src.dict -targ_dict data/twit/twit.targ.dict
```
If you use this code, please cite the paper:

```
@Article{Sanyal2018,
author="Sanyal, Amartya
and Kumar, Pawan
and Kar, Purushottam
and Chawla, Sanjay
and Sebastiani, Fabrizio",
title="Optimizing non-decomposable measures with deep networks",
journal="Machine Learning",
year="2018",
month="Sep",
day="01",
volume="107",
number="8",
pages="1597--1620",
doi="10.1007/s10994-018-5736-y",
}
```