
Heuristics for the Star Discrepancy Subset Selection Problem



This code is adapted from https://github.com/frclement/SDSSP_Heuristics for the specific needs of (insert paper name).


Requirements:

  1. Python 3
  2. gcc and associated libraries, as well as common command-line utilities (included in most Linux distributions)
  3. The parallel tool on Linux (only needed if you want to run in parallel)
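
Before starting, it can help to confirm that the tools listed above are on your PATH. The following is a minimal sketch, not part of the repository: the function name missing_tools is made up for illustration, and it assumes the parallel tool is installed under the command name parallel.

```shell
#!/usr/bin/env bash
# Report which of the given commands are missing from PATH.
# Prints one missing name per line; returns 0 only if nothing is missing.
# (missing_tools is a hypothetical helper, not a script from this repository.)
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Required tools. Add "parallel" to the list only if you plan to run in
# parallel (assumption: the tool is installed under that command name).
if missing=$(missing_tools python3 gcc); then
    echo "all required tools found"
else
    echo "missing: $missing" >&2
fi
```

The check only looks up command names; it does not verify versions or that the math library needed by -lm is installed.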

Steps to reproduce:

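  1. Compile shift_v2nobrute.c. Because no -o option is given, gcc writes the executable to a.out, which the later commands refer to as ../a.out.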
gcc shift_v2nobrute.c -lm -O3
  1. Create a working directory. The scripts write many files into it and overwrite existing ones without warning.
mkdir run-data
  1. Change into that directory. WARNING: the scripts overwrite files in the current working directory without warning.
cd run-data
  1. Put your df_crit.csv in the directory and run, for example, the commands below. See each script's help for usage.
python3 ../extract_csv.py df_crit.csv df_crit.txt
../run.sh -j 10 ../a.out df_crit.txt 3 2188 30,60,80,90,100,110,120
../format.sh df_crit.csv 30,60,80,90,100,110,120
../split.sh -j 10 ../a.out 3 30,60,80,90,100,110,120
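
The four commands above repeat the same list of k values and the same job count. As a rough sketch (not part of the repository), the wrapper below takes both once and can preview the commands before anything is run. The names run_step, run_pipeline and DRY_RUN are hypothetical; the literals 3 and 2188 are copied verbatim from the example, so check each script's help for their meaning before changing them.

```shell
#!/usr/bin/env bash
# Echo a command, then run it unless DRY_RUN=1 is set.
# (run_step, run_pipeline and DRY_RUN are hypothetical names for this sketch.)
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != 1 ]; then
        "$@"
    fi
}

# Run the four example commands with the k list and job count given once.
# Stops at the first command that fails.
# The literals 3 and 2188 are copied verbatim from the example above.
run_pipeline() {
    local ks=$1 jobs=$2
    run_step python3 ../extract_csv.py df_crit.csv df_crit.txt &&
    run_step ../run.sh -j "$jobs" ../a.out df_crit.txt 3 2188 "$ks" &&
    run_step ../format.sh df_crit.csv "$ks" &&
    run_step ../split.sh -j "$jobs" ../a.out 3 "$ks"
}

# Preview the commands without executing anything:
DRY_RUN=1 run_pipeline 30,60,80,90,100,110,120 10
```

Dropping DRY_RUN=1 runs the commands for real, so the overwrite warnings from the earlier steps still apply.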

To split the other way (i.e. choose k points from the n-k points remaining after the first iteration), run the following instead:

python3 ../extract_csv.py df_crit.csv df_crit.txt
../run.sh -j 10 ../a.out df_crit.txt 3 2188 30,60,80,90,100,110,120
../format.sh df_crit.csv 30,60,80,90,100,110,120 
../again.sh -j 9 ../a.out df_crit.csv 1857 30,60,80,90,100,110,120 
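  1. Print all the discrepancies in CSV format (redirect the output to save it to a file). The variable $regex must be set beforehand to a pattern with two capture groups, k and discrepancy, that matches the first line of each subset file.
echo 'which,k,discrepancy'
for value in 30 60 80 90 100 110 120; do
    # s1: the first subset
    if [[ $(head -n 1 subset_$value.txt) =~ $regex ]]; then
        echo "s1,${BASH_REMATCH[1]},${BASH_REMATCH[2]}"
    fi
    # s2: the subset chosen from the complement of the first subset
    if [[ $(head -n 1 subset_complement_subset_$value.txt) =~ $regex ]]; then
        echo "s2,${BASH_REMATCH[1]},${BASH_REMATCH[2]}"
    fi
done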
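
The loop above relies on a $regex variable that this README does not define. The sketch below shows how such a pattern and bash's BASH_REMATCH array fit together. The header format it matches ("<k> <discrepancy>" separated by whitespace) is purely an assumption for illustration, as is the helper name csv_row. Inspect the first line of your own subset_*.txt files and adjust the pattern accordingly.

```shell
#!/usr/bin/env bash
# HYPOTHETICAL header format: "<k> <discrepancy>", e.g. "30 0.0421".
# The real first line of subset_*.txt may differ; adjust the pattern to match.
regex='^([0-9]+)[[:space:]]+([0-9.eE+-]+)'

# Print "label,k,discrepancy" for the first line of a file, if it matches.
# Prints nothing when the line does not match; returns 1 if the file
# cannot be read. (csv_row is a hypothetical helper for this sketch.)
csv_row() {
    local label=$1 file=$2 header
    header=$(head -n 1 "$file") || return 1
    if [[ $header =~ $regex ]]; then
        # BASH_REMATCH[1] and [2] hold the first and second capture groups.
        echo "$label,${BASH_REMATCH[1]},${BASH_REMATCH[2]}"
    fi
}

# Usage, writing the rows to a file instead of the terminal:
#   echo 'which,k,discrepancy' > discrepancies.csv
#   csv_row s1 subset_30.txt >> discrepancies.csv
#   csv_row s2 subset_complement_subset_30.txt >> discrepancies.csv
```

Keeping the pattern in an unquoted variable on the right-hand side of =~ is deliberate: quoting it there would make bash treat it as a literal string rather than a regular expression.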