
Evolutionary Rank Aggregation and utilities


Evolutionary Rank Aggregation Algorithm

This repository contains the implementation of the Evolutionary Rank Aggregation (ERA) method proposed in

Samuel Oliveira, Victor Diniz, Anisio Lacerda, and Gisele Lobo Pappa. 2016. Evolutionary rank aggregation for recommender systems. In 2016 IEEE Congress on Evolutionary Computation (CEC). 255–262.

ERA receives as input a path to a folder containing the ranking files generated by different recommendation algorithms. In these files, each line contains a user ID followed by the list of items (as item:score pairs) recommended to that user by a recommender. By default, ranking file names are expected to have the form [part]-[alg_name].out. For example, the file u3-WRMF.out contains the rankings generated by the recommendation algorithm WRMF for the users in the third partition of a recommendation dataset.
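The naming convention can be illustrated with a minimal Java sketch that splits a file name into its partition and algorithm name. The class `RankingFileName` and its `parse` method are hypothetical names introduced here, not part of ERA, and the sketch assumes the name is split at the first hyphen, so that algorithm names may themselves contain hyphens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of ERA): splits "[part]-[alg_name].out" into its components. */
final class RankingFileName {
    // Assumption: the partition is everything before the first '-',
    // and the algorithm name is the remainder without the ".out" suffix.
    private static final Pattern NAME = Pattern.compile("^([^-]+)-(.+)\\.out$");

    final String partition;
    final String algorithm;

    private RankingFileName(String partition, String algorithm) {
        this.partition = partition;
        this.algorithm = algorithm;
    }

    static RankingFileName parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "Not a [part]-[alg_name].out file name: " + fileName);
        }
        return new RankingFileName(m.group(1), m.group(2));
    }

    public static void main(String[] args) {
        RankingFileName f = parse("u3-WRMF.out");
        System.out.println(f.partition + " " + f.algorithm); // prints "u3 WRMF"
    }
}
```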

Below is an example of a ranking file.

1 [43:0.98,33:0.79,8:0.77,12:0.45,19:0.02]

2 [47:0.81,3:0.72,94:0.77,45:0.66,18:0.12]

3 [91:0.92,12:0.79,3:0.79,22:0.35,10:0.24]

4 [12:0.99,33:0.91,1:0.52,11:0.45,19:0.22]

...

N [14:0.82,30:0.89,8:0.74,122:0.55,1:0.15]
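As a rough illustration of this format, the following Java sketch parses one ranking line into a user ID and parallel lists of item IDs and scores. `RankingLine` is a hypothetical class written for this README, not ERA's own reader. It assumes that whitespace separates the user ID from the bracketed list, that pairs are separated by commas, and that scores use a period as the decimal separator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser (not ERA's own reader) for lines such as "1 [43:0.98,33:0.79]". */
final class RankingLine {
    final String userId;
    final List<String> itemIds = new ArrayList<>();
    final List<Double> scores = new ArrayList<>();

    private RankingLine(String userId) {
        this.userId = userId;
    }

    static RankingLine parse(String line) {
        String trimmed = line.trim();
        int open = trimmed.indexOf('[');
        int close = trimmed.lastIndexOf(']');
        if (open <= 0 || close < open) {
            throw new IllegalArgumentException("Malformed ranking line: " + line);
        }
        // Text before '[' is the user ID; text between the brackets is the ranking.
        RankingLine result = new RankingLine(trimmed.substring(0, open).trim());
        String body = trimmed.substring(open + 1, close).trim();
        if (body.isEmpty()) {
            return result; // user with an empty ranking
        }
        for (String pair : body.split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Expected item:score but found: " + pair);
            }
            result.itemIds.add(pair.substring(0, colon).trim());
            result.scores.add(Double.parseDouble(pair.substring(colon + 1).trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        RankingLine r = parse("1 [43:0.98,33:0.79,8:0.77]");
        System.out.println(r.userId + " " + r.itemIds); // prints "1 [43, 33, 8]"
    }
}
```

Items are kept in file order, so the position in `itemIds` is the rank assigned by the recommender.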

It is worth noting that this ranking file format is the default output of the MyMediaLite item recommendation algorithms. Besides the path to the rankings, ERA accepts the following parameters:

  -h, --help             show this help message and exit


  -b BASE_DIR, --base_dir BASE_DIR
                         Folder containing the datasets to be used in the rank aggregation. It can be either a folder containing sets of rankings or a folder containing plain files where the features are already computed

  -o OUT_DIR, --out_dir OUT_DIR
                         Folder to save ERA outputs (default: ./outputERA/)

  -p PART, --part PART   Partition to be used in the aggregation (default: u1)

  --pini PINI            The partition to start execution (default: 1)

  --pend PEND            The partition to end execution (default: 5)

  --init_run INIT_RUN    The run to begin execution (controls which set of seeds will be used) (default: 0)

  --i2use I2USE          The number of items of each ranking to be used by the aggregation algorithm (default: 20)

  --i2sug I2SUG          The number of items to be saved in the aggregated ranking. It also controls the number of items used in the fitness function
                         (default: 10)

  -g NUMG, --numg NUMG   Number of generations to be used in the evolutionary process (default: 100)

  -i NUMI, --numi NUMI   Number of individuals to be used in the evolutionary process (default: 50)

  -m MUT, --mut MUT      Mutation probability. Use range [0-1] (default: 0.25)

  -x XOVER, --xover XOVER
                         Crossover probability. Use range [0-1] (default: 0.65)

  -r REP, --rep REP      Reproduction probability. Use range [0-1] (default: 0.1)

  -u UNDER, --under UNDER
                         Indicates the percentage of irrelevant instances to remove from each user. (default: 0.6)

  --max_iter MAX_ITER    Number of generations without fitness improvement (default: 50)

  --max_depth MAX_DEPTH  The maximum number of tree levels. An individual is a mathematical expression represented as a tree (default: 10)

  --min_depth MIN_DEPTH  The minimum number of tree levels. An individual is a mathematical expression represented as a tree (default: 2)

  --no_bkp               Disable best individuals backup (default: false)

  -K K                   Tournament size. Note that the parameter is a capital K (default: 7)

  --nruns NRUNS          Number of runs to be used (default: 5)

  --nthreads NTHREADS    The number of threads to be used during reproduction and evaluation (ECJ multithread) (default: 1)

  --no_GP                Disable GP execution. Can be used to just save the attributes or to re-evaluate the individuals (default: false)

  --no_outrank           Disable the construction of the feature Outrank. (default: false)

  --param PARAM          File containing the ECJ parameters to be used (default: ./params/gpra.params)

  --grs [GRS [GRS ...]]  If set, the program will run in the Group Aggregation Setting. It can receive a list of recommenders to be used in the group
                         aggregation, or be left empty to use all the recommenders (.out) in the base folder

  --groups_file GROUPS_FILE
                         File containing the group information (default: )
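
To give an intuition for what the tournament size -K controls, here is a generic Java sketch of tournament selection. It is not ERA's code: in ERA, selection is performed by ECJ. The class `TournamentSketch` and its `select` method are hypothetical, and the sketch assumes that higher fitness is better and that contestants are drawn uniformly with replacement.

```java
import java.util.Random;

/** Generic illustration of tournament selection; ERA delegates selection to ECJ. */
final class TournamentSketch {
    /**
     * Draws k individuals uniformly at random (with replacement) and returns
     * the index of the one with the highest fitness (assumption: higher is better).
     */
    static int select(double[] fitness, int k, Random rng) {
        if (fitness.length == 0 || k < 1) {
            throw new IllegalArgumentException("Need a non-empty population and k >= 1");
        }
        int best = rng.nextInt(fitness.length);
        for (int i = 1; i < k; i++) {
            int challenger = rng.nextInt(fitness.length);
            if (fitness[challenger] > fitness[best]) {
                best = challenger;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] fitness = {0.12, 0.40, 0.33, 0.25};
        int winner = select(fitness, 7, new Random(0)); // 7 mirrors the default of -K
        System.out.println("Selected individual " + winner + " with fitness " + fitness[winner]);
    }
}
```

A larger tournament makes it more likely that the fittest individuals are selected, which increases selection pressure; with a tournament of size 1, selection is uniformly random.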