This repository contains the implementation of the Evolutionary Rank Aggregation (ERA) method proposed in:
Samuel Oliveira, Victor Diniz, Anisio Lacerda, and Gisele Lobo Pappa. 2016. Evolutionary rank aggregation for recommender systems. In 2016 IEEE Congress on Evolutionary Computation (CEC). 255–262.
ERA receives as input a path containing a set of files with the rankings generated by different recommendation algorithms. In these files, each line contains a user ID followed by the list of items that a recommender suggested to that user. By default, ranking file names are expected to be in the form [part]-[alg_name].out. For example, the file u3-WRMF.out contains the rankings generated by the recommendation algorithm WRMF for the users in the third partition of a recommendation dataset.
Below is an example of a ranking file.
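The naming convention above can be illustrated with a minimal Python sketch. The helper name `split_ranking_filename` is hypothetical and is not part of ERA; the sketch also assumes that the partition is everything before the first hyphen, which the source does not state explicitly.

```python
from pathlib import Path
from typing import Tuple


def split_ranking_filename(path: str) -> Tuple[str, str]:
    """Split a name such as 'u3-WRMF.out' into (partition, algorithm).

    Hypothetical helper: assumes the [part]-[alg_name].out convention and
    that the partition never contains a hyphen.
    """
    name = Path(path).name  # drop any leading directories
    if not name.endswith(".out"):
        raise ValueError(f"expected a .out file, got {name!r}")
    stem = name[: -len(".out")]
    # Split on the first hyphen only, so algorithm names may contain hyphens.
    part, sep, alg = stem.partition("-")
    if not sep or not part or not alg:
        raise ValueError(f"name does not match [part]-[alg_name].out: {name!r}")
    return part, alg
```

For instance, `split_ranking_filename("u3-WRMF.out")` yields the partition `u3` and the algorithm name `WRMF`.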
1 [43:0.98,33:0.79,8:0.77,12:0.45,19:0.02]
2 [47:0.81,3:0.72,94:0.77,45:0.66,18:0.12]
3 [91:0.92,12:0.79,3:0.79,22:0.35,10:0.24]
4 [12:0.99,33:0.91,1:0.52,11:0.45,19:0.22]
...
N [14:0.82,30:0.89,8:0.74,122:0.55,1:0.15]
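As a minimal sketch of how one line in this format could be read, the Python function below splits a line into a user ID and a list of (item, score) pairs. The name `parse_ranking_line` is hypothetical and not part of ERA. The sketch assumes that scores use a dot as the decimal separator, that the user ID and the bracketed list are separated by whitespace, and that IDs can be kept as strings.

```python
from typing import List, Tuple


def parse_ranking_line(line: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Parse 'user [item:score,item:score,...]' into (user, [(item, score), ...]).

    Hypothetical helper: assumes dot-decimal scores and whitespace between
    the user ID and the bracketed list.
    """
    fields = line.strip().split(None, 1)  # user ID, then the rest of the line
    if len(fields) != 2:
        raise ValueError(f"malformed ranking line: {line!r}")
    user, rest = fields[0], fields[1].strip()
    if not (rest.startswith("[") and rest.endswith("]")):
        raise ValueError(f"ranking list must be enclosed in brackets: {line!r}")
    body = rest[1:-1]
    ranking: List[Tuple[str, float]] = []
    if body:  # an empty list '[]' yields no recommendations
        for pair in body.split(","):
            item, sep, score = pair.partition(":")
            if not sep:
                raise ValueError(f"expected item:score, got {pair!r}")
            ranking.append((item, float(score)))
    return user, ranking
```

The order of the pairs is preserved as written in the file, so the position of an item in the returned list corresponds to its position in the line.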
This ranking file format is the default output of the MyMediaLite item recommendation algorithms. Besides the path to the rankings, ERA accepts the following parameters:
-h, --help show this help message and exit
-b BASE_DIR, --base_dir BASE_DIR
Folder containing the datasets to be used in the rank aggregation. It can be a folder containing sets of rankings or a folder containing plain files where the features are already computed
-o OUT_DIR, --out_dir OUT_DIR
Folder to save ERA outputs (default: ./outputERA/)
-p PART, --part PART Partition to be used in the aggregation (default: u1)
--pini PINI The partition to start execution (default: 1)
--pend PEND The partition to end execution (default: 5)
--init_run INIT_RUN The run to begin execution (controls which set of seeds will be used) (default: 0)
--i2use I2USE The number of items of each ranking to be used by the aggregation algorithm (default: 20)
--i2sug I2SUG The number of items to be saved in the aggregated ranking. It also controls the number of items used in the fitness function (default: 10)
-g NUMG, --numg NUMG Number of generations to be used in the evolutionary process (default: 100)
-i NUMI, --numi NUMI Number of individuals to be used in the evolutionary process (default: 50)
-m MUT, --mut MUT Mutation probability. Use range [0-1] (default: 0.25)
-x XOVER, --xover XOVER
Crossover probability. Use range [0-1] (default: 0.65)
-r REP, --rep REP Reproduction probability. Use range [0-1] (default: 0.1)
-u UNDER, --under UNDER
Indicates the percentage of irrelevant instances to remove from each user. (default: 0.6)
--max_iter MAX_ITER Number of generations without fitness improvement (default: 50)
--max_depth MAX_DEPTH The maximum number of tree levels. An individual is a mathematical expression represented as a tree (default: 10)
--min_depth MIN_DEPTH The minimum number of tree levels. An individual is a mathematical expression represented as a tree (default: 2)
--no_bkp Disable the backup of the best individuals (default: false)
-K K Tournament size. Note that the parameter is a capital K (default: 7)
--nruns NRUNS Number of runs to be used (default: 5)
--nthreads NTHREADS The number of threads to be used during reproduction and evaluation (ECJ multithread) (default: 1)
--no_GP Disable GP execution. Can be used to just save the attributes or to re-evaluate the individuals (default: false)
--no_outrank Disable the construction of the Outrank feature. (default: false)
--param PARAM File containing the ECJ parameters to be used (default: ./params/gpra.params)
--grs [GRS [GRS ...]] If set, the program will run in the Group Aggregation Setting. It can receive a list of recommenders to be used in the group aggregation, or be left empty to use all the recommenders (.out) in the base folder
--groups_file GROUPS_FILE
File containing the groups information (default: )
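To illustrate how the base folder and the partition range might relate to the ranking files, here is a minimal Python sketch that lists the .out files for each partition. This is not ERA's own code: the function name `rankings_by_partition` is hypothetical, and the sketch assumes that partitions are named u1, u2, ... (as the default value of --part suggests) and that the range given by --pini and --pend is inclusive at both ends.

```python
from pathlib import Path
from typing import Dict, List


def rankings_by_partition(base_dir: str, pini: int = 1, pend: int = 5) -> Dict[str, List[Path]]:
    """Map each partition name ('u1', 'u2', ...) to its ranking files.

    Hypothetical helper: assumes files follow [part]-[alg_name].out and that
    partitions pini..pend (inclusive) are named 'u<k>'.
    """
    base = Path(base_dir)
    found: Dict[str, List[Path]] = {}
    for k in range(pini, pend + 1):
        part = f"u{k}"
        # The hyphen in the pattern keeps 'u1' from matching 'u10-...' files.
        found[part] = sorted(base.glob(f"{part}-*.out"))
    return found
```

A partition with no matching files maps to an empty list, which makes missing rankings easy to detect before starting a run.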