MKLab-ITI/pygrank

Tune on non-seeds?

deklanw opened this issue · 2 comments

Is it possible to run the tuners with non-seed nodes? For example, if I have a seed_set and a target_set, can I run the tuner diffusions with the signal from the former but optimize for metrics defined with respect to the latter? In this case, I have a desired ranking of the nodes in the target_set.

This is a useful use case that should probably have been explicitly supported by the interface. Right now, you need to pass a lambda expression as the measure constructor so that it always returns the same measure instance:

import pygrank as pg

measure = pg.AUC(target_set)  # or spearman or pearson correlation if your target set is non-binary
algorithm = pg.ParameterTuner(measure=lambda *args: measure, fraction_of_training=1)  # uses 100% of seeds when running algorithms internally
print(algorithm(graph, seed_set))
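
To illustrate why a constant lambda redirects the optimization target, here is a minimal pure-Python sketch. It does not use pygrank and does not reproduce its internals: `TopKOverlap`, `toy_tuner`, and the two candidate "algorithms" are hypothetical stand-ins, and the assumption is only that a tuner builds its measure by calling the constructor it was given with seed-derived arguments.

```python
class TopKOverlap:
    """Toy measure (hypothetical): share of the top-k scored nodes that fall in a reference set."""

    def __init__(self, reference):
        self.reference = set(reference)

    def __call__(self, scores):
        k = len(self.reference)
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        return len(self.reference.intersection(top)) / k


def toy_tuner(candidates, seeds, measure_constructor):
    """Hypothetical tuner: builds a measure from the seeds, then keeps the best-scoring candidate."""
    measure = measure_constructor(seeds)  # a constant lambda ignores this argument
    return max(candidates, key=lambda name: measure(candidates[name](seeds)))


# Two made-up "algorithms" that return fixed node scores regardless of the seeds.
candidates = {
    "near_seed": lambda seeds: {"a": 1.0, "b": 0.8, "c": 0.2, "d": 0.1},
    "far_from_seed": lambda seeds: {"a": 0.1, "b": 0.2, "c": 0.9, "d": 0.8},
}
seed_set = {"a", "b"}
target_set = {"c", "d"}

# Default behaviour in this sketch: the measure is constructed from the seeds.
chosen_by_seeds = toy_tuner(candidates, seed_set, TopKOverlap)

# Constant constructor: every call returns the same target-based measure instance.
target_measure = TopKOverlap(target_set)
chosen_by_targets = toy_tuner(candidates, seed_set, lambda *args: target_measure)

print(chosen_by_seeds)    # near_seed
print(chosen_by_targets)  # far_from_seed
```

The diffusion still starts from the seeds in both calls; only the object that scores the outcome changes.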

P.S. For large graphs, you might be interested in algorithm selection instead of granular parameter tuning:

competing_algorithms = pg.create_many_filters().values()
algorithm = pg.AlgorithmSelection(competing_algorithms, measure=lambda *args: measure, fraction_of_training=1)

This will just run through a predetermined set of filters. If you have specific algorithms in mind that you suspect will work well, you can provide a custom list of them instead. Note that you can also create algorithms with binary outcomes, for example pg.HeatKernel(3) >> pg.Threshold(0.1), though you shouldn't use AUC to evaluate binary outcomes.
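
One plausible reason for the AUC caveat (an assumption here, not something stated above) is that thresholding collapses scores into ties, so ranking information that AUC relies on is lost. The sketch below uses a hand-written pairwise AUC, not pygrank's implementation, and made-up scores and labels.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]           # illustrative ground truth
graded = [0.9, 0.4, 0.3, 0.1]   # illustrative graded scores, perfectly ordered
binary = [1.0 if s >= 0.5 else 0.0 for s in graded]  # same scores after a 0.5 threshold

graded_auc = pairwise_auc(graded, labels)
binary_auc = pairwise_auc(binary, labels)

print(graded_auc)  # 1.0
print(binary_auc)  # 0.75
```

The graded scores order every positive above every negative, yet after thresholding the second positive ties with both negatives and the AUC drops.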

Thanks, this is perfect