Tune on non-seeds?
deklanw opened this issue · 2 comments
Is it possible to run the tuners with non-seed nodes? For example, if I have a seed_set and a target_set, can I run the tuner diffusions with the signal from the former but optimize for metrics defined with respect to the latter? In this case I have a desired ranking of the nodes in the target_set.
This is a useful use case that should probably have been explicitly supported by the interface. Right now, you need to create a lambda expression to be used as a measure constructor that always returns the same instance:
import pygrank as pg
measure = pg.AUC(target_set)  # or Spearman or Pearson correlation if your target set is non-binary
# fraction_of_training=1 uses 100% of the seeds when running algorithms internally
algorithm = pg.ParameterTuner(measure=lambda *args: measure, fraction_of_training=1)
print(algorithm(graph, seed_set))
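To illustrate why the constant lambda works, here is a minimal library-free sketch. It assumes the tuner calls the measure constructor with some arguments (for example held-out seeds) and then scores each candidate with whatever comes back. TopKOverlap, toy_tuner, seeds_only and one_hop are hypothetical names made up for this illustration; they are not part of pygrank and say nothing about its internals.

```python
class TopKOverlap:
    """Hypothetical stand-in for a fixed measure such as pg.AUC(target_set)."""

    def __init__(self, target):
        self.target = set(target)

    def __call__(self, scores):
        # Fraction of target nodes found among the len(target) highest scores.
        top = sorted(scores, key=scores.get, reverse=True)[:len(self.target)]
        return len(self.target.intersection(top)) / len(self.target)


def toy_tuner(candidates, measure_factory, seeds):
    """Hypothetical tuner: asks the factory for a measure, keeps the best candidate."""
    measure = measure_factory(seeds)  # a real tuner may pass different arguments
    return max(candidates, key=lambda algorithm: measure(algorithm(seeds)))


# Toy undirected graph as an adjacency dict.
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
seed_set = {"a"}
target_set = {"b", "c"}


def seeds_only(seeds):
    # Candidate 1: scores only the seeds themselves.
    return {node: 1.0 if node in seeds else 0.0 for node in graph}


def one_hop(seeds):
    # Candidate 2: pushes the seed signal to direct neighbors.
    scores = {node: 0.0 for node in graph}
    for seed in seeds:
        for neighbor in graph[seed]:
            scores[neighbor] += 1.0
    return scores


measure = TopKOverlap(target_set)
constant_factory = lambda *args: measure  # ignores whatever the tuner passes in
best = toy_tuner([seeds_only, one_hop], constant_factory, seed_set)
print(best.__name__)
```

Because the factory discards its arguments, every candidate is run from seed_set but judged against target_set, which is the behavior asked about in the question.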
P.S. For large graphs, you might be interested in algorithm selection instead of granular parameter tuning:
competing_algorithms = pg.create_many_filters().values()
algorithm = pg.AlgorithmSelection(competing_algorithms, measure=lambda *args: measure, fraction_of_training=1)
This will just run through a predetermined set of filters. You can provide a custom list of algorithms you suspect will work well if you have some specific ones in mind. Note that you can also create algorithms with binary outcomes like this: pg.HeatKernel(3) >> pg.Threshold(0.1), though you shouldn't use AUC to evaluate binary outcomes.
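One plausible reason for the AUC caveat, which the thread itself does not spell out, is that thresholding collapses scores into ties, so the ranking AUC relies on is mostly lost. The sketch below uses a plain pairwise AUC written for this example (not pygrank's implementation) on made-up labels and scores, with 0.1 as the cutoff.

```python
def auc(labels, scores):
    """Pairwise AUC: chance that a positive outscores a negative, ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up example data: 1 marks a target node, 0 a non-target node.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.05, 0.08, 0.02, 0.01]

# Binary outcomes, as a threshold at 0.1 would produce.
thresholded = [1.0 if s > 0.1 else 0.0 for s in scores]

print(auc(labels, scores))       # evaluates the full ranking
print(auc(labels, thresholded))  # most pairs are now ties
```

On this data the second value depends only on which side of the cutoff each node falls, so two filters with very different rankings below the threshold would be indistinguishable.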
Thanks, this is perfect