erzakiev/frictionless

Effective penalization of the objective function to avoid fixing the size of the signature

erzakiev opened this issue · 0 comments

When the signature size is not fixed at a certain number, we face explosive growth in the number of combinations from which a signature can be constructed (the issue is discussed in the study of a similar problem in the case of subnetworks).

We tried penalizing Friedman's S for each sample by dividing it by the number of genes in the solution, n, raised to a real power alpha in the range [0.1 ... 3]: S' = S / (n^alpha). With lower values of alpha this produced very large signatures (half or more of the input gene universe); with higher values of alpha it reduced the signatures to the minimum allowed size (3 in our case).
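To make the all-or-nothing behaviour concrete, here is a minimal Python sketch of the penalty S' = S / (n^alpha). The function names, the upper bound of 100 genes, and the toy raw score are illustrative assumptions, not part of the project's code. The toy score assumes S grows linearly with n, which is a deliberate simplification: under any power-law growth S ~ n^beta, the penalized score n^(beta - alpha) is monotone in n, so the optimum sits at one of the two size limits.

```python
def penalized_score(s, n, alpha):
    """Return S' = S / n**alpha for a signature of n genes."""
    if n <= 0:
        raise ValueError("n must be positive")
    return s / n ** alpha


def best_size(raw_score, sizes, alpha):
    """Return the signature size in `sizes` that maximizes S'."""
    return max(sizes, key=lambda n: penalized_score(raw_score(n), n, alpha))


def toy_raw_score(n):
    """Toy assumption (not from the issue): raw S grows linearly with n."""
    return float(n)


# 3 is the minimum size mentioned in the issue; 100 is a made-up gene universe.
sizes = range(3, 101)
for alpha in (0.1, 0.5, 1.5, 3.0):
    print(alpha, best_size(toy_raw_score, sizes, alpha))
```

Under this toy assumption the selected size is 100 for alpha = 0.1 and 0.5, and 3 for alpha = 1.5 and 3.0, which mirrors the two failure modes described above. The real Friedman's S need not follow a power law, so this only suggests why a single exponent may be hard to tune.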

We invite anyone to contribute ideas on how to properly penalize the statistic so that the signatures neither expand to the entire input gene space nor collapse to the minimum number of genes allowed (i.e. 3).