Change signatures matrix orientation in fit_nmf

Question

Closed this issue 7 years ago · 2 comments

To make the models more consistent, I'm changing the orientation of the signatures matrix in sigfit_fit_nmf.stan so that it is SxC (signatures in rows), as in all the other models, instead of CxS. This means changing how the probabilities are calculated, from the current:

matrix[C, S] signatures;  // in "data"
simplex[S] exposures[G];  // in "parameters"
vector<lower=0, upper=1>[C] probs[G];  // in "transformed parameters"
for (i in 1:G) {
    probs[i] = scale_to_sum_1(signatures * exposures[i]);
}

To something like:

matrix[S, C] signatures;  // in "data"
simplex[S] exposures[G];  // in "parameters"
vector<lower=0, upper=1>[C] probs[G];  // in "transformed parameters"
for (i in 1:G) {
    probs[i] = scale_to_sum_1(exposures[i] * signatures); // need to find out how to do this product in stan
}
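
A possible way to write that product, as a minimal sketch rather than the code that ended up in sigfit: in Stan, exposures[i] is a column vector, so it cannot left-multiply an SxC matrix directly. Transposing it to a row_vector, multiplying, and transposing the result back gives a vector[C]; transposing the matrix instead is equivalent. The scale_to_sum_1() call is left out here because it is sigfit's own function, the empty model block is only there to make the sketch a complete program, and the array declarations follow the older syntax used in this thread (recent Stan releases write `array[G] simplex[S] exposures;`).

```stan
data {
    int<lower=1> C;            // number of categories
    int<lower=1> S;            // number of signatures
    int<lower=1> G;            // number of genomes
    matrix[S, C] signatures;   // signatures in rows, each assumed to sum to 1
}
parameters {
    simplex[S] exposures[G];
}
transformed parameters {
    vector<lower=0, upper=1>[C] probs[G];
    for (i in 1:G) {
        // row_vector[S] * matrix[S, C] gives row_vector[C]; transpose back to vector[C]
        probs[i] = (exposures[i]' * signatures)';
        // equivalent alternative: probs[i] = signatures' * exposures[i];
    }
}
model {
}
```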

This way, the input and output matrices will always follow the form:

signatures: SxC
exposures: GxS
counts: GxC
opportunities: GxC

Answer 1 · 2017-08-05T10:35:18.000Z

I've implemented this as follows, based on sigfit_fitex_nmf.stan:

data {
    ...
    matrix[S, C] signatures;  // matrix of signatures (rows) to be fitted
    int counts[G, C];         // data = counts per category (columns) per genome sample (rows)
    ...
}
parameters {
    simplex[S] exposures[G];
}
transformed parameters {
    matrix[G, S] exposures_mat;
    matrix<lower=0, upper=1>[G, C] probs;
    for (i in 1:G) {
        for (j in 1:S) {
            exposures_mat[i, j] = exposures[i, j];
        }
    }
    probs = exposures_mat * signatures;
}

The signatures input for fitting will always be normalised so that each signature sums to 1, as this is done in the remove_zeros_() function, so each row of probs already sums to 1 without calling scale_to_sum_1().
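
To spell out why normalised signatures make the rescaling step unnecessary, here is a short derivation. It assumes that every row of signatures is non-negative and sums to 1, and uses the fact that each exposures[g] is a simplex. Under those assumptions, row g of probs sums to 1:

```latex
\[
\sum_{c=1}^{C} \mathrm{probs}_{g,c}
  = \sum_{c=1}^{C} \sum_{s=1}^{S} \mathrm{exposures}_{g,s}\,\mathrm{signatures}_{s,c}
  = \sum_{s=1}^{S} \mathrm{exposures}_{g,s} \sum_{c=1}^{C} \mathrm{signatures}_{s,c}
  = \sum_{s=1}^{S} \mathrm{exposures}_{g,s}
  = 1 .
\]
```

Since every term is non-negative, each entry of probs also lies between 0 and 1, which matches the declared bounds.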

Answer 2 · 2017-08-05T23:21:37.000Z

This generated a compiler error when calculating the multinomial likelihood, which takes a vector, not a row_vector. I've fixed this on master.
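
The change committed to master is not shown in this thread, so the following is only a sketch of one way to get rid of such an error, not necessarily what was done. Indexing a matrix by row (probs[i]) yields a row_vector, while multinomial expects a vector, so the row is transposed at the point of use. The sketch also assigns each row of probs directly from the transposed simplex, which avoids the separate exposures_mat copy; the data declarations other than signatures and counts are assumptions.

```stan
data {
    int<lower=1> C;                     // number of categories
    int<lower=1> S;                     // number of signatures
    int<lower=1> G;                     // number of genomes
    matrix<lower=0>[S, C] signatures;   // signatures in rows, each assumed to sum to 1
    int<lower=0> counts[G, C];          // counts per category (columns) per genome (rows)
}
parameters {
    simplex[S] exposures[G];
}
transformed parameters {
    matrix<lower=0, upper=1>[G, C] probs;
    for (i in 1:G) {
        // row_vector[S] * matrix[S, C] gives row_vector[C], stored as row i of probs
        probs[i] = exposures[i]' * signatures;
    }
}
model {
    for (i in 1:G) {
        // probs[i] is a row_vector; multinomial needs a vector, hence the transpose
        counts[i] ~ multinomial(probs[i]');
    }
}
```

Writing to_vector(probs[i]) instead of probs[i]' would do the same job.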
