Error for certain inputs in `calculate_sample_to_sample_MMDs`
Hi Will
I was trying out your package and it looks very promising. Unfortunately, I encountered two small issues:
`calculate_sample_to_sample_MMDs` fails when fewer than 6 datasets are present. This seems to be because `.make_mmd_graph` has a hardcoded number of neighbors (`n_nhbrs = 5`), which introduces `NA`s when too few datasets are available to supply that many neighbors, which in turn causes an error in `igraph::graph_from_edgelist`.
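To illustrate the suspected mechanism, here is a minimal base-R sketch. The function `knn_edges` and its `cap` argument are hypothetical and are not SampleQC's actual code; the sketch only assumes that the neighbors are picked by indexing a ranked vector with a fixed `n_nhbrs`.

```r
# Hypothetical k-nearest-neighbor edge list builder (not SampleQC's code).
knn_edges <- function(dist_mat, n_nhbrs = 5, cap = FALSE) {
  n <- nrow(dist_mat)
  # cap = TRUE limits k to the number of other samples available
  k <- if (cap) min(n_nhbrs, n - 1) else n_nhbrs
  edges <- lapply(seq_len(n), function(i) {
    # rank all samples by distance to sample i, dropping i itself
    others <- setdiff(order(dist_mat[i, ]), i)
    # indexing past the end of `others` yields NA
    cbind(from = i, to = others[seq_len(k)])
  })
  do.call(rbind, edges)
}

set.seed(1)
d <- as.matrix(dist(matrix(rnorm(8), nrow = 4)))  # only 4 samples

edges_fixed  <- knn_edges(d)              # fixed k = 5
edges_capped <- knn_edges(d, cap = TRUE)  # k capped at 3
anyNA(edges_fixed)   # TRUE
anyNA(edges_capped)  # FALSE
```

Capping the neighbor count at `n - 1` is only one possible guard; skipping the graph step entirely for small inputs would be another.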
When `sample_id` is in `annots_disc`, I get the following error:
Error: names of annot_disc in colData(qc_obj) should match metadata(qc_obj)$annots$disc
If I rename it to `patient_id` instead, it works fine. Also:
> colnames(colData(qc_obj)$annot_disc[[1]])
[1] "group_id" "sample_id.1" "condition" "N_cat" "mito_cat" "counts_cat"
> metadata(qc_obj)$annots$disc
[1] "group_id" "sample_id" "condition" "N_cat" "mito_cat" "counts_cat"
So I assume that somewhere in the code a duplicate `sample_id` column is created.
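For reference, a `.1` suffix is what base R produces when a column name is repeated and the names are made unique. The sketch below shows only that general behavior; whether this is what happens inside SampleQC is an assumption, and the column values are made up for illustration.

```r
# Base R de-duplicates repeated names by appending ".1", ".2", ...
nms <- c("sample_id", "group_id", "sample_id", "condition")
deduped <- make.unique(nms)
deduped
# [1] "sample_id"   "group_id"    "sample_id.1" "condition"

# data.frame() applies the same rule by default (check.names = TRUE)
df <- data.frame(
  sample_id = c("s1", "s2"),
  group_id  = c("g1", "g1"),
  sample_id = c("s1", "s2")
)
names(df)
# [1] "sample_id"   "group_id"    "sample_id.1"
```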
Best,
Reto
Hi @retogerber
Thanks for reporting this.
The use case I've had in mind for `SampleQC` has generally been large, complex experiments, so at present it doesn't work so well for smaller datasets (and in your case, it doesn't work at all). It still makes sense to be able to run the fitting part of `SampleQC` on smaller datasets, so I need to make some tweaks to allow that to work (most likely I will just remove the graph construction and clustering steps and replace them with defaults).
The `make_qc_dt` function also needs to be made more robust... Thanks for adding a useful example of where it can fall over ;)
Cheers
Will