`_load_sample_metadata` in `mofa_model` produces misaligned metadata
Closed this issue · 1 comments
After finding a couple of plotting features that do not work as expected, I resorted to writing my own plotting functions. In the process I realised that the metadata I was relying on was completely misaligned with the samples. After reading through a couple of modules I found that this most likely results from using `pd.concat` in `_load_sample_metadata`, which does not align rows by sample by default. Specifically, this is due to the way the `mofa_model` class is initialised, which uses the contents of `model['samples']` to initialise the `samples` property and `model['groups']` to initialise the `groups` property. While `model['samples']` contains samples ordered by their group assignment in ascending alphabetical order and is used by `_load_sample_metadata` to initialise the returned metadata frame, `model['groups']` is not sorted. However, the metadata is loaded from `model['samples_metadata']` by iterating over the `groups` property, which produces a new data frame that does not align with the original frame generated from the `samples` property. Thus, merging those two by simply appending the columns of the frame generated from `groups` to those generated from `samples` (which is what `pd.concat` does) yields nonsensical metadata and in turn might produce wrong interpretations of the results of MEFISTO.
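To illustrate the mechanism, here is a minimal sketch with made-up data. The dictionaries `samples` and `stored_metadata`, the list `groups`, and the `age` column are hypothetical stand-ins for the model contents, not the actual code of `_load_sample_metadata`. It assumes both frames carry a default integer index, in which case `pd.concat(..., axis=1)` pairs rows purely by position.

```python
import pandas as pd

# Hypothetical stand-ins for the model contents (not real MOFA data).
samples = {"group_a": ["a1", "a2"], "group_b": ["b1", "b2"]}  # sorted by group
groups = ["group_b", "group_a"]  # unsorted group order
stored_metadata = {
    "group_a": {"sample": ["a1", "a2"], "age": [10, 11]},
    "group_b": {"sample": ["b1", "b2"], "age": [20, 21]},
}

# Frame built from the samples mapping (group order: a, b).
base = pd.DataFrame(
    [(s, g) for g, names in samples.items() for s in names],
    columns=["sample", "group"],
)

# Frame built by iterating over the groups list (group order: b, a).
extra = pd.concat(
    [pd.DataFrame(stored_metadata[g]) for g in groups], ignore_index=True
).rename(columns={"sample": "metadata_sample"})

# Both frames have a default integer index, so rows are paired by position.
combined = pd.concat([base, extra], axis=1)

# Rows where the sample name and the attached metadata disagree.
mismatched = combined[combined["sample"] != combined["metadata_sample"]]
print(mismatched)
```

In this toy case every row ends up with the metadata of a sample from the other group, which is the kind of silent mismatch I observed.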
I would suggest using either `pd.DataFrame.join` or `pd.DataFrame.merge` to ensure the metadata is aligned properly.
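A rough sketch of what a key-based combination could look like, using made-up frames. The names `base` and `extra`, the `age` column, and the assumption that the stored metadata contains a `sample` column to match on are all mine and would need to be checked against the actual model file.

```python
import pandas as pd

# Hypothetical frame derived from the samples property (sorted by group).
base = pd.DataFrame(
    {
        "sample": ["a1", "a2", "b1", "b2"],
        "group": ["group_a", "group_a", "group_b", "group_b"],
    }
)

# Hypothetical metadata frame collected in a different row order.
extra = pd.DataFrame(
    {"sample": ["b1", "b2", "a1", "a2"], "age": [20, 21, 10, 11]}
)

# Option 1: merge on the sample name. how="left" keeps the row order of
# base, and validate raises if a sample name is duplicated on either side.
merged = base.merge(extra, on="sample", how="left", validate="one_to_one")

# Option 2: join on the index after making the sample name the index.
joined = base.set_index("sample").join(extra.set_index("sample"))

print(merged)
print(joined)
```

Both variants match rows by sample name rather than by position, so the order in which the groups are iterated no longer matters.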