aertslab/cisTopic

maximal median size of population peaks

robinweide opened this issue · 1 comment

Hi there,

I was wondering if there is any reason not to use your wonderful method on data with large "peaks". In other words: are there assumptions that I would break if I used a set of population peaks > 50kb?

Thanks,
Robin

Hi!

I haven't tried such long peaks myself, but I'd guess it can indeed affect the results. We normally use the binarized matrix (by default, a region counts as accessible in a cell if it has at least 1 read in the peak), so with such long regions a cell may well pick up at least one read purely by chance.
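To give a rough feel for the "read by chance" effect, here is a back-of-the-envelope sketch in Python (it is not part of cisTopic). It assumes background reads fall uniformly along the genome and follow a Poisson model; the function name and the numbers (5,000 background reads per cell, a 3 Gb genome) are purely illustrative assumptions, not measured values.

```python
import math


def prob_at_least_one_read(peak_length_bp, reads_per_cell, genome_size_bp):
    """Chance that a cell has >= 1 background read in a region.

    Assumes reads land uniformly at random along the genome (Poisson model),
    which is a simplification of real scATAC-seq background.
    """
    expected_reads = reads_per_cell * peak_length_bp / genome_size_bp
    return 1.0 - math.exp(-expected_reads)


# Hypothetical values, chosen only for illustration.
READS_PER_CELL = 5_000
GENOME_SIZE_BP = 3_000_000_000

for length_bp in (500, 5_000, 50_000):
    p = prob_at_least_one_read(length_bp, READS_PER_CELL, GENOME_SIZE_BP)
    print(f"{length_bp:>6} bp region: P(binarized entry is 1 by chance) = {p:.4f}")
```

Under these assumptions the chance of a spurious 1 in the binarized matrix grows roughly in proportion to region length, so a 50 kb region is about 100 times more likely to be hit by chance than a 500 bp one.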

To avoid relying directly on bulk peaks (by doing so you may lose peaks from rare populations), what we mostly use right now are the SCREEN regions (for mouse/human: https://www.nature.com/articles/s41586-020-2493-4) and the cisTarget regions (for Drosophila; these ship with the package and can be loaded with data("dm6_CtxRegions")). After this first clustering you can always call consensus peaks and rerun.

I hope this is useful!

C