Matching clusters to cell-types
Closed this issue · 3 comments
Hello,
Excellent resource! Using this data in an inter-species comparison project.
Some issues I've encountered along the way.
Row names are missing from the clustering data.frames of some .loom files.
I downloaded the Aerts_Fly_AdultBrain_Filtered_57k.loom file from the SCope website and tried to extract the clusters, but encountered the following error:
loom_file <- "raw_data/fly/Aerts_Fly_AdultBrain_Filtered_57k.loom"
loom <- SCopeLoomR::open_loom(file.path = loom_file,
                              mode = "r+")
clusters <- SCopeLoomR::get_clusterings_with_name(loom)
----
Error in get_global_meta_data(loom = loom)$clusterings[[i]] :
subscript out of bounds
I believe this is because the clusterings in this .loom file are missing their row names (cell IDs): get_clusterings() does return the clusters data.frame, just without row names. This is a problem because the rows of the cluster data.frame are in a different order than the cell metadata rows extracted with SCopeLoomR::get_cell_annotation(). I was able to confirm this with the Aerts_Fly_AdultBrain_Unfiltered_157k.loom file, which does have row names in its clustering data.frame.
So you have to merge on the cell IDs rather than just using cbind() (which scrambles all the cell-type annotations). Could you add the cell IDs as row names to allow proper matching?
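To illustrate the difference between positional binding and matching on cell IDs, here is a minimal sketch in base R. The two data frames are toy stand-ins with made-up cell IDs and values (they are not read from the .loom file), and the column names `cell_id`, `batch`, and `cluster` are assumptions chosen for the example.

```r
# Toy stand-ins for the two tables (hypothetical data, not from the .loom file).
cell_annot <- data.frame(
  cell_id = c("cell_A", "cell_B", "cell_C"),
  batch = c("b1", "b2", "b1"),
  stringsAsFactors = FALSE
)
clusters <- data.frame(
  cluster = c("3", "1", "2"),
  row.names = c("cell_C", "cell_A", "cell_B"),  # same cells, different order
  stringsAsFactors = FALSE
)

# Unsafe: positional binding pairs row 1 with row 1, regardless of cell ID.
scrambled <- cell_annot
scrambled$cluster <- clusters$cluster

# Safe: look up each cell ID among the clustering row names first.
idx <- match(cell_annot$cell_id, rownames(clusters))
matched <- cell_annot
matched$cluster <- clusters$cluster[idx]  # NA for cells absent from `clusters`
```

This only works when the clustering data.frame carries cell IDs as row names, which is why the missing row names make a correct match impossible.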
Annotating clusters with cell-types
For the aforementioned Aerts_Fly_AdultBrain_Unfiltered_*.loom files (and perhaps others), would it be possible to include the cell-types identified in the original publication directly within the metadata? Or at least provide machine-readable files associated with each .loom file that one could use to do this cell-type annotation?
I was able to get some of the cluster cell-types from the supplementary materials of Davie et al. 2018, but I'm still unsure whether I'm matching them with the correct combination of version (Unfiltered_157k vs. Filtered_57k), dimensionality reduction method (Seurat t-SNE vs. SCENIC), and clustering resolution.
Many thanks in advance!
Brian Schilder
Imperial College London
Hi @bschilder,
Thanks for reaching out and showing interest in SCope and the fly brain ageing single-cell dataset.
For the missing row names, I'll check this out. Since this issue is related to SCopeLoomR, I am moving it to its GitHub repository: aertslab/SCopeLoomR#28.
Regarding your second point about the cell-type information, there is a cell-based file available on GEO containing this metadata: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE107451 > GSE107451_DGRP-551_w1118_WholeBrain_57k_Metadata.tsv.gz > annotation column.
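As a rough sketch of how such a metadata file could be joined to a per-cell table, the base R helper below matches on cell ID rather than row position. The helper name `add_annotation` and its `id_col` argument are hypothetical, and the name of the cell ID column in the GEO file is an assumption to check against the actual header. Only the `annotation` column name comes from the description above. The usage part writes a tiny made-up file so the example runs without downloading anything.

```r
# Hypothetical helper: attach the `annotation` column from a metadata TSV
# to a per-cell table whose row names are cell IDs.
# `id_col` is an assumption; inspect the real file header before using it.
add_annotation <- function(cells, metadata_path, id_col) {
  # read.delim() reads gzip-compressed files given by path transparently.
  meta <- read.delim(metadata_path, stringsAsFactors = FALSE, check.names = FALSE)
  stopifnot(id_col %in% names(meta), "annotation" %in% names(meta))
  idx <- match(rownames(cells), meta[[id_col]])
  cells$annotation <- meta$annotation[idx]  # NA where a cell has no metadata row
  cells
}

# Usage with a tiny made-up file standing in for the GEO download.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "w")
writeLines(c("cell_id\tannotation", "cell_B\ttype_X", "cell_A\ttype_Y"), con)
close(con)

cells <- data.frame(cluster = c("1", "2"), row.names = c("cell_A", "cell_B"))
annotated <- add_annotation(cells, tmp, id_col = "cell_id")
```

If the real file stores cell IDs as row names rather than in a named column, matching against `rownames(meta)` instead would be the equivalent step.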
Hi @bschilder,
Version v0.10.2 of SCopeLoomR should fix the issues you were having.