GregorySchwartz/too-many-cells

Strange warnings occurred during the procedure

MenghuaZhang86 opened this issue · 3 comments

Hello, thanks for developing the great tool for single-cell analysis.

I got some strange warnings when running the following command:

docker run --rm -v /data:/data --name toomanycells gregoryschwartz/too-many-cells:2.0.0.0 make-tree \
    --matrix-path /data/share/scRNAseq/results/6N-total/outs/filtered_feature_bc_matrix \
    -Z 6N-total \
    --matrix-path /data/share/scRNAseq/results/6N-Bcell/outs/filtered_feature_bc_matrix \
    -Z 6N-Bcell \
    --labels-file /data/TooManyCells/test2.labels.csv \
    --filter-thresholds "(250, 1)" \
    --draw-collection "PieRing" \
    --output /data/TooManyCells/out2 \
    > /data/TooManyCells/clusters2.csv

--matrix-path points to the Cell Ranger output directory.
-Z gives the sample name, which is the same as the label in the labels file, like this:
[Screenshot of the labels file]
test2.labels.csv contains all barcodes from the 6N-total and 6N-Bcell samples.

The warnings are as follows:

[=======================>...................]  55%
Cell missing a label.
Warning: Problem in diversity, skipping cluster_diversity.csv output ...
[===========================>...............]  64%
Cell has no label: Id {unId = "AATCGGTTCTGCTGCT-1-6N-total"}
Warning: Problem in clumpiness, skipping clumpiness output ...
[===========================================] 100%

However, the entry AATCGGTTCTGCTGCT-1,6N-total is present in test2.labels.csv.
Some of the results are missing, such as clumpiness.csv, clumpiness.pdf, and cluster_diversity.csv. dendrogram.svg is entirely black and white, with no colors to distinguish the samples (or labels?).

Could you tell me how to solve this? Thank you very much!

The issue is that -Z forces the cells to have new barcodes (so AAACCTGAGAGACTAT-1 becomes AAACCTGAGAGACTAT-1-6N-total and gets the label 6N-total), but your labels file still has (I assume) the old barcodes. -Z was created so that you don't need a labels file when each matrix is basically a sample, and to avoid overlapping barcodes (by appending the sample name to the barcode). So in this case, you can either drop the labels file or drop -Z (and preferably update your barcodes so they don't overlap, for instance -2 instead of -1).
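To make the mismatch concrete, here is a minimal Python sketch, not part of too-many-cells itself. It assumes the renaming works exactly as described above (the sample name is appended to the barcode with a hyphen). The helper names `barcode_after_z` and `renumber_suffix` are hypothetical, chosen only for illustration.

```python
import re


def barcode_after_z(barcode: str, sample: str) -> str:
    """Mimic the renaming described above: -Z appends the sample name."""
    return f"{barcode}-{sample}"


def renumber_suffix(barcode: str, index: int) -> str:
    """Replace a trailing numeric suffix such as -1 with -<index>."""
    return re.sub(r"-\d+$", f"-{index}", barcode)


# Labels keyed by the original Cell Ranger barcode, as in the labels file.
labels = {"AATCGGTTCTGCTGCT-1": "6N-total"}

# With -Z, the cell is looked up under its renamed barcode, which the
# labels file does not contain, hence the "Cell has no label" warning.
renamed = barcode_after_z("AATCGGTTCTGCTGCT-1", "6N-total")
found = renamed in labels

# Without -Z, giving the second sample a different suffix keeps
# barcodes from the two matrices distinct.
second_sample_barcode = renumber_suffix("AATCGGTTCTGCTGCT-1", 2)

print(renamed, found, second_sample_barcode)
```

If you go the renumbering route, the same suffix change would presumably have to be applied both to the labels file and to the barcodes in the corresponding matrix directory, so that the two still agree.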

Thank you very much! That worked and I got the results.

BTW, if I want to focus on a subset of the cells in the tree, should I select those cells myself first and then run too-many-cells?

You can use --root-cut to create a new root in the tree; hopefully that is what you are looking for.