sistm/cytometree

How to determine whether KDE performs accurately for a given distribution using 1 marker?

Hi there,

Thank you for this useful tool!

I want to determine the gate in 1-D (i.e., on a histogram of one channel only, CD4), so I ran:

F37_sce_c14_cytometree <- cytometree::CytomeTree(t(assay(F37_sce_c14_cytometree)[c("CD4", "CD185"),]), minleaf = 1, t = 0.1, force_first_markers=c("CD4"))

Q1. The code ran fine, but would you mind advising how we can determine whether the KDEs can be trusted? Is it also possible to draw the normalised curve that actually shows the final partition?

From FlowJo of the same cells:

Screenshot 2023-09-07 at 18 54 03

Q2. With the code written as above, does that mean the partition of the cells is determined using only CD4 and not CD185? The tree (below) shows splitting on CD4 only, with NAs in the CD185 column, but I wanted to make sure, because when I plotted cytometree::Annotation I saw plots for both CD4 and CD185 (gray-shaded plots).

Screenshot 2023-09-07 at 18 28 13

CD4:
Screenshot 2023-09-07 at 18 33 48

CD185:
Screenshot 2023-09-07 at 18 37 10

Screenshot 2023-09-07 at 18 37 56

Thank you for your help!

Hi,

Thank you for your interest in cytometree.

Q1. The kernel density estimate does not look bimodal on your data for CD4, which is also the case on the log scale according to your FlowJo screen capture. As the KDE is fully non-parametric, it is trustworthy as long as you have enough cells. In your case, it does not look like you have two cell populations expressing different levels of CD4.
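
One way to get a feel for whether a 1-D density estimate can be trusted is to check that its number of modes is stable when the bandwidth changes. The sketch below uses base R's density() on synthetic data, not cytometree's internal estimator and not the data from this issue. The helper count_modes and the 5% height cut-off are hypothetical choices made for illustration only.

```r
# Synthetic stand-ins for one marker's transformed intensities
# (illustrative only, NOT the data discussed in this issue).
set.seed(1)
x_uni <- rnorm(5000, mean = 2, sd = 0.5)
x_bi  <- c(rnorm(2500, mean = 0, sd = 0.5),
           rnorm(2500, mean = 4, sd = 0.5))

# Hypothetical helper: count local maxima of a base-R kernel density estimate.
# `adjust` multiplies the default bandwidth; `min_rel_height` drops tiny bumps.
count_modes <- function(x, adjust = 1, min_rel_height = 0.05) {
  d <- density(x, adjust = adjust)
  y <- d$y
  # An interior grid point is a peak if it is higher than both neighbours.
  i <- 2:(length(y) - 1)
  is_peak <- y[i] > y[i - 1] & y[i] > y[i + 1]
  peaks <- y[i][is_peak]
  # Keep only peaks that reach a fraction of the tallest point of the curve.
  sum(peaks >= min_rel_height * max(y))
}

# A mode count that stays the same across bandwidths is more believable
# than one that appears only with a very small bandwidth.
adjust_values <- c(0.5, 1, 2)
modes_bi  <- sapply(adjust_values, function(a) count_modes(x_bi,  adjust = a))
modes_uni <- sapply(adjust_values, function(a) count_modes(x_uni, adjust = a))
print(rbind(adjust = adjust_values, bimodal = modes_bi, unimodal = modes_uni))
```

With real data, x would be the vector of CD4 values passed to CytomeTree. Small adjust values can reveal spurious bumps, so a second mode that disappears at the default bandwidth is weak evidence for a second population.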

Q2. Your code only forces CD4 to be used for the first split, but it does not formally prevent CD185 from being used afterwards to split your cells into more populations. Yet, judging by your output for $combinations (with NAs in the CD185 column), it was not used to further split the cells, likely because it did not meet the t=0.1 threshold. This hints that your data contain only one cell population, homogeneous in CD185, once split in two according to CD4.
cytometree::Annotation will characterize each identified cell population for all markers, regardless of whether those markers were actually used to split the cells.
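
The check described above (a marker whose column in $combinations contains only NAs was not used for any split) can be sketched in base R. The helper markers_used is hypothetical, and the toy matrix only mimics the layout described in this thread (one column per marker, NA where a marker was not used). The exact structure of a real $combinations output is an assumption here.

```r
# Hypothetical helper: return the markers that have at least one non-NA entry,
# i.e. the markers that took part in at least one split.
markers_used <- function(combinations, markers = colnames(combinations)) {
  sub <- combinations[, markers, drop = FALSE]
  markers[colSums(!is.na(sub)) > 0]
}

# Toy matrix mimicking the situation in this issue (illustrative values only):
# two populations split on CD4, CD185 never used.
toy <- cbind(CD4 = c(0, 1), CD185 = c(NA, NA))
print(markers_used(toy))

# Assumed usage on a fitted object from the call in the question:
# markers_used(F37_sce_c14_cytometree$combinations, c("CD4", "CD185"))
```

Passing the marker names explicitly, as in the commented line, avoids picking up any non-marker columns the real output might contain.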

Best,
Boris