markrobinsonuzh/cytofWorkflow

facet_grid with scale_color_gradientn results in incorrect tSNE plots

Closed this issue · 4 comments

(Let me know if you would prefer that I submit this issue to ggplot2 instead of here.)
I wanted to create tSNE plots colored by specific markers, but faceted by condition (timepoint) and condition2 (clinical response). I have used both facet_wrap and facet_grid, with either (condition ~ condition2), (~condition), or (sample_id ~ condition2) where appropriate. I get the correct tSNE plots for only two of these three facet specifications, and am desperately trying to figure out whether it is a script error on my part or a bug. Example plots are shown below, in the same order as the script below. Thank you in advance.

GitHub inquiry WorkFlow.pptx

# ----tsne-plot-with specific marker,
# Plot t-SNE colored by specific marker

# packages used by the plotting code below
library(ggplot2)
library(RColorBrewer)
library(glue)

dr <- data.frame(tSNE1 = tsne_out$Y[, 1], tSNE2 = tsne_out$Y[, 2],
                 expr[tsne_inds, lineage_markers])
dr$sample_id <- sample_ids[tsne_inds]
mm <- match(dr$sample_id, md$sample_id)
dr$condition <- md$condition[mm]

# add extra conditions
dr$condition2 <- md$condition2[mm]
dr$cell_clustering1 <- factor(cell_clustering1[tsne_inds], levels = 1:nmc)

# tSNE all markers separately (select only numeric data)
pdf(glue("./{out_dir}/tsne_by_markers.pdf"))
for (i in lineage_markers) {
  tsne_intensity <- ggplot(dr, aes(x = tSNE1, y = tSNE2, color = dr[, i])) +
    geom_point(size = 1.0) +
    theme_bw() +
    scale_color_gradientn(i,
      colours = colorRampPalette(rev(brewer.pal(n = 11, name = "Spectral")))(50))
  print(tsne_intensity)
}
dev.off()

# tSNE all markers by condition2
pdf(glue("./{out_dir}/tsne_by_markers_by_condition2.pdf"))
for (i in lineage_markers) {
  tsne_intensity <- ggplot(dr, aes(x = tSNE1, y = tSNE2, color = dr[, i])) +
    facet_wrap(~ condition2) +
    geom_point(size = 1.0) +
    theme_bw() +
    scale_color_gradientn(i,
      colours = colorRampPalette(rev(brewer.pal(n = 11, name = "Spectral")))(50))
  print(tsne_intensity)
}
dev.off()

# tSNE all markers by condition and condition2
pdf(glue("./{out_dir}/tsne_by_markers_by_condition_and_condition2.pdf"))
for (i in lineage_markers) {
  tsne_intensity <- ggplot(dr, aes(x = tSNE1, y = tSNE2, color = dr[, i])) +
    facet_grid(condition ~ condition2) +
    geom_point(size = 1.0) +
    theme_bw() +
    scale_color_gradientn(i,
      colours = colorRampPalette(rev(brewer.pal(n = 11, name = "Spectral")))(50))
  print(tsne_intensity)
}
dev.off()

I don't see a "bug" per se, but maybe you can try the following:
I personally never color by a numeric vector (here, the expression of a specific marker), but instead specify the variable in the data.frame to color by. As you're using a loop, this means passing the marker name as a character string to color in aes_string(). I.e., something like:

for (i in lineage_markers) {
  ggplot(dr, aes_string("tSNE1", "tSNE2", col = i)) + ...
}

Not sure this will solve anything or just give you the same plot as before... but maybe it's useful anyways to know that you can pass a character string to aes_string() rather than bare symbols or numeric vectors to aes().
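For readers on a recent ggplot2: aes_string() has since been deprecated, and the documented replacement for mapping a column by name is the .data pronoun inside aes(). The sketch below shows the same idea as the loop above on made-up data. The marker names CD3 and CD4, the condition labels, and the helper plot_marker() are hypothetical stand-ins, not objects from the workflow, and this is not claimed to be the fix for the reported problem.

```r
library(ggplot2)

# Toy stand-in for the `dr` data frame from the thread (all values made up).
set.seed(1)
n <- 200
dr <- data.frame(
  tSNE1 = rnorm(n),
  tSNE2 = rnorm(n),
  CD3 = runif(n),   # hypothetical marker columns
  CD4 = runif(n),
  condition = sample(c("pre", "post"), n, replace = TRUE),
  condition2 = sample(c("R", "NR"), n, replace = TRUE)
)
lineage_markers <- c("CD3", "CD4")
pal <- colorRampPalette(rev(RColorBrewer::brewer.pal(n = 11, name = "Spectral")))(50)

# Build one plot per marker. .data[[marker]] looks the column up inside the
# plot data, so ggplot2 splits it into facets together with the coordinates.
# A function (rather than a bare for loop) keeps `marker` fixed per plot,
# because aes() evaluates its arguments only when the plot is built.
plot_marker <- function(marker) {
  ggplot(dr, aes(x = tSNE1, y = tSNE2, color = .data[[marker]])) +
    facet_grid(condition ~ condition2) +
    geom_point(size = 1) +
    theme_bw() +
    scale_color_gradientn(name = marker, colours = pal)
}

plots <- lapply(lineage_markers, plot_marker)
names(plots) <- lineage_markers
# print(plots[["CD3"]])  # draw a single plot
```

Wrapping the plot in a function matters when plots are stored and drawn later; printing inside the loop, as in the original script, avoids the same pitfall.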

In other news, I'm not sure which version of the workflow / package you are using, but I remember tsne_inds from a long time ago. If you use a version based on the SingleCellExperiment (SCE) class, you can visualize dimensionality reductions with commands such as plotDR(sce, "TSNE", color_by = c("CD1", "CD2", ...), facet_by = "condition") or plotDR(sce, "TSNE", color_by = "CDx", facet_by = c("condition", "patient_id")), to visualize e.g. 10 markers faceted by 1 variable, or 1 marker faceted by 2 variables, and so on. See the CATALYST vignette for some examples of the different coloring/faceting options supported.

Thanks Helena - gave it a whirl but the graph was the same. I will try your suggestion to visualize a subset of markers. Thanks for the suggestion.

Hi Helena: update on the facet_grid/scale_color_gradientn issue: updating ggplot2 and dplyr completely fixed the issue. :)
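Since the fix turned out to be a package update, it can help to record which versions are installed before and after updating. A minimal base-R sketch follows; the helper name installed_version() is made up for illustration, and the thread does not say which versions were involved.

```r
# Report the installed version of a package as a string.
# Returns NA instead of raising an error if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

pkgs <- c("ggplot2", "dplyr")
versions <- vapply(pkgs, installed_version, character(1))
print(versions)

# To update (needs network access and a configured CRAN mirror):
# install.packages(c("ggplot2", "dplyr"))
```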

Awesome, thanks for the update! Glad to hear this was not a bug on our side that could have messed with any users’ plots ;) As a side note, I noticed that the cell ordering in the SCE can also give “illusions” in reduced-dimension plots. It might be worth shuffling the cells right after creating the SCE to protect against this.
Also, you might still be interested in trying out plotDR(), which is quite flexible in plotting multiple markers and faceting by 1-2 metadata variables.
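The shuffling suggestion above can be sketched on a plain data frame. Points are drawn in row order, so when cells are stored group by group, the last group is painted on top of the others. The data here are made up; for an SCE, where cells are columns, the analogous step would presumably be sce[, sample(ncol(sce))], which is an assumption and not a line taken from this thread.

```r
set.seed(42)  # fixed seed so the shuffled order is reproducible

# Toy data frame in which cells are stored group by group (made-up values).
dr <- data.frame(
  cell_id = 1:6,
  tSNE1 = c(0.1, 0.2, 0.3, 0.1, 0.2, 0.3),
  tSNE2 = c(1.0, 1.1, 1.2, 1.0, 1.1, 1.2),
  condition = rep(c("A", "B"), each = 3)
)

# Permute the row order once, before plotting. Every cell is kept;
# only the order in which points are drawn changes.
dr_shuffled <- dr[sample(nrow(dr)), ]
```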