Issue in the initial steps of data transformation and exploration
Closed this issue · 7 comments
Hello, I am facing this problem already in the beginning. I am not a big coder but I know how to work on R. I am unable to pin point the issue. Not sure what is different, as I am adapting exactly what has been mentioned in the workflow. Any help is great appreciated.
Thank you so much,
CyTOF analysis
Sudip Das
6/3/2020
Experiment information:
- THP-1 monocytic cells are differentiated into macrophages and
stimulated with various stimuli.
Sample information:
- SDCY1. Monocytes only (undifferentiated).
- SDCY2. Monocytes + DMSO
- SDCY3. Macrophages - Untreated.
- SDCY4. Macrophages + Lipopolysaccharide (LPS)
- SDCY5. Macrophages + Peptidoglycan (PGN)
- SDCY6. Macrophages + LPS + PGN
- SDCY7. Macrophages + Glucocorticosteroid mixture (GCM) for
immunosuppression.
Instrument and QC:
- CyTOF performed at FCCF-CyTOF, EPFL, Lausanne.
- CyTOF tuning (QC)- Yes
- Normalization -Yes (Fluidigm CyTOF software)
- Cleaning - Bead removal with FlowJo
- Gating - Yes (FlowJo) Cleaning has been performed
Debarcoding -
- Yes (Fluidigm CyTOF software)
- Minimum barcode separation : 0.04
- Yield : 67.80 percent
Technical validation - OK
7 FCS files are sent along with the report by SDA (FCCF) on 5/06/20 QC
has been performed by JD (bioinformatician, FCCF) on 4/06/20.
- SDCY1 (223100 cells)
- SDCY2 (222942 cells)
- SDCY3 (24426 cells)
- SDCY4 (23367 cells)
- SDCY5 (21016 cells)
- SDCY6 (28258 cells)
- SDCY7 (35546 cells)
Version info:
Rstudio Version 1.2.5033
R.version
## platform x86_64-apple-darwin15.6.0
## arch x86_64
## os darwin15.6.0
## system x86_64, darwin15.6.0
## version.string R version 3.6.3 (2020-02-29)
## nickname Holding the Windsock
Setting up directory
setwd("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF")
Installing and loading packages.
All packages are reinstalled to new version
library(devtools)
## Loading required package: usethis
library(flowCore)
library(cytofCore)
library(readxl)
library(ggplot2)
library(reshape2)
library(matrixStats)
library(limma)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:matrixStats':
##
## count
## The following object is masked from 'package:flowCore':
##
## filter
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggrepel)
library(RColorBrewer)
library(pheatmap)
Saving data and loading
save.image("cytof_exp1.RData")
Data import method Robinson lab
Reading metadata file
md <- read_excel("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/macros_metadata.xlsx")
Make sure condition variables are factors with the right levels
md$condition <- factor(md$condition, levels = c("undiff", "dmso","diff","lps","pgn","lps + pgn","gcm"))
data.frame(md)
## file_name sample_id condition patient_id
## 1 monos.fcs SDCY1 undiff exp1
## 2 monos_dmso.fcs SDCY2 dmso exp1
## 3 macros.fcs SDCY3 diff exp1
## 4 macros_lps.fcs SDCY4 lps exp1
## 5 macros_pgn.fcs SDCY5 pgn exp1
## 6 macros_lps_pgn.fcs SDCY6 lps + pgn exp1
## 7 macros_gcm.fcs SDCY7 gcm exp1
Define colors for conditions
color_conditions <- c("#DC143C", "#228B22","#008B8B","#0000CD","#800080","#D2691E","#778899")
names(color_conditions) <- levels(md$condition)
Reading FCS filename
fcs_raw <- read.flowSet(md$file_name, path="~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/20200603_Sudip_Human_Microbiota",transformation = FALSE, truncate_max_range = FALSE)
fcs_raw
## A flowSet with 7 experiments.
##
## column names:
## Au197Di BCKG190Di Ba138Di Bi209Di Ce140Di Center Cs133Di Dy161Di Dy162Di Dy163Di Dy164Di Er166Di Er167Di Er168Di Er170Di Eu151Di Eu153Di Event_length Gd155Di Gd156Di Gd157Di Gd158Di Gd160Di Ho165Di I127Di In113Di In115Di Ir191Di Ir193Di La139Di Lu175Di Nd142Di Nd143Di Nd144Di Nd145Di Nd146Di Nd148Di Nd150Di Offset Pb208Di Pd102Di Pd104Di Pd105Di Pd106Di Pd108Di Pd110Di Pr141Di Pt194Di Pt195Di Pt198Di Residual Rh103Di Sm147Di Sm149Di Sm152Di Sm154Di Sn120Di Tb159Di Tm169Di Xe131Di Y89Di Yb171Di Yb172Di Yb173Di Yb174Di Yb176Di Time
Antigen information
panel <- read_excel("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/antigen_panel.xlsx")
data.frame(panel)
## Metal Isotope Antigen Lineage Functional
## 1 Dy 164 Siglec-8 0 1
## 2 Er 168 CD206 0 1
## 3 Er 170 CD169 0 1
## 4 Eu 151 CD11b 1 0
## 5 Gd 156 HLA-DR 0 1
## 6 Gd 160 CD14 1 0
## 7 Ho 165 CD64 0 1
## 8 Nd 142 CD11c 0 1
## 9 Nd 143 CD68 0 1
## 10 Nd 145 CD71 0 1
## 11 Nd 146 F4-80 0 1
## 12 Sm 154 PD-L1 0 1
## 13 Tm 169 CD163 0 1
## 14 Yb 171 CD86 0 1
## 15 Yb 173 CD81 0 1
## 16 Yb 174 CD88 0 1
Parameters
panel_fcs <- pData(parameters(fcs_raw[[1]]))
#panel_fcs<-na.omit(panel_fcs)
head(panel_fcs)
## name desc range minRange maxRange
## $P1 Au197Di 197Au 4096 0 4095
## $P2 BCKG190Di 190BCKG 4096 0 4095
## $P3 Ba138Di 138Ba 8192 0 8191
## $P4 Bi209Di 209Bi 4096 0 4095
## $P5 Ce140Di 140Ce 12000 0 11999
## $P6 Center <NA> 12000 0 11999
Data transformation Robinson lab
Lineage markers
(lineage_markers <- panel$Antigen[panel$Lineage == 1])
## [1] "CD11b" "CD14"
Functional markers
(functional_markers <- panel$Antigen[panel$Functional == 1])
## [1] "Siglec-8" "CD206" "CD169" "HLA-DR" "CD64" "CD11c"
## [7] "CD68" "CD71" "F4-80" "PD-L1" "CD163" "CD86"
## [13] "CD81" "CD88"
arcsinh transformation and column subsetting
fcs <- fsApply(fcs_raw, function(x, cofactor = 5){
colnames(x) <- panel_fcs$desc
expr <- exprs(x)
expr <- asinh(expr[, c(lineage_markers, functional_markers)] / cofactor)
exprs(x) <- expr
x
})
fcs
## Error in expr[, c(lineage_markers, functional_markers)] : subscript out of bounds
Diagnostic plots Robinson lab
Generate sample IDs corresponding to each cell in the expr
matrix
sample_ids <- rep(md$sample_id, fsApply(fcs_raw, nrow))
head(sample_ids)
## [1] "SDCY1" "SDCY1" "SDCY1" "SDCY1" "SDCY1" "SDCY1"
tail(sample_ids)
## [1] "SDCY7" "SDCY7" "SDCY7" "SDCY7" "SDCY7" "SDCY7"
ggdf <- data.frame(sample_id = sample_ids, expr)
ggdf <- melt(ggdf, id.var = "sample_id",
value.name = "expression", variable.name = "antigen")
mm <- match(ggdf$sample_id, md$sample_id)
ggdf$condition <- md$condition[mm]
ggplot(ggdf, aes(x = expression, color = condition,
group = sample_id)) +
geom_density() +
facet_wrap(~ antigen, nrow = 4, scales = "free") +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1),
strip.text = element_text(size = 7), axis.text = element_text(size = 5)) +
scale_color_manual(values = color_conditions)
## Error in data.frame(sample_id = sample_ids, expr) : arguments imply differing number of rows: 578655, 0
Extract expression
expr <- fsApply(fcs, exprs)
dim(expr)
## Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘fsApply’ for signature ‘"flowFrame"
rng <- colQuantiles(expr, probs = c(0.01, 0.99))
expr01 <- t((t(expr) - rng[, 1]) / (rng[, 2] - rng[, 1]))
expr01[expr01 < 0] <- 0
expr01[expr01 > 1] <- 1
## Error: Argument 'x' is of class ‘function’, but should be a matrix. The use of a ‘function’ is not supported, the correctness of the result is not guaranteed. Please update your code accordingly.
What version of the workflow are you following? It seems to me this code is at least 3 years old. All of the above should come down to <10 lines with the current version.
I am following this link:
http://129.217.206.11/packages/3.7/workflows/vignettes/cytofWorkflow/inst/doc/cytofWorkflow.html
Aha- may I suggest either this (short version): https://bioconductor.org/packages/release/bioc/vignettes/CATALYST/inst/doc/differential.html or this (long version): https://f1000research.com/articles/6-748/v4
The former should be much easier, if you are not too familiar with R. The later will give more theoretical background on each step.
Great thanks a lot ! I will give this a go and get back if there are any such issues.
Also, I just noticed you are running R 3.6. I can highly recommend updating to 4.0+ to take advantage of the new features, and make sure you're not writing an analysis that is already "out-dated".
Agree with @HelenaLC .. these code snippets are very old. I would follow the workflow from BioC 3.11 (current Bioconductor release from April-October 2020):
https://bioconductor.org/packages/3.11/workflows/vignettes/cytofWorkflow/inst/doc/cytofWorkflow.html
(actually, the v4 link above is even a bit outdated)
Note: this will require you to upgrade to R 4.0 as well.
Thank you for your help! I will follow up on all of this.