immunogenomics/cna

What is the meaning of `batch`

altairwei opened this issue · 6 comments

Hi,

I am having trouble understanding what `batch` means. In my case, samples were grown under three conditions (control, treatment 1, and treatment 2) and collected at three time points (days 1 to 3), with two replicates for each combination of condition and time point. However, each sample was grown, collected, and sequenced at a different time. In this sense, each sample is its own batch.

The final sample information is shown in the following table:

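| SampleID | Time | Treatment | Replicate |
|----------|------|-----------|-----------|
| S1       | 1    | ctrl      | 1         |
| S2       | 1    | ctrl      | 2         |
| S3       | 1    | stim1     | 1         |
| S4       | 1    | stim1     | 2         |
| S5       | 1    | stim2     | 1         |
| S6       | 1    | stim2     | 2         |
| ...      | ...  | ...       | ...       |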

In this case, is there a BATCH?
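
To make the question concrete, here is a minimal sketch in plain Python (it does not use cna) that counts how many samples fall into each batch under two candidate labellings. It uses only the six rows shown in the table; the elided rows are not reproduced. The names `samples`, `batch_sizes`, `per_sample`, and `per_group` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# The six rows shown in the table (rows hidden behind "..." are left out).
samples = [
    {"SampleID": "S1", "Time": 1, "Treatment": "ctrl",  "Replicate": 1},
    {"SampleID": "S2", "Time": 1, "Treatment": "ctrl",  "Replicate": 2},
    {"SampleID": "S3", "Time": 1, "Treatment": "stim1", "Replicate": 1},
    {"SampleID": "S4", "Time": 1, "Treatment": "stim1", "Replicate": 2},
    {"SampleID": "S5", "Time": 1, "Treatment": "stim2", "Replicate": 1},
    {"SampleID": "S6", "Time": 1, "Treatment": "stim2", "Replicate": 2},
]


def batch_sizes(rows, key):
    """Count the samples in each batch, where key(row) gives the batch label."""
    return Counter(key(row) for row in rows)


# Candidate 1: every sample is its own batch (grown and sequenced separately).
per_sample = batch_sizes(samples, lambda r: r["SampleID"])

# Candidate 2 (hypothetical): label batches by Time x Treatment group.
per_group = batch_sizes(samples, lambda r: (r["Time"], r["Treatment"]))

print(len(per_sample), set(per_sample.values()))  # 6 {1}
print(sorted(per_group.values()))                 # [2, 2, 2]
```

Under the first labelling, every batch contains exactly one sample, so the batch label says nothing beyond the sample ID. The second labelling is determined entirely by Time and Treatment, which are the attributes of interest here, so it cannot be separated from them.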

When I run the following code, which column should I pass as `batches`?

res = cna.tl.association(d,                    # dataset
            d.samplem.case,                    # sample-level attribute of interest (case/control status)
            covs=d.samplem[['male']],          # covariates to control for (in this case just one)
            batches=d.samplem.batch)           # batch assignments for each sample, so that cna can account for batch effects

@yakirr Thank you for your reply! In the case I mentioned above, the sample attributes I am interested in are Time or Treatment.

@yakirr There are no "same" samples in the biological sense, because they were all grown independently. But in terms of experimental design, there are two samples (biological replicates) with the same treatment at the same time point, and these could perhaps be called the "same sample", right?

@yakirr So my case does not include a batch. Thanks for your patience!