Recommendations for computing normalizations with several samples and one input
I have been testing WiggleTools for a few days and was wondering whether you have any advice on the normalisation step. Does Ensembl recommend a specific process?
For example, suppose you have done ChIP-seq for one histone mark, with two (or more) replicates and one input, all in BAM format (from a bwa alignment, for example). In the end you want a single "normalised" wiggle file to display in a viewer or to plot the signal profile around several coordinates.
That final wiggle file would be normalised in RPKM (which I know is what deeptools does) or in RPM (scaled only by the number of mapped reads, which is what I am doing).
I was thinking of trying the following. Does it seem OK? Thanks:
Normalise Replicate 1:
wiggletools write Rep1.normalised.wig scale 1/TotalMappedReads scale 1000000 Rep1.bam
Normalise Replicate 2:
wiggletools write Rep2.normalised.wig scale 1/TotalMappedReads scale 1000000 Rep2.bam
Normalise Control:
wiggletools write Control.normalised.wig scale 1/TotalMappedReads scale 1000000 Control.bam
Final normalisation to get one wiggle file:
wiggletools write mean.normalised.posOnly.wig trim lengths.bed gt 0 diff mean Rep1.normalised.wig Rep2.normalised.wig : Control.normalised.wig
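The TotalMappedReads placeholder in the commands above has to be replaced by an actual number for each BAM file. A minimal sketch, assuming the mapped-read count is already known (obtaining it with `samtools view -c -F 4` is an assumption, not part of the original commands), folds the two `scale` steps into a single factor. The function name `rpm_factor` and the example count are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the RPM scaling factor 1000000 / N
# for a sample with N mapped reads.
rpm_factor() {
    awk -v n="$1" 'BEGIN {
        if (n <= 0) { exit 1 }          # refuse empty or invalid counts
        printf "%.10g\n", 1000000 / n
    }'
}

# Assumed count for illustration only; in practice it would come from the
# BAM file, e.g. mapped=$(samtools view -c -F 4 Rep1.bam)
mapped=20000000
factor=$(rpm_factor "$mapped")

# Print the single-scale command instead of running it, so the sketch
# works without WiggleTools installed.
echo "wiggletools write Rep1.normalised.wig scale $factor Rep1.bam"
```

Each sample (both replicates and the control) would get its own factor, computed from its own mapped-read count.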
How would you have done it for RPKM?
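For reference, RPKM is usually defined as the read count in a region divided by the region length in kilobases and by the total mapped reads in millions. A minimal sketch of that arithmetic follows; the function name `rpkm` and the example numbers are hypothetical, and the step of counting reads per region is left out:

```shell
#!/bin/sh
# Hypothetical helper:
#   RPKM = count * 1e9 / (total_mapped_reads * region_length_bp)
rpkm() {
    awk -v c="$1" -v n="$2" -v l="$3" 'BEGIN {
        if (n <= 0 || l <= 0) { exit 1 }   # avoid division by zero
        printf "%.10g\n", c * 1e9 / (n * l)
    }'
}

# Illustrative numbers: 50 reads in a 1000 bp region, 20 million mapped reads
rpkm 50 20000000 1000   # prints 2.5
```

For a region of exactly 1 kb the RPKM and RPM values coincide, so the two normalisations differ only by the region-length term.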
Hello @LucoLab,
What you suggest is perfectly sensible, and we do it in our pipelines. A more advanced approach would take into account the mappability of the different regions of the genome, as done in align2rawsignal.
HTH,