################################################### # # Dave Gerrard, University of Manchester # 2011 # ################################################### ## Epigenome #files were obtained using wget within the relevant data directory wget -i sources.H3K4me3 #folders for output were created ####### Attempt 1: binCOunts from BigWig files. # bigWig (.bw) files were created using wigToBigWig batchCreateBws.sh -> createBwOnFiles.sh #bin counts were made using batchBinCounts.sh -> binCountOnFileBase.sh -> GetBinsFromBed.pl # took 30 hours for chr11. # NOT RECOMENDED:- # slow. # inaccurate (fixed width wig files do not store co-ordinates of reads only bins) # may be useful if need data from around specific points e.g. TSS # perhaps use featureBits instead ###### Attempt 2: regular interval binCounts from Bed # starting .bed.gz files are much larger (~1.5Gb uncompressed) # beds must be sorted (beware +/- strands in bed files.) # download files as above # copy and uncompress # split to separate chroms and sort each batchSplitBedGzToBedByChr.sh -> splitBedOnChr.sh # N.B. could use Kent: bedSplitOnChrom # run bincounting on each chrom batchBinCountsRegFromBed.sh -> binCountRegFromBed.sh -> countBedToRegBins.pl # re-combine individual chrom results? batchCatChromFilesInDir.sh -> catChromFilesInDir.sh # bind count results across samples bindGenomeBins.R #N.B. bedSort does not order chromosomes. ## Comparing two sets of binCounts. #deseq?