PoisonAlien/trackplot

Redefine chromHMM feature colors

User06511-lab opened this issue · 4 comments

Hi @PoisonAlien, thank you for creating this tool!

My data is from mouse, and UCSC does not provide a matching mm10 chromHMM bigBed file for my data type, so I looked for an alternative source and am using the attached file, mESC_E14_12_dense.annotated.bed.gz, from guifengwei.

Although I can execute the commands successfully,

#Path to bigWig files
bw_path <- "main_path/IGV_BW"
bigWigs <- c(file.path(bw_path, "file1.bw"),
            file.path(bw_path, "file2.bw"),
            file.path(bw_path, "file3.bw"),
            file.path(bw_path, "file4.bw"))
#Make a table of bigWigs along with ref genome build
bigWigs <- read_coldata(bws = bigWigs, build = "mm10")
#Region to plot
target_loci <- "chr3:129199878-129219590"
#Extract bigWig signal for the locus of interest
t <- track_extract(colData = bigWigs, loci = target_loci)

# Add cytoband and change colors for each track
track_cols <- colors

# Add peak information
peak_annotation <- c("main_path/standard.peaks") #Standardized Peak file
# Add chromHMM track based on mESC data
chromHMM_peaks <- "mESC_E14_12_dense.annotated.bed" # From https://github.com/guifengwei/ChromHMM_mESC_mm10

# Plot
pdf("test.pdf")
track_plot(summary_list = t, 
          col = track_cols, 
          show_ideogram = TRUE, 
          peaks = peak_annotation, 
          chromHMM = chromHMM_peaks,
          y_max = 15)
dev.off()

the chromHMM track is still colored according to the hg19/hg38 chromHMM definition, not according to the states and colors defined in my own chromHMM file.

I know that in your "H1_chromHMM.bed", the 4th column specifies the types of chromHMM feature.

head -n 5 H1_chromHMM.bed

chr6 31125621 31126021 1
chr6 31126021 31127821 2
chr6 31127821 31128221 6
chr6 31128221 31129421 11
chr6 31129421 31129621 7

My mESC_E14_12_dense.annotated.bed also specifies the feature type in the 4th column, but the nomenclature differs from yours. In addition, its 9th column contains the RGB code for each feature.

head -n 5 mESC_E14_12_dense.annotated.bed

track name="mESC_E14_12" description=" mESC_E14_12State (Emission ordered)" visibility=1 itemRgb="On"
chr10 0 1000 12_LowSignal/RepetitiveElements 0 . 0 1000 255,255,204
chr10 1000 3107400 2_Intergenic 0 . 1000 3107400 0,153,204
chr10 3107400 3108400 1_Insulator 0 . 3107400 3108400 0,0,255
chr10 3108400 3118600 2_Intergenic 0 . 3108400 3118600 0,153,204

I am wondering if there is a way to modify 'track_plot' so that it can recognize the need for redefining colors.

Thanks a lot!

Hi,

You can define your own set of colors for the chromHMM states with the chromHMM_cols argument of track_plot().

e.g.

chromhmm_cols  = c("red", "red4", "purple", "orange", "orange", "yellow", "yellow", 
             "blue", "darkgreen", "darkgreen", "lightgreen", "gray", "gray90", 
             "gray90", "gray90")
names(chromhmm_cols) = 1:15 #Rename accordingly

track_plot(..., chromHMM_cols = chromhmm_cols)

I hope this helps.
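
Since the BED file in question already carries an RGB triplet per state in its 9th column, the named color vector could also be derived from the file instead of being typed by hand. The following is a minimal base-R sketch, assuming a BED9 layout with a single track header line as in the excerpt above. The inline `bed_text` stands in for the real file, and `rgb_to_hex` and `state_cols` are hypothetical names, not part of trackplot.

```r
# Stand-in for the real file: header plus the four data rows quoted in the issue.
bed_text <- "track name=\"mESC_E14_12\" itemRgb=\"On\"
chr10 0 1000 12_LowSignal/RepetitiveElements 0 . 0 1000 255,255,204
chr10 1000 3107400 2_Intergenic 0 . 1000 3107400 0,153,204
chr10 3107400 3108400 1_Insulator 0 . 3107400 3108400 0,0,255
chr10 3108400 3118600 2_Intergenic 0 . 3108400 3118600 0,153,204"

# skip = 1 drops the track header line; for a real file, pass its path
# as the first argument instead of using text =.
bed <- read.table(text = bed_text, header = FALSE, skip = 1,
                  stringsAsFactors = FALSE)

# One row per distinct state: column 4 is the state name,
# column 9 the "R,G,B" string.
state_rgb <- unique(bed[, c(4, 9)])

# Convert a single "R,G,B" string into a hex color such as "#0000FF".
rgb_to_hex <- function(x) {
  v <- as.integer(strsplit(x, ",", fixed = TRUE)[[1]])
  rgb(v[1], v[2], v[3], maxColorValue = 255)
}

state_cols <- vapply(state_rgb[[2]], rgb_to_hex, character(1),
                     USE.NAMES = FALSE)
names(state_cols) <- state_rgb[[1]]
print(state_cols)
```

The resulting vector could then be passed as `chromHMM_cols = state_cols`, assuming track_plot() accepts hex strings like any other R color specification.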

Hi @PoisonAlien,

Thanks for the help! I have checked some loci and it is re-coloring the chromHMM track correctly.

However, I wonder if track_plot() automatically recognizes the numerical prefix before each feature name, such as the "12" in "12_LowSignal/RepetitiveElements"?

I noticed that your chromHMM BED file has only numbers in the 4th column, so using names(chromhmm_cols) = 1:15 makes perfect sense, but mine also contains other characters. Just curious to know.

Hi,

You need to name the color vector with all the unique values in your 4th column.

chromhmm_cols  = c("red", "red4", "purple", "orange", "orange", "yellow", "yellow", 
             "blue", "darkgreen", "darkgreen", "lightgreen", "gray", "gray90", 
             "gray90", "gray90")

# Assumes the BED file has been read into a data.frame of the same name
names(chromhmm_cols) = unique(mESC_E14_12_dense.annotated.bed[,4])
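
One detail worth keeping in mind: `unique()` returns the states in order of first appearance in the file, not in state-number order, so colors are paired with states in file order. If the palette is meant to follow the state numbers, the names can be sorted by their numeric prefix first. A small sketch, assuming state names of the form `<number>_<label>` as in the mESC file. Here `states` is a hand-typed stand-in for the result of `unique()` on the 4th column, and `example_cols` is an arbitrary palette.

```r
# Stand-in for unique(bed_data[, 4]); order mimics first appearance in the file.
states <- c("12_LowSignal/RepetitiveElements", "2_Intergenic", "1_Insulator")

# Extract the numeric prefix before the first underscore and sort by it.
state_num <- as.integer(sub("_.*$", "", states))
states_sorted <- states[order(state_num)]

# Arbitrary example palette, longer than the number of states on purpose.
example_cols <- c("blue", "gray", "gray90", "red")

# Trim the palette to the number of states so no color ends up with an NA name.
chromhmm_cols <- setNames(example_cols[seq_along(states_sorted)], states_sorted)
print(chromhmm_cols)
```

Trimming matters because assigning fewer names than there are colors leaves the surplus colors with `NA` names.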

Thank you, this works now.

bed_data <- read.table("mm10_chromHMM/mESC_E14_12_dense.annotated.bed", header = FALSE, stringsAsFactors = FALSE, skip = 1)
names(chromhmm_cols) <- unique(bed_data[,4])

unique(bed_data[,4])

[1] "12_LowSignal/RepetitiveElements" "2_Intergenic"
[3] "1_Insulator" "5_RepressedChromatin"
[5] "4_Enhancer" "3_Heterochromatin"
[7] "8_StrongEnhancer" "6_BivalentChromatin"
[9] "7_ActivePromoter" "11_WeakEnhancer"
[11] "10_TranscriptionElongation" "9_TranscriptionTransition"

However, when I supplied the feature names one by one, the final plot did not have the chromHMM track.

names(chromhmm_cols) <- c("1_Insulator", 
                        "2_Intergenic",
                        "3_Heterochromatin",
                        "4_Enhancer",
                        "5_RepressedChromatin",
                        "6_BivalentChromatin",
                        "7_ActivePromoter",
                        "8_StrongEnhancer",
                        "9_TranscriptionTransition",
                        "10_TranscriptionElongation",
                        "11_WeakEnhancer",
                        "12_LowSignal/RepetitiveElements")

pdf(file.path(outputPath, "test.pdf"))
track_plot(summary_list = t, 
          col = track_cols, 
          show_ideogram = TRUE, 
          peaks = peak_annotation, 
          chromHMM = chromHMM_peaks, chromHMM_cols = chromhmm_cols,
          y_min = 0, y_max = 15)
dev.off()

I guess I shouldn't worry about it much now since your suggestions worked, but I was slightly confused when this happened.
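
The thread does not establish why the hand-typed names made the track disappear. One thing that is cheap to rule out is a mismatch between the names on the color vector and the states actually present in the file. A sketch of such a check follows; `check_state_cols` is a hypothetical helper, not part of trackplot, and the example inputs are made up to show each kind of mismatch.

```r
# Compare a named color vector against the state labels found in the BED file.
check_state_cols <- function(cols, states) {
  nm <- names(cols)
  if (is.null(nm)) nm <- rep(NA_character_, length(cols))
  named <- nm[!is.na(nm) & nm != ""]
  list(
    unnamed_colors = length(cols) - length(named),  # colors with no usable name
    missing_states = setdiff(states, named),        # states without a color
    unknown_names  = setdiff(named, states)         # names not found in the file
  )
}

# Made-up example: one misspelled name and one color left unnamed.
states <- c("1_Insulator", "2_Intergenic")
cols <- c("blue", "gray", "gray90")
names(cols) <- c("1_Insulator", "2_intergenic")

res <- check_state_cols(cols, states)
str(res)
```

With the real data, `states` would be `unique(bed_data[,4])` and `cols` the vector passed to `chromHMM_cols`.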