Using different TSD lengths as a faceting variable
Hi,
I have a dataset with TSDs of different lengths and wanted to create a sequence logo for each length bin. I think it would be very handy to use ggplot's facet functionality for this. However, the current implementation raises an error when the sequences differ in length:
Error in letterMatrix(seqs) : Sequences in alignment must have identical lengths
I am not sure how difficult this behaviour is to implement, but it would help tremendously when exploring heterogeneous datasets, such as those from TE calling.
Best
Fritjof
This is my data:
head(df)
  TSD <chr>           TSD_length <int>
1 TAAAAATAAAGTCCT     15
2 AAAAGATTTGTGCAG     15
3 TGGGGGGACATTTTT     15
4 CCATTCTGATTTTTTT    16
5 ACAGGGAAAGGTTTTT    16
6 AAAAAGTGTGCTGGAGG   17
And my ggplot call:
p <- ggplot(df.pass.tsdlength.test)
p + geom_logo(data = df.pass.tsdlength.test$TSD, seq_type = "dna") +
  theme_logo() +
  facet_wrap(~ TSD_length)
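I think the underlying issue is that ggseqlogo doesn't work the way you might expect a ggplot geom to.
geom_logo() builds its plotting data from the sequences passed to its own data argument, so it does not even require passing data to ggplot(), and it still needs all sequences in one logo to have the same length.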
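Try this as a workaround. It passes a named list with one element per TSD length, and ggseqlogo draws one panel per list element:
ggseqlogo(with(df, split(TSD, sprintf('TSD_length=%s', TSD_length))))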
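To make the workaround easier to check, here is a minimal sketch that rebuilds the six example rows from the issue and splits them by length before plotting. The object name seqs_by_length and the ncol = 1 layout are my own choices for illustration, not something from the thread, and the plotting step only runs if ggseqlogo is installed.

```r
# Example rows copied from the data shown in the issue.
df <- data.frame(
  TSD = c("TAAAAATAAAGTCCT", "AAAAGATTTGTGCAG", "TGGGGGGACATTTTT",
          "CCATTCTGATTTTTTT", "ACAGGGAAAGGTTTTT", "AAAAAGTGTGCTGGAGG"),
  TSD_length = c(15L, 15L, 15L, 16L, 16L, 17L),
  stringsAsFactors = FALSE
)

# Group the sequences by length; the list names become the panel titles.
# `seqs_by_length` is an illustrative name, not part of ggseqlogo.
seqs_by_length <- with(df, split(TSD, sprintf("TSD_length=%s", TSD_length)))

# Sanity check: every group must hold sequences of one single length,
# otherwise the "identical lengths" error would come back for that group.
widths <- lapply(seqs_by_length, function(s) unique(nchar(s)))
stopifnot(all(lengths(widths) == 1))

# Draw one logo per length group, stacked in a single column (assumed layout).
if (requireNamespace("ggseqlogo", quietly = TRUE)) {
  p <- ggseqlogo::ggseqlogo(seqs_by_length, seq_type = "dna", ncol = 1)
  print(p)
}
```

With real data, replace df with your own data frame (df.pass.tsdlength.test in the original call). Groups containing very few sequences will still plot, but their logos carry little information.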