Sequences are required to be uppercase
Hi, @GuangchuangYu
I found that the package requires the input sequences to be uppercase. Otherwise, something goes wrong.
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 936
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] igraph_1.2.6 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[5] purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tidyverse_1.3.1
[9] ggplot2_3.3.5 tibble_3.1.2 seqcombo_1.14.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 lubridate_1.7.10 Biostrings_2.60.1
[4] assertthat_0.2.1 digest_0.6.27 utf8_1.2.1
[7] IRdisplay_1.0 cellranger_1.1.0 R6_2.5.0
[10] GenomeInfoDb_1.28.1 repr_1.1.3 backports_1.2.1
[13] reprex_2.0.0 stats4_4.1.0 evaluate_0.14
[16] httr_1.4.2 pillar_1.6.1 zlibbioc_1.38.0
[19] rlang_0.4.11 readxl_1.3.1 uuid_0.1-4
[22] rstudioapi_0.13 S4Vectors_0.30.0 labeling_0.4.2
[25] RCurl_1.98-1.3 munsell_0.5.0 broom_0.7.8
[28] modelr_0.1.8 compiler_4.1.0 pkgconfig_2.0.3
[31] BiocGenerics_0.38.0 base64enc_0.1-3 htmltools_0.5.1.1
[34] tidyselect_1.1.1 GenomeInfoDbData_1.2.6 IRanges_2.26.0
[37] fansi_0.5.0 crayon_1.4.1 dbplyr_2.1.1
[40] withr_2.4.2 bitops_1.0-7 grid_4.1.0
[43] jsonlite_1.7.2 gtable_0.3.0 lifecycle_1.0.0
[46] DBI_1.1.1 magrittr_2.0.1 scales_1.1.1
[49] cli_3.0.0 stringi_1.6.2 farver_2.1.0
[52] XVector_0.32.0 fs_1.5.0 xml2_1.3.2
[55] ellipsis_0.3.2 rvcheck_0.1.8 generics_0.1.0
[58] vctrs_0.3.8 cowplot_1.1.1 IRkernel_1.2
[61] tools_4.1.0 Cairo_1.5-12.2 glue_1.4.2
[64] hms_1.1.0 parallel_4.1.0 colorspace_2.0-2
[67] BiocManager_1.30.16 rvest_1.0.0 pbdZMQ_0.3-5
[70] haven_2.4.1
Yang
Could you provide a reproducible example?
Thanks for your reply, Dr @GuangchuangYu
For uppercase sequences:
>KT162029
ATGAGTGATGGAGCAGTTCACCCAAACGGGGGTCACCCTGCTGTCAAAAATGAAAAAGCTACAGGATCTGGGAACGGGTCTGGAGGCGGGGGGGGGGGGGGTTCGGGGGGGGGGGGGATTTCTACGGGTACTTTCAATAATCAAACGGAATTTAAATTTTTGGAAAACGGATGGGTGGAAATCACAGCAAACTCAAGCAAACTTGTACATTTAAATATGCCAAAAAGTGAAAATTATAAAAAAGGGGTTGTAAATAATTTGGATAAAACTGCATTTAACGGAAACATGGCTTTAAATGATACCCATGCACAAATTGTAACACCTGGGTCATTGGTTGATGCAAATGCTTGGGGAGTTTGGTTTAATCCAGGAGATTGGCAACTAATTGTTAATACTATGAGTGAGTTGCATTTAGTTAGTTTTGAACAAGAAATTTTTAATGTTGTTTTAAAGACTGTTTCAGAATCTGCTACTCAGCCACCAACTAAAGTTTATAATAATGATTTAACTGCATCATTGATGGTTGCATTAGATAGTAATAATACTATGCCATTTACTCCAGCAGCTATGAGATCTGAGACATTGGGTTTTTATCCATGGAAACCAACCATACCAACTCCATGGAGATATTATTTTCAATGGGATAGAACATTAATACCATCTCATACTGGAACTAGTGGCACACCAACAAATATATACCATGGTACAGATCCAGATGATGTTCAATTTTATACTATTGAAAATTCTGTGCCAGTACACTTACTAAGAACAGGTGATGAATTTGCTACAGGAACATTTTTTTTTGATTGTAAACCATGCAGACTAACACATACATGGCAAACAAATAGAGCATTGGGCTTACCACCATTTCTAAATTCTTTGCCTCAAGCTGAAGGAGGTACTAACTTTGGTTATATAGGAGTTCAACAAGATAAAAGACGTGGTGTAACTCAAATGGGAAATACAAACATTATTACTGAAGCTACTATTATGAGACCAGCTGAGGTTGGTTATAGTGCACCATATTATTCTTTTGAGGCGTCTACACAAGGGCCATTTAAAACACCTATTGCAGCAGGACGGGGGGGAGCGCAAACAGATGAAAATCAAGCAGCAGATGGTGATCCAAGATATGCATTTGGTAGACAACATGGTCAGAAAACTACCACAACAGGAGAAACACCTGAGAGATTTACATATATAGCACATCAAGATACAGGAAGATATCCAGAAGGAGATTGGATTCAAAATATTAACTTTAACCTTCCTGTAACAAATGATAATGTATTGCTACCAACAGATCCAATTGGAGGTAAAACAGGAATTAACTATACTAATATATTTAATACTTATGGTCCTTTAACTGCATTAAATAATGTACCACCAGTTTATCCAAATGGTCAAATTTGGGATAAAGAATTTGATACTGACTTAAAACCAAGACTTCATGTAAATGCACCATTTGTTTGTCAAAATAATTGTCCCGGTCAATTATTTGTAAAAGTTGCGCCTAATTTAACAAATGAATATGATCCTGATGCATCTGCTAATATGTCAAGAATTGTAACTTACTCAGATTTTTGGTGGAAAGGTAAATTAGTATTTAAAGCTAAACTAAGAGCCTCTCATACTTGGAATCCAATTCAACAAATGAGTATTAATGTAGATAACCAATTTAACTATGTACCAAGTAACATTGGAGGTATGAAAATTGTATATGAGAAATCTCAACTAGCACCTAGAAAATTATAT
>KT162030
ATGAGTGATGGACCATTTCACCCAAACGGGGGTCACCCTGCTGTCAAAAATGAAAAACCTACAGGATCTGGGAACGGGTCTGGAGGCGGGGGGGGGGGGGGTTCGGGGGGTGGGGGGATTTCTACGGGTACTTTCAATAATCAAACGGAATTTAAATTTTTGGAAAACGGATGGGGGGAAATCACAGCAAACTCAACCAAATTTGTACTTTTAAATATGCCAAAACGTGAAAATTATAAAAAAGTGGTTGTAAATAATTTGGATAAAATTGCATTTAACGGAAACATGGCTTTAAATGATCCCCATGCACAAATTGTAACACCTTGGTCATTGGTTGATGCAAATGCTTGGGGAGTTTGGTTTAATCCAGGAGATTGGCAACTAATTGTTAATACTATGAGTGAGTTGCATTTAGTTAGTTTTGAACAAGAAATTTTTAATGTTGTTTTAAAGACTGTTTCAGAATCTGCTACTCAGCCACCAACTAAAGTTTATAATAATGATTTAACTGCATCATTGATGGTTGCATTAGATAGTAATAATACTATGCCATTTACTCCAGCAGCTATGAGATCTGAGACATTGGGTTTTTATCCATGGAAACCAACCATACCAACTCCATGGAGATATTATTTTCAATGGGATAGAACATTAATACCATCTCATACTGGAACTAGTGGCACACCAACAAATATATACCATGGTACAGATCCAGATGATGTTCAATTTTACACTATTGAAAATTCTGTGCCAGTACACTTACTAAGAACAGGTGATGAATTTGCTACAGGAACATTTTATTTTGATTGTAAACCATGTAGACTAACACACACATGGCAAACAAATAGAGCATTGGGCTTACCACCATTTCTAAATTCTTTGCCTCAAGCTGAAGGAGGTACTAACTTTGGTTATATAGGAGTTCAACAAGATAAAAGACGTGGTGTAACTCAAATGGGAAATACAAACATTATTACTGAAGCTACTATTATGAGACCAGCTGAGGTTGGTTATAGTGCACCATATTATTCTTTTGAGGCGTCTACACAAGGGCCATTTAAAACACCTATTGCAGCAGGACGGGGGGGAGCGCAAACAGATGAAAATCAAGCAGCAGATGGTGATCCAAGATATGCATTTGGTAGACAACATGGTCAAAAAACTACCACAACAGGAGAAACACCTGAGAGATTTACATATATAGCACATCAAGATACAGGAAGATATCCAGAAGGAGATTGGATTCAAAATATTAACTTTAACCTTCCTGTAACAAATGATAATGTATTGCTACCAACAGATCCAATTGGAGGTAAAGCAGGAATTAACTATACTAATATATTTAATACTTATGGTCCTTTAACTGCATTAAATAATGTACCACCAGTTTATCCAAATGGTCAAATTTGGGATAAAGAATTTGATACTGACTTAAAACCAAGACTTCATGTAAATGCACCATTTGTTTGTCAAAATAATTGTCCTGGTCAATTATTTGTAAAAGTTGCGCCTAATTTAACAAATGAATATGATCCTGATGCATCTGCTAATATGTCAAGAATTGTAACTTACTCAGATTTTTGGTGGAAAGGTAAATTAGTATTTAAAGCTAAACTAAGAGCCTCTCATACTTGGAATCCAATTCAACAAATGAGTATTAATGTAGATAACCAATTTAACTATGTACCAAGTAATATTGGAGGTATGAAAATTGTATATGAAAAATCTCAACTAGCACCTAGAAAATTATAC
rm(list=ls())
library(seqcombo)
y <- seqdiff("KT162030.fas", reference=1)
py <- plot(y)
py
For lowercase sequences:
>kt162029
atgagtgatggagcagttcacccaaacgggggtcaccctgctgtcaaaaatgaaaaagctacaggatctgggaacgggtctggaggcggggggggggggggttcgggggggggggggatttctacgggtactttcaataatcaaacggaatttaaatttttggaaaacggatgggtggaaatcacagcaaactcaagcaaacttgtacatttaaatatgccaaaaagtgaaaattataaaaaaggggttgtaaataatttggataaaactgcatttaacggaaacatggctttaaatgatacccatgcacaaattgtaacacctgggtcattggttgatgcaaatgcttggggagtttggtttaatccaggagattggcaactaattgttaatactatgagtgagttgcatttagttagttttgaacaagaaatttttaatgttgttttaaagactgtttcagaatctgctactcagccaccaactaaagtttataataatgatttaactgcatcattgatggttgcattagatagtaataatactatgccatttactccagcagctatgagatctgagacattgggtttttatccatggaaaccaaccataccaactccatggagatattattttcaatgggatagaacattaataccatctcatactggaactagtggcacaccaacaaatatataccatggtacagatccagatgatgttcaattttatactattgaaaattctgtgccagtacacttactaagaacaggtgatgaatttgctacaggaacatttttttttgattgtaaaccatgcagactaacacatacatggcaaacaaatagagcattgggcttaccaccatttctaaattctttgcctcaagctgaaggaggtactaactttggttatataggagttcaacaagataaaagacgtggtgtaactcaaatgggaaatacaaacattattactgaagctactattatgagaccagctgaggttggttatagtgcaccatattattcttttgaggcgtctacacaagggccatttaaaacacctattgcagcaggacgggggggagcgcaaacagatgaaaatcaagcagcagatggtgatccaagatatgcatttggtagacaacatggtcagaaaactaccacaacaggagaaacacctgagagatttacatatatagcacatcaagatacaggaagatatccagaaggagattggattcaaaatattaactttaaccttcctgtaacaaatgataatgtattgctaccaacagatccaattggaggtaaaacaggaattaactatactaatatatttaatacttatggtcctttaactgcattaaataatgtaccaccagtttatccaaatggtcaaatttgggataaagaatttgatactgacttaaaaccaagacttcatgtaaatgcaccatttgtttgtcaaaataattgtcccggtcaattatttgtaaaagttgcgcctaatttaacaaatgaatatgatcctgatgcatctgctaatatgtcaagaattgtaacttactcagatttttggtggaaaggtaaattagtatttaaagctaaactaagagcctctcatacttggaatccaattcaacaaatgagtattaatgtagataaccaatttaactatgtaccaagtaacattggaggtatgaaaattgtatatgagaaatctcaactagcacctagaaaattatat
>kt162030
atgagtgatggaccatttcacccaaacgggggtcaccctgctgtcaaaaatgaaaaacctacaggatctgggaacgggtctggaggcggggggggggggggttcggggggtggggggatttctacgggtactttcaataatcaaacggaatttaaatttttggaaaacggatggggggaaatcacagcaaactcaaccaaatttgtacttttaaatatgccaaaacgtgaaaattataaaaaagtggttgtaaataatttggataaaattgcatttaacggaaacatggctttaaatgatccccatgcacaaattgtaacaccttggtcattggttgatgcaaatgcttggggagtttggtttaatccaggagattggcaactaattgttaatactatgagtgagttgcatttagttagttttgaacaagaaatttttaatgttgttttaaagactgtttcagaatctgctactcagccaccaactaaagtttataataatgatttaactgcatcattgatggttgcattagatagtaataatactatgccatttactccagcagctatgagatctgagacattgggtttttatccatggaaaccaaccataccaactccatggagatattattttcaatgggatagaacattaataccatctcatactggaactagtggcacaccaacaaatatataccatggtacagatccagatgatgttcaattttacactattgaaaattctgtgccagtacacttactaagaacaggtgatgaatttgctacaggaacattttattttgattgtaaaccatgtagactaacacacacatggcaaacaaatagagcattgggcttaccaccatttctaaattctttgcctcaagctgaaggaggtactaactttggttatataggagttcaacaagataaaagacgtggtgtaactcaaatgggaaatacaaacattattactgaagctactattatgagaccagctgaggttggttatagtgcaccatattattcttttgaggcgtctacacaagggccatttaaaacacctattgcagcaggacgggggggagcgcaaacagatgaaaatcaagcagcagatggtgatccaagatatgcatttggtagacaacatggtcaaaaaactaccacaacaggagaaacacctgagagatttacatatatagcacatcaagatacaggaagatatccagaaggagattggattcaaaatattaactttaaccttcctgtaacaaatgataatgtattgctaccaacagatccaattggaggtaaagcaggaattaactatactaatatatttaatacttatggtcctttaactgcattaaataatgtaccaccagtttatccaaatggtcaaatttgggataaagaatttgatactgacttaaaaccaagacttcatgtaaatgcaccatttgtttgtcaaaataattgtcctggtcaattatttgtaaaagttgcgcctaatttaacaaatgaatatgatcctgatgcatctgctaatatgtcaagaattgtaacttactcagatttttggtggaaaggtaaattagtatttaaagctaaactaagagcctctcatacttggaatccaattcaacaaatgagtattaatgtagataaccaatttaactatgtaccaagtaatattggaggtatgaaaattgtatatgaaaaatctcaactagcacctagaaaattatac
z <- seqdiff("KT162030_lower.fas", reference=1)
pz <- plot(z)
pz
With the lowercase input, something is missing in the nucleotide position part of the plot.
These functions will ultimately move to the ggmsa package, and @nyzhoulang can help solve this issue.
Thanks. Great to know this.
The plot_difference() function in method-plot.R cannot recognize lowercase letters.
I will fix it after these functions are migrated to the ggmsa package.
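Until a fix is available, one possible workaround is to convert the sequences to uppercase before calling seqdiff(). The sketch below uses only base R. The helper name fasta_to_upper and the output file name are hypothetical and not part of seqcombo or ggmsa. It assumes a plain FASTA file in which header lines start with ">".

```r
# Hypothetical helper (not part of seqcombo or ggmsa): uppercase the sequence
# lines of a FASTA file while leaving the ">" header lines untouched.
fasta_to_upper <- function(infile, outfile) {
  lines <- readLines(infile)
  is_header <- startsWith(lines, ">")
  # Only sequence lines are converted; headers keep their original case.
  lines[!is_header] <- toupper(lines[!is_header])
  writeLines(lines, outfile)
  invisible(outfile)
}

# Usage, assuming the lowercase file from the report above:
# fasta_to_upper("KT162030_lower.fas", "KT162030_upper.fas")
# z <- seqdiff("KT162030_upper.fas", reference = 1)
# plot(z)
```

Writing to a new file keeps the original input unchanged, so the two plots can still be compared side by side.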
Hi, @nyzhoulang
Thanks for your reply. Good to know that.
Is it possible to support ambiguous bases as well?
Yang
Hi, Yang
We fixed the bug and migrated these functions to the ggmsa package.
Both uppercase and lowercase sequences are now supported.
You need to install the development version of ggmsa with the following block:
if (!requireNamespace("devtools", quietly=TRUE))
    install.packages("devtools")
devtools::install_github("YuLab-SMU/ggmsa")
The features and function names in ggmsa are the same as in the seqcombo package:
library(ggmsa)
y <- seqdiff("KT162030.fas", reference=1)
plot(y)
Thanks,
Lang
@nyzhoulang Excellent! Thanks a lot.