A public repository of Monkeypox (MPXV) related resources maintained by ITER.
This is the result of a continuous collaborative effort of the following Institutions and Laboratories:
- Servicio de Microbiología, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain.
- Fundación Canaria Instituto de Investigación Sanitaria de Canarias at the Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain.
- Laboratorio de Inmunología Celular y Viral, Unidad de Farmacología, Facultad de Medicina, Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain.
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain.
Table of contents
- Virological post: A draft of the first genome sequence of MPXV virus associated with the multi-country outbreak in May 2022 from the Canary Islands, Spain
- Bioinformatic pipelines
- Code for Illumina short-reads processing
- Code for Nanopore long-reads processing and hybrid de novo assembly
- List of bioinformatic software used in our pipelines
- Useful files for the pipelines
- Sequences
- How to download sequences and metadata from GenBank
- Enrichment of viral DNA by means of Human & Bacterial DNA depletion
- Other useful repositories with resources to study MPXV
- References
- Acknowledgements
- License and Attribution
- Participating
- How to cite this work
- Update logs
Virological post
A technical post with the draft of the first MPXV genome sequence from the Canary Islands, Spain, associated with the multi-country outbreak of May 2022, has been shared on Virological; see the post for details.
The first MPXV genome sequence that we described on Virological is phylogenetically related to the viral genomes deposited in NCBI GenBank from the current 2022 worldwide outbreak, as shown in Figure 1.
Figure 1. A phylogenetic tree depicting the draft MPXV sequence isolated on May 31, 2022 from a patient from the Canary Islands along with NCBI GenBank publicly available sequences computed by a Nextstrain-monkeypox local instance.
Bioinformatic pipelines
The following diagram (Figure 2) represents the full pipeline used to derive the consensus FASTA sequence of MPXV by combining short- and long-reads technologies (Illumina and Nanopore, respectively).
The upper part of the diagram shows a typical pipeline to process short-reads, from basecalling to the final consensus FASTA sequence, plus downstream analyses such as phylogenetic inference.
The lower part of the diagram shows a typical pipeline to process long-reads, and how to perform a hybrid de novo assembly combining short- and long-reads.
Two consensus MPXV sequences have been obtained and deposited in NCBI GenBank following the described pipeline:
- A FASTA sequence derived from the pipeline based on mapping of Illumina short-reads against a MPXV reference genome.
- A FASTA sequence resulting from the consensus of the hybrid *de novo* assembly and a MPXV reference genome to complete uncovered regions.
Figure 2. Full bioinformatic pipeline to obtain the MPXV sequences and to infer phylogenetic relationships with other MPXV genomes available from public repositories.
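The second sequence listed above is described as combining the hybrid de novo assembly with a reference genome to complete uncovered regions. As a rough illustration only (this is not the code used in this repository), the idea can be sketched in Python. The sketch assumes that the assembly and the reference have already been aligned to the same length, for example with MAFFT. The function name `fill_uncovered` and the use of `-` and `N` as markers of missing bases are hypothetical choices.

```python
def fill_uncovered(aligned_assembly, aligned_reference, missing="-Nn"):
    """Fill positions missing in the assembly with the reference base.

    Both inputs are rows of the same pairwise alignment (equal length).
    Columns where neither sequence has a base are dropped from the output.
    """
    if len(aligned_assembly) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")

    consensus = []
    for assembly_base, reference_base in zip(aligned_assembly, aligned_reference):
        # Prefer the assembly; fall back to the reference where it is missing.
        base = reference_base if assembly_base in missing else assembly_base
        if base != "-":  # skip columns that are gaps in both rows
            consensus.append(base)
    return "".join(consensus).upper()


# Hypothetical toy alignment: the assembly lacks two bases and has one N.
toy_consensus = fill_uncovered("AC--GTNA", "ACTTGTCA")
```

Insertions present only in the assembly are kept, which may or may not be the desired behaviour depending on how much the assembly is trusted in those regions.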
Code for Illumina short-reads processing
See a detailed pipeline with examples of command usage for Illumina short-reads.
Code for Nanopore long-reads processing and hybrid de novo assembly
See a detailed pipeline with examples of command usage for Oxford Nanopore Technology long-reads.
List of bioinformatic software used in our pipelines
- Conda manual for installation of numerous open-source tools used in these pipelines: Conda documentation
- Reformat FASTQ files to get an interleaved FASTQ file: BBMap tools v.38.96
- Remove Human mapping-reads from your FASTQ files: NCBI SRA Human Scrubber v.1.0.2021_05_05
- Remove Human mapping-reads from your FASTQ files: Kraken2 v.2.1.2. If you have issues when downloading the database indexes, try this alternative site from BenLangmead.
- General-purpose programming environment: R v.4.1.3
- Compute the depth of coverage and other statistics: Mosdepth v.0.3.3
- Compute the number of duplicates and other statistics: Picard Tools v.2.18.7
- Perform the variant calling and consensus: iVar v.1.3.1
- Perform the variant calling: LoFreq v.2.1.5
- Get mapping statistics, manipulate BAM files, and generate mpileups for FASTA consensus: SAMtools v.1.6
- Multiple sequence alignment: MAFFT v.7.505
- Phylogenomic inference and tree computing: IQ-TREE v.2.2.0.3
- Mapping of short-reads: Minimap2 v.2.24-r1122
- Mapping of short-reads: Bowtie2 v.2.4.5
- Mapping of short-reads: BWA v.0.7.17-r1188
- Framework for analyses and visualization of pathogen genome data (Nextstrain-monkeypox in this case): Nextstrain
- Assembly: Unicycler v.0.5.0
- Benchmarking and quality control of assemblies: QUAST v.5.0.2
- Visualization of assemblies: Bandage v.0.9.0
- Visualization of Kraken 2 reports: Pavian v.1.0
- Annotation of genomes: SnpEff v.5.1d
- Visualization of phylogenetic trees: Figtree
- Visualization of phylogenetic trees: ggtree 3.15
Useful files for the pipelines
- FASTA file ('multiMPXV01.fasta.zip') with multiple MPXV sequences from NCBI GenBank, to be used in the multiple sequence alignment step with MAFFT or with Nextstrain-monkeypox, available here (last update: June 16, 2022)
- Metadata file (TSV format) to use with a Nextstrain-monkeypox local instance, available here (last update: June 23, 2022)
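Before passing the multi-FASTA archive to MAFFT or Nextstrain-monkeypox, it can help to check how many sequences it holds and how long they are. The following is a minimal sketch using only the Python standard library. It assumes the ZIP archive contains a FASTA member whose name ends in `.fasta` or `.fa` (the internal file name is not documented here), and the function names are illustrative.

```python
import io
import zipfile


def read_fasta(handle):
    """Yield (header, sequence) pairs from a text handle in FASTA format."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_lengths_from_zip(zip_source):
    """Map each sequence ID (first word of the header) to its length.

    `zip_source` is a path or a binary file-like object holding the archive.
    Only the first FASTA member found in the archive is read.
    """
    with zipfile.ZipFile(zip_source) as archive:
        names = [n for n in archive.namelist() if n.lower().endswith((".fasta", ".fa"))]
        if not names:
            raise ValueError("no FASTA file found in the archive")
        with archive.open(names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8")
            return {header.split()[0]: len(seq) for header, seq in read_fasta(text)}
```

For instance, `len(fasta_lengths_from_zip("multiMPXV01.fasta.zip"))` would give the number of sequences in the archive, assuming the file has been downloaded to the working directory.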
Sequences
Consensus FASTA file obtained from a hybrid de novo Illumina-Nanopore based assembly and MT903344.1. See Virological post for more details
NCBI GenBank Accession: ON782054. Sequence available as MPXV/Spain/HUNSC_ITER_0001a/2022
Consensus FASTA file obtained from Illumina short-reads mapping to MT903344.1. See Virological post for more details
NCBI GenBank Accession: ON782055. Sequence available as MPXV/Spain/HUNSC_ITER_0001b/2022
For the published paper (see below), we have sequenced more samples. Their sequences are publicly available as follows:
- MPXV sequences of MPXV01, MPXV05, MPXV06 and MPXV07 samples obtained from the Illumina-only consensus approach have been released in the NCBI GenBank with accessions ON782054, OQ581847, OQ581848, and OQ581849, respectively.
- Hybrid de novo assemblies of MPXV01, MPXV05, MPXV06 and MPXV07 samples have also been released with accessions ON782055, OQ581850, OQ581851, and OQ581852, respectively.
How to download sequences and metadata from GenBank
Manual download
- Browse to GenBank.
- Select 'Nucleotide' from the combo box.
- Fill in the accession code of the sequence you want to download (e.g. ON782054), or type the name of the species (e.g. Monkeypox) and then click on the accession code you are interested in.
- Click on the 'FASTA' link.
- Click on 'Send to' on the upper right part of the screen.
- Select the option 'file'.
- Select 'FASTA' as download format.
- Click on 'Generate' button.
Programmatically download
As an example, we provide complete Python code to retrieve all sequences longer than 190,000 bases from GenBank. See the code.
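For orientation, a programmatic download along these lines can be sketched with the NCBI E-utilities and the Python standard library. This is not the repository's own script: the organism query, the upper length bound (an Entrez `[SLEN]` range is written with both ends), the `retmax` and batch sizes, the output file name, and all function names are assumptions made for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_search_term(organism="Monkeypox virus", min_length=190_001, max_length=300_000):
    """Entrez query: organism name plus a sequence-length range ([SLEN] field)."""
    return f'"{organism}"[Organism] AND {min_length}:{max_length}[SLEN]'


def build_url(endpoint, **params):
    """Compose an E-utilities URL such as .../esearch.fcgi?db=nuccore&term=..."""
    return f"{EUTILS}/{endpoint}.fcgi?{urllib.parse.urlencode(params)}"


def batched(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def search_ids(term, retmax=500):
    """Return the nucleotide record IDs matching `term` (needs network access)."""
    url = build_url("esearch", db="nuccore", term=term, retmax=retmax, retmode="json")
    with urllib.request.urlopen(url) as response:
        return json.load(response)["esearchresult"]["idlist"]


def fetch_fasta(ids, batch_size=50, pause=0.5):
    """Download the records as FASTA text, one batch at a time (needs network access)."""
    parts = []
    for batch in batched(ids, batch_size):
        url = build_url("efetch", db="nuccore", id=",".join(batch),
                        rettype="fasta", retmode="text")
        with urllib.request.urlopen(url) as response:
            parts.append(response.read().decode("utf-8"))
        time.sleep(pause)  # pause between requests to respect NCBI rate limits
    return "".join(parts)


def main(output_path="mpxv_genbank.fasta"):
    """Search, download, and write the sequences to a (hypothetical) output file."""
    ids = search_ids(build_search_term())
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write(fetch_fasta(ids))
    print(f"Saved {len(ids)} records to {output_path}")
```

Calling `main()` runs the whole search and download. The query is kept in its own function so that the organism name or length bounds can be adjusted without touching the network code.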
Enrichment of viral DNA by means of Human & Bacterial DNA depletion
- NEBNext® Microbiome DNA Enrichment Kit. See this post at Virological.org
- See "A new and efficient enrichment method for metagenomic sequencing of monkeypox virus", which performs host DNA depletion using a saponin/NaCl combination treatment and DNase.
Other useful repositories with resources to study MPXV
Kudos to all research teams behind the scenes in all these repositories:
- WHO Laboratory testing for the monkeypox virus: Interim guidance (and Corrigendums), 23 May 2022
- European Centre for Disease Prevention and Control (ECDC), Monkeypox resource center
- Joint ECDC-WHO Regional Office for Europe Monkeypox Surveillance Bulletin
- Virological.org posts on MPXV
- CADDE-CENTRE GitHub repository for MPXV
- Mike Honey GitHub repository for MPXV
- Mpox-Spectrum from the Computational Evolution group of ETH Zürich in Switzerland
- Global.health, geospatial data visualisations to explore MPXV GitHub repository and visualization
- Our World in Data for MPXV
- Nextstrain build for MPXV in GitHub and visualization
- FIND offers a searchable directory of monkeypox tests
References
Published papers
Alcolea-Medina, Adela and Charalampous, Themoula and Snell, Luke B. and Aydin, Alp and Alder, Christopher and Maloney, Gillian and Bryan, Lisa and Nebbia, Gaia and Douthwaite, Sam and Neil, Stuart and Cliff, Penelope and O'Grady, Justin and Batra, Rahul and Wilks, Mark and O'Hara, Geraldine and Edgeworth, Jonathan, Novel, Rapid Metagenomic Method to Detect Emerging Viral Pathogens Applied to Human Monkeypox Infections (June 9, 2022). Available at SSRN: https://ssrn.com/abstract=4132526 or http://dx.doi.org/10.2139/ssrn.4132526
Berthet N, Descorps-Declère S, Besombes C, et al. Genomic history of human monkey pox infections in the Central African Republic between 2001 and 2018. Sci Rep. 2021;11(1):13085. Published 2021 Jun 22. doi: https://doi.org/10.1038/s41598-021-92315-8
Cervantes-Gracia K, Gramalla-Schmitz A, Weischedel J, Chahwan R. APOBECs orchestrate genomic and epigenomic editing across health and disease. Trends Genet. 2021;37(11):1028-1043. doi: https://doi.org/10.1016/j.tig.2021.07.003
Cohen J. Global outbreak puts spotlight on neglected virus. Science. 2022;376(6597):1032-1033. doi:10.1126/science.add2701
Cohen-Gihon I, Israeli O, Shifman O, et al. Identification and Whole-Genome Sequencing of a Monkeypox Virus Strain Isolated in Israel. Microbiol Resour Announc. 2020;9(10):e01524-19. Published 2020 Mar 5. doi: https://doi.org/10.1128/MRA.01524-19
Erez N, Achdout H, Milrot E, et al. Diagnosis of Imported Monkeypox, Israel, 2018. Emerg Infect Dis. 2019;25(5):980-983. doi:10.3201/eid2505.190076
Faye O, Pratt CB, Faye M, et al. Genomic characterisation of human monkeypox virus in Nigeria [published correction appears in Lancet Infect Dis. 2018 Mar;18(3):244]. Lancet Infect Dis. 2018;18(3):246. doi: https://doi.org/10.1016/S1473-3099(18)30043-4
Iizuka I, Saijo M, Shiota T, et al. Loop-mediated isothermal amplification-based diagnostic assay for monkeypox virus infections. J Med Virol. 2009;81(6):1102-1108. doi: https://doi.org/10.1002/jmv.21494
Kraemer MUG, Tegally H, Pigott DM, et al. Tracking the 2022 monkeypox outbreak with epidemiological data in real-time [published online ahead of print, 2022 Jun 8]. Lancet Infect Dis. 2022;S1473-3099(22)00359-0. doi: https://doi.org/10.1016/S1473-3099(22)00359-0
Kugelman JR, Johnston SC, Mulembakani PM, et al. Genomic variability of monkeypox virus among humans, Democratic Republic of the Congo. Emerg Infect Dis. 2014;20(2):232-239. doi: https://doi.org/10.3201/eid2002.130118
Kulesh DA, Loveless BM, Norwood D, et al. Monkeypox virus detection in rodents using real-time 3'-minor groove binder TaqMan assays on the Roche LightCycler. Lab Invest. 2004;84(9):1200-1208. doi: https://doi.org/10.1038/labinvest.3700143
Li D, Wilkins K, McCollum AM, et al. Evaluation of the GeneXpert for Human Monkeypox Diagnosis. Am J Trop Med Hyg. 2017;96(2):405-410. doi:10.4269/ajtmh.16-0567
Li Y, Olson VA, Laue T, Laker MT, Damon IK. Detection of monkeypox virus with real-time PCR assays. J Clin Virol. 2006;36(3):194-203. doi: https://doi.org/10.1016/j.jcv.2006.03.012
Li Y, Zhao H, Wilkins K, Hughes C, Damon IK. Real-time PCR assays for the specific detection of monkeypox virus West African and Congo Basin strain DNA. J Virol Methods. 2010;169(1):223-227. doi: https://doi.org/10.1016/j.jviromet.2010.07.012
Luciani L, Inchauste L, Ferraris O, et al. A novel and sensitive real-time PCR system for universal detection of poxviruses [published correction appears in Sci Rep. 2022 Apr 8;12(1):5961]. Sci Rep. 2021;11(1):1798. Published 2021 Jan 19. doi: https://doi.org/10.1038/s41598-021-81376-4
Maksyutov RA, Gavrilova EV, Shchelkunov SN. Species-specific differentiation of variola, monkeypox, and varicella-zoster viruses by multiplex real-time PCR assay. J Virol Methods. 2016;236:215-220. doi: https://doi.org/10.1016/j.jviromet.2016.07.024
Mucker EM, Hartmann C, Hering D, et al. Validation of a pan-orthopox real-time PCR assay for the detection and quantification of viral genomes from nonhuman primate blood. Virol J. 2017;14(1):210. Published 2017 Nov 3. doi: https://doi.org/10.1186/s12985-017-0880-8
Patrono LV, Pléh K, Samuni L, et al. Monkeypox virus emergence in wild chimpanzees reveals distinct clinical outcomes and viral diversity. Nat Microbiol. 2020;5(7):955-965. doi: https://doi.org/10.1038/s41564-020-0706-0
Pecori R, Di Giorgio S, Paulo Lorenzo J, Nina Papavasiliou F. Functions and consequences of AID/APOBEC-mediated DNA and RNA deamination [published online ahead of print, 2022 Mar 7]. Nat Rev Genet. 2022;1-14. doi: https://doi.org/10.1038/s41576-022-00459-8
Seang, S., Burrel, S., Todesco, E., Leducq, V., Monsel, G., Le Pluart, D., Cordevant, C., Pourcher, V., & Palich, R. (2022). Evidence of human-to-dog transmission of monkeypox virus. Lancet (London, England), S0140-6736(22)01487-8. Advance online publication. doi: https://doi.org/10.1016/S0140-6736(22)01487-8
Shchelkunov SN, Shcherbakov DN, Maksyutov RA, Gavrilova EV. Species-specific identification of variola, monkeypox, cowpox, and vaccinia viruses by multiplex real-time PCR assay. J Virol Methods. 2011;175(2):163-169. doi: https://doi.org/10.1016/j.jviromet.2011.05.002
Tumewu J, Wardiana M, Ervianty E, et al. An adult patient with suspected of monkeypox infection differential diagnosed to chickenpox. Infect Dis Rep. 2020;12(Suppl 1):8724. Published 2020 Jul 6. doi: https://doi.org/10.4081/idr.2020.8724
Yong SEF, Ng OT, Ho ZJM, et al. Imported Monkeypox, Singapore. Emerg Infect Dis. 2020;26(8):1826-1830. doi:10.3201/eid2608.191387
Zhao K, Wohlhueter RM, Li Y. Finishing monkeypox genomes from short reads: assembly analysis and a neural network method. BMC Genomics. 2016;17 Suppl 5(Suppl 5):497. Published 2016 Aug 31. doi: https://doi.org/10.1186/s12864-016-2826-8
Preprint papers
A new and efficient enrichment method for metagenomic sequencing of monkeypox virus Pablo Aja-Macaya, Soraya Rumbo-Feal, Margarita Poza, Angelina Cañizares, Juan A. Vallejo and Germán Bou medRxiv 2022.07.29.22278145v1; doi: https://doi.org/10.1101/2022.07.29.22278145
Enhanced surveillance of monkeypox in Bas-Uélé, Democratic Republic of Congo: the limitations of symptom-based case definitions Gaspard Mande, Innocent Akonda, Anja De Weggheleire, Isabel Brosius, Laurens Liesenborghs, Emmanuel Bottieau, Noam Ross, Guy -Crispin Gembu, Robert Colebunders, Erik Verheyen, Ngonda Dauly, Herwig Leirs, Anne Laudisoit medRxiv 2022.06.03.22275815; doi: https://doi.org/10.1101/2022.06.03.22275815
Acknowledgements
This study has been funded by Cabildo Insular de Tenerife (CGIEU0000219140 and "Apuestas científicas del ITER para colaborar en la lucha contra la COVID-19"), Instituto de Salud Carlos III (FI18/00230) cofunded by European Union (ERDF) "A way of making Europe", and by the agreement with Instituto Tecnológico y de Energías Renovables (ITER) to strengthen scientific and technological education, training, research, development and innovation in Genomics, Personalized Medicine and Biotechnology (OA17/008).
We acknowledge in Table 1 (EXCEL file) the researchers and their institutions who released the MPXV sequences through NCBI GenBank that are being used in our studies.
We also thank the authors, the laboratories that originated and submitted the genetic sequences and the metadata for sharing their work, as shown on Nextstrain, and:
- Hadfield et al, Nextstrain: real-time tracking of pathogen evolution, Bioinformatics (2018).
- Sagulenko et al, TreeTime: Maximum-likelihood phylodynamic analysis, Virus Evolution (2017).
We would like to acknowledge the contributions of several researchers and laboratories who share their preliminary results through the Virological website.
License and Attribution
This repository and data exports are released under the CC BY 4.0 license. Please acknowledge the authors, the originating and submitting laboratories for the genetic sequences and metadata, and the open source software used in this work (third-party copyrights and licenses may apply).
Please cite this repository as: "Monkeypox repository of the Reference Laboratory for Epidemiological Surveillance of Pathogens in the Canary Islands (accessed on YYYY-MM-DD)". And do not forget to cite the paper (see the section "How to cite" below).
Participating
Want to share your relevant links? Send a direct message to @labcflores, @adrmunozb or @resocios on Twitter (see below).
By AMB @adrmunozb and JMLS @resocios
Follow us on Twitter @labcflores
How to cite this work
This work has been published in Computational and Structural Biotechnology Journal. Please cite it as:
Muñoz-Barrera A, Ciuffreda L, Alcoba-Florez J, et al. Bioinformatic approaches to draft the viral genome sequence of Canary Islands cases related to the multicountry mpox virus 2022-outbreak. Comput Struct Biotechnol J. 2023;21:2197-2203. doi:10.1016/j.csbj.2023.03.020
Update logs
March 28, 2023. Added a "How to cite" section and a reference to our paper published in Computational and Structural Biotechnology Journal.
September 15, 2022. Added an alternative site from BenLangmead to download indexes for Kraken 2, KrakenUniq, and Bracken (see software section for Kraken2).
August 22, 2022. Added a section with references to enrichment of viral DNA by means of Human & Bacterial DNA depletion; references section updated with new entries; public repositories showing new resources.
June 28, 2022. Added the code to illustrate How-to-download seqs and metadata from GenBank.
June 23, 2022. Added a metadata file to use with a Nextstrain-monkeypox local instance in the useful-files section; bioinformatic codes completed.
June 21, 2022. Added a section with other useful external repositories for MPXV.
June 16, 2022. The MultiSample FASTA file now holds 137 public MPXV sequences.
June 13, 2022. Created the public version of this repository. Enjoy the reading! ;=)