Resequencing of many accessions with BYU

Primary language: R. License: GNU General Public License v3.0 (GPL-3.0).

BYUReseq

The two cultivated allopolyploid cotton species, Gossypium hirsutum and G. barbadense, represent a remarkable example of parallel independent domestication, each involving dramatic morphological transformation under directional human selection, from wild perennial plants to annualized row crops. Using deep resequencing of 643 newly sampled accessions spanning the wild-to-domesticated continuum of both species, combined with existing data and the inclusion of all other allopolyploid relatives, we resolve species relationships and elucidate multiple aspects of the parallel domestication process. We confirm that wild forms of G. hirsutum and G. barbadense were initially domesticated in the Yucatan Peninsula and northwestern South America, respectively, from where they spread under domestication during their 4,000-8,000 year history to encompass most of the New World tropics. We present a robust phylogenomic analysis of infraspecific relationships in each species, quantify genetic diversity in both species, and describe genetic bottlenecks associated with domestication and subsequent diffusion. As the two species became sympatric over the last several millennia, pervasive and genome-wide bidirectional introgression occurred, often with striking asymmetries with respect to the two co-resident genomes of these allopolyploids. Genomic diversity scans revealed genome-wide regions and genes unknowingly targeted during the domestication process, and additional asymmetries with respect to the two subgenomes. Our new genome-scale understanding of genetic variation provides a comprehensive depiction of the origin, divergence, and adaptation of cotton, and should serve as a rich resource for cotton improvement.

Here, we employed high-coverage whole-genome resequencing for 643 accessions of polyploid cotton and analyzed a total of 1,024 accessions for the present study: 795 accessions of G. hirsutum, 201 accessions of G. barbadense, and 28 other tetraploids (the remaining accessions are from Page et al. 2016; Fang et al. 2017a, 2017b; Wang et al. 2017). Our sampling is distinguished from previous studies by including representatives of all seven tetraploid species (Grover et al. 2015b; Wendel and Grover 2015; Gallagher et al. 2017), but particularly by its extensive sampling of G. barbadense and G. hirsutum populations spanning the wild-to-domesticated continuum. We used these data to address the following questions: (1) How much diversity exists in the wild, landrace, and modern cultivated gene pools of the two most agronomically important cotton species? (2) What inferences can be made about the geographic origin of domestication, and how much of the wild diversity has been captured following the multiple genetic bottlenecks accompanying domestication and improvement? (3) How much of the winnowing of variation has been counteracted by historical, human-mediated interspecific gene flow between the two species, which became sympatric in parts of their indigenous ranges following dispersal from their ancestral homes? In addition, we used these data to detect historically important genomic regions of domestication and of historical interspecific introgression.

This repo hosts methods, parameters, and data files relevant to this project.

Data availability: All reads are deposited in the NCBI SRA under project accession no. PRJNA414461 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA414461).

How to cite: Yuan, D., Grover, C. E., Hu, G., Pan, M., Miller, E. R., Conover, J. L., Hunt, S. P., Udall, J. A., Wendel, J. F., Parallel and Intertwining Threads of Domestication in Allopolyploid Cotton. Adv. Sci. 2021, 2003634. https://doi.org/10.1002/advs.202003634