CPC

Chinese Pangenome Consortium (Phase I)

Introduction

Although the reference human genome sequence has served as the foundation for genetic and biomedical research and applications over the last two decades, there is a broad consensus that no single reference sequence can represent the genomic diversity of global populations.

On the one hand, high-quality population-specific and haplotype-resolved genome references are necessary for genetic and medical analysis. On the other hand, there is a clear need to shift from a single reference to a pangenome representation that better captures genomic diversity, that is, allelic variation within and across human populations.

Here, we present the first effort (Phase I) of the Chinese Pangenome Consortium (CPC): a draft CPC pangenome reference based on 116 high-quality haplotype-resolved assemblies from 58 core samples representing 36 minority Chinese ethnic groups and 6 assemblies of the Han Chinese.

Files

  • CPC pangenome reference

    • Pangenome references built from the CPC core samples alone, and from the CPC core samples combined with the HPRC samples, are freely available from the POG website.
  • Haplotype-resolved assemblies generated by hifiasm (.fasta)

    • Assemblies of 57 samples built using HiFi reads only (including low-quality assemblies of 10 non-core samples).

    • Assemblies of another 11 samples built using HiFi reads and paired-end Hi-C reads.

The above files are available as described in the "Data availability" section of the paper "A pangenome reference of 36 Chinese populations".
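
After downloading an assembly, a quick sanity check on the FASTA file can be useful. The following is a minimal sketch in Python, not part of the CPC pipeline: it assumes a plain or gzip-compressed FASTA file and reports contig lengths and N50. The function names `read_contig_lengths` and `n50` are illustrative and do not come from the CPC or hifiasm tooling.

```python
import gzip
from pathlib import Path


def read_contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file (plain or .gz)."""
    path = Path(fasta_path)
    # Assumption: compressed files carry a ".gz" suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    lengths = []
    current = None  # length of the record being read, None before the first header
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A header line closes the previous record and opens a new one.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif line and current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input
```

For example, `n50(read_contig_lengths("sample.hap1.fasta"))` would give the N50 of one haplotype assembly, where `sample.hap1.fasta` is a hypothetical file name standing in for a downloaded assembly.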

Pipeline

The processing flow and details can be obtained from the protocol.

Publication

Gao, Y., Yang, X., Chen, H. et al. A pangenome reference of 36 Chinese populations. Nature (2023). https://doi.org/10.1038/s41586-023-06173-7

Contact

See also

Chinese Pangenome Consortium CPC
Human Population Omics Group HumPOG