giab_data_indexes

This repository contains data indexes from NIST's Genome in a Bottle (GIAB) project. The sequence and alignment indexes are also available at: ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data_indexes/


AshkenazimTrio
| Sequencing Platform | Sequence Index or Alignment Index |
| --- | --- |
| Illumina WGS 2x150bp 300X per individual | sequence.index.AJtrio_Illumina300X_wgs_07292015, alignment.index.AJtrio_Illumina300X_wgs_novoalign_GRCh37_GRCh38_NHGRI_07282015 |
| Illumina 6KB Matepair | sequence.index.AJtrio_Illumina_6kb_matepair_wgs_08032015, alignment.index.AJtrio_Illumina_6kb_matepair_wgs_bwamem_GRCh37_07302015 |
| Illumina WGS 2X250bp | sequence.index.AJtrio_Illumina_2x250bps_06012016, alignment.index.AJtrio_Illumina_2x250bps_isaac-align_hg19_06012016, alignment.index.AJtrio_Illumina_2x250bps_novoalign_GRCh37_GRCh38_NHGRI_06062016 |
| Moleculo | sequence.index.AJtrio_NIST_Stanford_Moleculo_125bps_08042015 |
| PacBio 70x/30x/30x | sequence.index.AJtrio_PacBio_MtSinai_NIST_hdf5_10102018, alignment.index.AJtrio_PacBio_MSSM_blasr_GRCh37_11192015, alignment.index.AJtrio_PacBio_CSHL_bwamem_GRCh37_11192015 |
| Oxford Nanopore | sequence.index.AJtrio_HG002_Cornell_Oxford_Nanopore_fasta_fastq_10132015 |
| SOLiD 60x for son | sequence.index.AJtrio_HG002_NIST_SOLiD5500W_xsq_09042015, alignment.index.AJtrio_HG002_SOLiD5500W_NIST_LifeScope_GRCh37_12212015 |
| Illumina Whole Exome by Oslo Uni. Hospital | alignment.index.AJtrio_OsloUniversityHospital_IlluminaExome_bwamem_GRCh37_11252015 |
| Ion Proton 1000x Exome | alignment.index.AJtrio_IonTorrent_exome_TMAP_GRCh37_07292015 |
| 10X Genomics | alignment.index.AJtrio_10XGenomics_bwamem_GRCh37_08142015 |
| 10X Genomics ChromiumGenome | alignment.index.AJtrio_10Xgenomics_ChromiumGenome_GRCh37_GRCh38_06202016 |
| CompleteGenomics | alignment.index.AJtrio_CompleteGenomics_normal_RMDNA_EvidenceBams_GRCh37_09282015 |
| CompleteGenomics LFR | CG LFR raw or alignment data not available, but analysis results available under: ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/CompleteGenomics_newLFR_CGAtools_06122015/ |
| BioNano | sequence.index.AJtrio_BioNano_bnx_10012015, alignment.index.AJtrio_BioNano_xmap_cmap_GRC37_10012015 |

ChineseTrio
| Sequencing Platform | Sequence Index or Alignment Index |
| --- | --- |
| Illumina WGS 2x250bp 300X for son and 2x150bp 100x for parents | sequence.index.ChineseTrio_Illumina300X100X100X_wgs_09232015, alignment.index.ChineseTrio_Illumina300X100X_wgs_novoalign_GRCh37_GRCh38_NHGRI_04062016 |
| Illumina 6KB Matepair | sequence.index.ChineseTrio_Illumina_6kb_matepair_wgs_09232015 |
| Moleculo | sequence.index.ChineseTrio_NIST_Stanford_Moleculo_125bps_09232015 |
| SOLiD 60x for son | sequence.index.ChineseTrio_HG005_NIST_SOLiD5500W_xsq_09042015, alignment.index.ChineseTrio_HG005_SOLiD5500W_NIST_LifeScope_GRCh37_12212015 |
| CompleteGenomics | alignment.index.ChineseTrio_CompleteGenomics_normal_RMDNA_EvidenceBams_GRCh37_09282015, alignment.index.ChineseTrio_HG005_CompleteGenomics_normal_cellsDNA_EvidenceBams_GRCh37_09282015 |
| Illumina Whole Exome by Oslo Uni. Hospital | alignment.index.Chinesetrio_HG005_OsloUniversityHospital_IlluminaExome_bwamem_GRCh37_11252015 |
| Ion Proton 1000x Exome | alignment.index.ChineseTrio_HG005_IonTorrent_exome_TMAP_GRCh37_09232015 |
| BioNano for son | sequence.index.ChineseTrio_HG005_BioNano_bnx_10012015, alignment.index.ChineseTrio_HG005_BioNano_xmap_cmap_GRC37_10012015 |
| PacBio Sequel for the trio | sequence.index.ChineseTrio_NIST_MtSinai_PacBio_Sequel_fasta_09282018 |

NA12878
| Sequencing Platform | Sequence Index or Alignment Index |
| --- | --- |
| Illumina WGS 2x150bp 300X | sequence.index.NA12878_Illumina300X_wgs_09252015, alignment.index.NA12878_HiSeq_downsampled30X_GRCh37_10262015, alignment.index.NA12878_Illumina300X_wgs_novoalign_GRCh37_GRCh38_NHGRI_03082016 |
| Illumina HiSeq Exome | sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_fastq_09252015, sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_trimmed_fastq_09252015, alignment.index.NA12878_HiSeq_Exome_Garvan_GRCh37_09252015 |
| Illumina TruSeq Exome | alignment.index.NA12878_TruSeq_Exome_Nebraska_GRCh37_09252015 |
| PacBio 40x | sequence.index.NA12878_PacBio_MtSinai_NIST_hdf5_08182015 |
| 10X Genomics | alignment.index.NA12878_10XGenomics_bwamem_GRCh37_08142015, alignment.index.NA12878_10XGenomics_sizeselected_bwamem_GRCh37_03082016 |
| 10X Genomics ChromiumGenome | alignment.index.NA12878_10Xgenomics_ChromiumGenome_LongRanger2.0_GRCh37_GRCh38_06202016, alignment.index.NA12878_10Xgenomics_ChromiumGenome_LongRanger2.1_GRCh37_GRCh38_09302016 |
| CompleteGenomics | alignment.index.NA12878_CompleteGenomics_normal_RMDNA_EvidenceBams_GRCh37_09282015 |
| CompleteGenomics LFR | CG LFR raw or alignment data not available, but analysis results available under: ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/analysis/CompleteGenomics_newLFR_CGAtools_06122015/ |
| Ion Proton 1000x Exome | alignment.index.NA12878_IonTorrent_exome_TMAP_GRCh37_09252015 |
| NA12878 SOLiD5500W | alignment.index.NA12878_SOLiD5500W_NIST_LifeScope_GRCh37_06012016 |
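
The index names listed above appear to share one pattern: a `sequence.index.` or `alignment.index.` prefix, a descriptive label, and a trailing eight-digit stamp that reads like a date in MMDDYYYY form. The Python sketch below splits a name along those lines. The pattern is inferred from the names in this README rather than documented here, and the helper name `parse_index_name` is hypothetical.

```python
import re
from datetime import date

# Assumed pattern (inferred from the names above): <kind>.index.<label>_<MMDDYYYY>
_INDEX_NAME = re.compile(
    r"^(sequence|alignment)\.index\.(.+)_(\d{2})(\d{2})(\d{4})$"
)

def parse_index_name(name):
    """Split an index file name into (kind, label, date). Hypothetical helper."""
    match = _INDEX_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized index name: {name}")
    kind, label, month, day, year = match.groups()
    # date() raises ValueError if the stamp is not a valid calendar date.
    return kind, label, date(int(year), int(month), int(day))

kind, label, stamp = parse_index_name(
    "sequence.index.AJtrio_Illumina300X_wgs_07292015"
)
print(kind, label, stamp.isoformat())
```

Sorting on the parsed date is one way to pick the most recent index when a platform lists several.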

Please Note:
1. To analyze raw sequencing data (fastq, fasta, hdf5, xsq, bnx, etc.), use the sequence.index.* files to download the data.
2. To analyze aligned data (bam, xmap/cmap, etc.), use the alignment.index.* files to download the data.
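
As a rough illustration of the two notes above, the Python sketch below collects download URLs from the text of an index file. It assumes the index files are tab-delimited text in which some fields hold URLs. This README does not describe the column layout, so the code scans every field instead of relying on column names. The helper name `extract_urls` and the sample rows are hypothetical.

```python
def extract_urls(index_text):
    """Return every ftp:// or http(s):// field found in tab-delimited text."""
    urls = []
    for line in index_text.splitlines():
        for field in line.split("\t"):
            field = field.strip()
            if field.startswith(("ftp://", "http://", "https://")):
                urls.append(field)
    return urls

# Made-up sample rows for illustration only; real index files may differ.
sample = (
    "FILE\tMD5\n"
    "ftp://example.org/reads_1.fastq.gz\tabc123\n"
    "ftp://example.org/reads_2.fastq.gz\tdef456\n"
)

for url in extract_urls(sample):
    print(url)
```

The resulting list can be passed to whichever download tool you prefer. If the index file also carries checksums, verifying each download against them is a sensible extra step.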