Calling SNPs using UNEAK (de novo) pipeline for ponderosa pine

UNEAK

Using the UNEAK (de novo) pipeline to call SNPs from GBS raw data of 94 ponderosa pines (Pinus ponderosa).

Software

Input Files

  • Raw FASTQ file produced by GBS (genotyping by sequencing) with the restriction enzyme ApeKI
  • Barcode information
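
Before running the pipeline it can help to confirm how many samples the barcode file describes. The sketch below assumes a tab-delimited key file with one header row and the sample name in the fourth column, which is a common layout for TASSEL GBS key files. The column position, the function name `count_samples`, and the path in the usage line are assumptions, not taken from this repository.

```shell
#!/bin/sh
# Count distinct sample names in a barcode key file.
# Assumptions (not from this repository): tab-delimited, one header row,
# sample name in column 4.
count_samples() {
    awk -F '\t' 'NR > 1 && $4 != "" { seen[$4] = 1 }
                 END { n = 0; for (s in seen) n++; print n }' "$1"
}

# Usage (hypothetical path): count_samples key/barcode_key.txt
```

Comparing this number with the count of .cnt files from step 2 is a quick way to spot a sample that produced no reads.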

Output File

VCF file

Step 1: create working directory

  • Code: S1_crtDir.sh
  • Plugin: -UCreatWorkingDirPlugin
  • Output: Eight folders
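
As a rough check on this step, the sketch below lists any of the eight folders that are missing from a working directory. The folder names follow the layout described in the TASSEL 3 UNEAK documentation and should be treated as an assumption here. The commented command and its path are placeholders rather than the contents of S1_crtDir.sh.

```shell
#!/bin/sh
# Placeholder invocation (assumed form, not copied from S1_crtDir.sh):
#   run_pipeline.pl -fork1 -UCreatWorkingDirPlugin -w /path/to/workdir -endPlugin -runfork1

# Folder names as described in the TASSEL 3 UNEAK documentation (assumption).
UNEAK_DIRS="Illumina key tagCounts mergedTagCounts tagPair tagsByTaxa mapInfo hapMap"

# Print the name of every expected folder that is absent from $1.
missing_uneak_dirs() {
    for d in $UNEAK_DIRS; do
        [ -d "$1/$d" ] || echo "$d"
    done
}

# Usage: missing_uneak_dirs /path/to/workdir   (no output means all eight exist)
```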

Step 2: fastq to tag count

  • Code: S2_fqtotag.sh
  • Input: barcode file and the raw FASTQ file
  • Plugin: -UFastqToTagCountPlugin
  • Output: 96 tag count .cnt files
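
A minimal sketch of what such a call might look like, assuming TASSEL 3's `run_pipeline.pl` launcher is used. The helper prints the command instead of running it, so it works without TASSEL installed. The helper names, the working-directory path, and the `tagCounts` folder name are hypothetical or assumed, and the flags are not copied from S2_fqtotag.sh.

```shell
#!/bin/sh
# Print (rather than run) a TASSEL command for one UNEAK plugin.
# $1 = plugin name, $2 = working directory, remaining args = plugin options.
uneak_cmd() {
    plugin=$1
    workdir=$2
    shift 2
    cmd="run_pipeline.pl -fork1 -$plugin -w $workdir"
    [ $# -gt 0 ] && cmd="$cmd $*"
    echo "$cmd -endPlugin -runfork1"
}

# Count the .cnt files in the tagCounts folder (folder name assumed).
count_cnt_files() {
    find "$1/tagCounts" -name '*.cnt' -type f | wc -l | tr -d ' '
}

# Step 2 with the ApeKI enzyme named in the input description.
uneak_cmd UFastqToTagCountPlugin /path/to/workdir -e ApeKI
```

After a real run, `count_cnt_files /path/to/workdir` would be expected to report 96 for this data set.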

Step 3: merge tag count

  • Code: S3_mergetag.sh
  • Input: 96 tag count .cnt files
  • Plugin: -UMergeTaxaTagCountPlugin
  • Minimum read count: 1
  • Output: 1 merged tag count .cnt file

Step 4: merged tag count to tag pairs

  • Code: S4_tagctotagp.sh
  • Input: 1 merged tag count .cnt file
  • Plugin: -UTagCountToTagPairPlugin
  • QS: 0.01
  • Output: 1 tag pair .tps file

Step 5: tag pairs to tbt

  • Code: S5_tagptotbt.sh
  • Input: 1 tag pair .tps file from step 4, 96 tag count .cnt files from step 2
  • Plugin: -UTagPairToTBTPlugin
  • Output: a tag by taxa .bin file

Step 6: tbt to map info

  • Code: S6_tbttomapinf.sh
  • Input: 1 tag pair .tps file from step 4, a tag by taxa .bin file from step 5
  • Plugin: -UTBTToMapInfoPlugin
  • Output: a map information .bin file
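
Steps 3 to 6 each consume the previous step's output, so they have to run in order and stop at the first failure. The following dry-run sketch illustrates that ordering. It assumes the `run_pipeline.pl` launcher, that the minimum count of 1 in step 3 is passed as `-c`, and that the `QS: 0.01` setting in step 4 corresponds to the plugin's `-e` option. Those flag mappings, the `WORKDIR` path, and the `run_step` helper are assumptions, not taken from the repository's scripts.

```shell
#!/bin/sh
WORKDIR=${WORKDIR:-/path/to/workdir}   # placeholder path
DRY_RUN=${DRY_RUN:-1}                  # 1 = only print the commands

# Print one plugin command and, unless DRY_RUN=1, execute it.
run_step() {
    plugin=$1
    shift
    opts=""
    [ $# -gt 0 ] && opts=" $*"
    echo "run_pipeline.pl -fork1 -$plugin -w $WORKDIR$opts -endPlugin -runfork1"
    if [ "$DRY_RUN" != 1 ]; then
        run_pipeline.pl -fork1 "-$plugin" -w "$WORKDIR" "$@" -endPlugin -runfork1
    fi
}

# Each step starts only if the previous one succeeded.
run_steps_3_to_6() {
    run_step UMergeTaxaTagCountPlugin -c 1 &&      # step 3; -c mapping assumed
    run_step UTagCountToTagPairPlugin -e 0.01 &&   # step 4; -e mapping assumed
    run_step UTagPairToTBTPlugin &&                # step 5
    run_step UTBTToMapInfoPlugin                   # step 6
}

run_steps_3_to_6
```

Setting `DRY_RUN=0` would execute the commands, which requires TASSEL on the `PATH`.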

Step 7: map info to hapmap

  • Code: S7_mapinftohm.sh
  • Input: a map information .bin file
  • mnMAF (minimum minor allele frequency): 0.01
  • minimum read count: 1
  • minimum call rate: 0.1
  • maximum call rate: 1
  • Plugin: -UMapInfoToHapMapPlugin
  • Output: a .hmp file, a .hmc file, a .fas file
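
The filter thresholds above might translate into plugin options roughly as follows. This sketch only prints the command. The mapping of minimum and maximum call rate to `-mnC` and `-mxC` is assumed from the TASSEL 3 UNEAK documentation, and it is an assumption that S7_mapinftohm.sh passes them this way. The minimum read count setting is left out because its flag is not stated, and `WORKDIR` is a placeholder.

```shell
#!/bin/sh
WORKDIR=${WORKDIR:-/path/to/workdir}   # placeholder path

# Thresholds listed for step 7.
MIN_MAF=0.01        # mnMAF
MIN_CALL_RATE=0.1   # minimum call rate
MAX_CALL_RATE=1     # maximum call rate

# Print the assumed step 7 command; the -mnC/-mxC mapping is an assumption.
hapmap_cmd() {
    echo "run_pipeline.pl -fork1 -UMapInfoToHapMapPlugin -w $WORKDIR" \
         "-mnMAF $MIN_MAF -mnC $MIN_CALL_RATE -mxC $MAX_CALL_RATE" \
         "-endPlugin -runfork1"
}

hapmap_cmd
```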

Step 8: change hapmap file to vcf file

  • Code: hmptovcf_denovo.sh
  • Input: a .hmp file
  • Output: a .vcf file
  • hmptovcf_denovo.out
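
One way to sanity-check the conversion is to compare the number of sites in the two files. The sketch assumes the HapMap file has a single header line and that VCF header lines start with `#`. The function names and the file names in the usage line are placeholders, not part of hmptovcf_denovo.sh.

```shell
#!/bin/sh
# Number of sites in a HapMap file: every non-empty line after the header line.
hmp_sites() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Number of records in a VCF file: every non-empty line that is not a '#' header.
vcf_sites() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Succeed only if both files describe the same number of sites.
same_site_count() {
    [ "$(hmp_sites "$1")" = "$(vcf_sites "$2")" ]
}

# Usage (placeholder names):
#   same_site_count HapMap.hmp.txt output.vcf && echo "site counts match"
```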