Bionano_fixdups is a script for removing artificial duplications introduced by the Bionano Solve scaffolding pipeline. Starting from the negative gaps annotated in the AGP file, the script aligns the contigs flanking each negative gap on its 5' and 3' sides, trims the overlaps, and produces a trimmed FASTA file. The script is experimental, and its development was discontinued after the release of more refined tools such as BiSCoT.
- Minimap2
- Samtools
- Jvarkit samextractclip
- R with the Biostrings package
Bionano_fixdups.R
Rscript ./Bionano_fixdups.R <scaffolds.fasta> <file.agp> <contigs.fasta>
Note: set the paths to the Minimap2, Samtools and samextractclip executables inside the script before running it.
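As a convenience for filling in those paths, the following minimal sketch looks up executables on the current PATH. It is not part of Bionano_fixdups; the helper names `locate_tools` and `missing_tools` are hypothetical, and the executable names `minimap2` and `samtools` are assumed to match your installation. Jvarkit tools are usually shipped as a jar, so samextractclip is left out of the lookup and has to be located by hand.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical helper: map each tool name to its absolute path,
# or to None when it is not found on PATH.
def locate_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    return {name: which(name) for name in names}

# Hypothetical helper: list the tools that could not be found.
def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    return [name for name, path in locate_tools(names, which).items() if path is None]

if __name__ == "__main__":
    # Assumed executable names; adjust them to your installation.
    tools = ["minimap2", "samtools"]
    for name, path in locate_tools(tools).items():
        print(f"{name}: {path if path else 'NOT FOUND on PATH'}")
```

The printed paths can then be copied into the corresponding variables at the top of Bionano_fixdups.R.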
Inputs:
- <scaffolds.fasta>: fasta file with the scaffolds produced by the Bionano hybrid scaffolding pipeline
- <file.agp>: agp file describing which contigs are included in each scaffold
- <contigs.fasta>: fasta file with the contigs as cut by the Bionano hybrid scaffolding pipeline
Outputs:
- <scaffolds_neg_gaps_fixed.fasta>: fasta file with overlaps between contigs trimmed
- logfile_fix_scaffolds.txt: logfile reporting operations performed on input scaffolds
- fix_scaffolding_temp: directory containing temporary files
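To illustrate the first step described at the top (finding negative gaps in the agp file and the contigs that flank them), here is a minimal Python sketch. It is not the code used by Bionano_fixdups.R. It assumes a standard tab-separated AGP 2.0 layout and assumes that a negative gap is written as a gap line of a fixed placeholder length (13 bp here, which is an assumption to verify against your own agp file). The names `find_flagged_gaps`, `FlaggedGap` and `ASSUMED_NEGATIVE_GAP_LEN` are hypothetical.

```python
from typing import Iterable, List, NamedTuple

# ASSUMPTION: negative gaps appear as gap lines of this fixed length.
# Check your own agp file and change the value if needed.
ASSUMED_NEGATIVE_GAP_LEN = 13

GAP_TYPES = ("N", "U")  # AGP component types that denote gaps


class FlaggedGap(NamedTuple):
    scaffold: str
    gap_start: int
    gap_end: int
    left_contig: str   # contig immediately before the gap
    right_contig: str  # contig immediately after the gap


def find_flagged_gaps(
    agp_lines: Iterable[str], gap_len: int = ASSUMED_NEGATIVE_GAP_LEN
) -> List[FlaggedGap]:
    # Keep data rows only; AGP comment lines start with '#'.
    rows = [
        line.rstrip("\n").split("\t")
        for line in agp_lines
        if line.strip() and not line.startswith("#")
    ]
    flagged = []
    # A flanked gap can never be the first or the last row.
    for i in range(1, len(rows) - 1):
        prev_row, row, next_row = rows[i - 1], rows[i], rows[i + 1]
        # Column 5 is the component type; column 6 is the gap length on gap rows.
        if row[4] not in GAP_TYPES or int(row[5]) != gap_len:
            continue
        same_scaffold = prev_row[0] == row[0] == next_row[0]
        contigs_on_both_sides = (
            prev_row[4] not in GAP_TYPES and next_row[4] not in GAP_TYPES
        )
        if same_scaffold and contigs_on_both_sides:
            # On component rows, column 6 holds the contig identifier.
            flagged.append(
                FlaggedGap(row[0], int(row[1]), int(row[2]), prev_row[5], next_row[5])
            )
    return flagged


# Made-up example records, for illustration only.
example_agp = [
    "##agp-version\t2.0",
    "scaf1\t1\t1000\t1\tW\tctgA\t1\t1000\t+",
    "scaf1\t1001\t1013\t2\tN\t13\tscaffold\tyes\tmap",
    "scaf1\t1014\t2513\t3\tW\tctgB\t1\t1500\t-",
    "scaf1\t2514\t3013\t4\tN\t500\tscaffold\tyes\tmap",
    "scaf1\t3014\t4013\t5\tW\tctgC\t1\t1000\t+",
]
example_gaps = find_flagged_gaps(example_agp)
```

With the made-up records above, only the 13 bp gap between ctgA and ctgB is flagged; the 500 bp gap is ignored. Each flagged pair would be a candidate for the alignment and trimming steps, which this sketch does not attempt.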