AUTHOR: Dr Asad Prodhan https://asadprodhan.github.io/
Re-basecalling old Nanopore runs with Dorado.
- Create and activate a conda environment
conda create -n dorado
conda activate dorado
- Install Nextflow
conda install -c bioconda nextflow
- Run the following command to make sure that Nextflow has been installed
nextflow -h
If you see the Nextflow options, then Nextflow has been installed
- Install Singularity
conda install -c conda-forge singularity
- Run the following command to make sure that Singularity has been installed
singularity -h
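Both checks above can be folded into one helper. This is a minimal sketch, not part of Nextflow or Singularity: the function name check_tools is hypothetical, and it only tests whether each command is on PATH.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the tools installed so far (one line per tool):
# check_tools nextflow singularity
```

The same call can be repeated later with pod5 and dorado added to the list.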
- Install pod5
pip install pod5
- If the pod5 command is not found afterwards, add its location to PATH
export PATH=$PATH:/path/to/the/pod5/executable
- Download and extract Dorado
wget https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.6.2-linux-x64.tar.gz
More details: https://github.com/nanoporetech/dorado?tab=readme-ov-file
tar -zvxf dorado-0.6.2-linux-x64.tar.gz
nano ~/.bashrc
- Add the following line to the .bashrc profile
export PATH=$PATH:/xx/xx/bin/dorado-0.6.2-linux-x64/bin
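A plain export line appends the directory again every time .bashrc is re-sourced. A small guard avoids the duplicates. This is only a sketch: the function name add_to_path is hypothetical, and the directory is whatever path you extracted Dorado to.

```shell
# Append a directory to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$PATH:$1" ;;        # not present yet, append it
    esac
    export PATH
}

# Example line for .bashrc (same placeholder path as above):
# add_to_path /xx/xx/bin/dorado-0.6.2-linux-x64/bin
```

After editing, run "source ~/.bashrc" or open a new terminal so the change takes effect.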
pod5 convert fast5 sample01.fast5 --output sample01.pod5
Loop function:
for FILE in ./fast5/*.fast5; do FILENAME=$(basename "$FILE" .fast5); pod5 convert fast5 "$FILE" --output "./pod5/${FILENAME}.pod5"; done
Note: this command can create the pod5 directory itself
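When a conversion run is interrupted, it helps to see which files are still pending before starting again. The sketch below only lists the work and converts nothing. The function name plan_conversions is hypothetical, and the ./fast5 and ./pod5 directory names follow the loop above.

```shell
# Print "input<TAB>output" for every .fast5 file in $1 that has no
# matching .pod5 file in $2 yet. Nothing is converted here.
plan_conversions() {
    in_dir=$1
    out_dir=$2
    for f in "$in_dir"/*.fast5; do
        [ -e "$f" ] || continue          # glob matched nothing
        name=$(basename "$f" .fast5)
        target="$out_dir/$name.pod5"
        if [ -e "$target" ]; then
            continue                     # already converted, skip
        fi
        printf '%s\t%s\n' "$f" "$target"
    done
}

# Feeding the plan into the conversion command used above:
# plan_conversions ./fast5 ./pod5 |
# while IFS="$(printf '\t')" read -r in out; do
#     pod5 convert fast5 "$in" --output "$out"
# done
```

File names that contain tab characters would break the pairing, so this assumes ordinary run file names.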
https://www.youtube.com/watch?v=wBDU0X1ZCco
dorado basecaller sup --emit-fastq barcode07.pod5 > barcode07.fastq
The DNA model and its latest version are selected automatically when you choose a mode: fast, hac, or sup.
Dorado generates an unaligned BAM file by default. Useful information about each read is stored in the BAM tags and can be extracted with the dorado summary command.
To generate a fastq file instead of BAM, use the "--emit-fastq" flag.
Loop function [DNA model and its latest version auto-selected]:
for FILE in ./pod5/*.pod5; do FILENAME=$(basename "$FILE" .pod5); dorado basecaller sup --emit-fastq "$FILE" > "./fastq/${FILENAME}.fastq"; done
Loop function [DNA model hand-picked]:
First, download the model of your interest:
dorado download --list
dorado download --model dna_r9.4.1_e8_sup@v3.6
Then run the loop function:
for FILE in ./pod5/*.pod5; do FILENAME=$(basename "$FILE" .pod5); dorado basecaller --recursive dna_r9.4.1_e8_sup@v3.6 --emit-fastq "$FILE" > "./fastq/${FILENAME}.fastq"; done
The shell redirection (>) cannot create the fastq directory, so create it beforehand (mkdir -p fastq)
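Because the output is written through a shell redirection, a failed or interrupted basecalling run leaves a partial fastq file that looks finished. One way around this is to write to a temporary file and rename it only on success. This is a sketch under that assumption: the function name run_to_file is hypothetical, and it also creates the output directory if needed.

```shell
# Run a command and write its stdout to the file named in $1.
# The output directory is created if needed, and the final file
# only appears when the command exits successfully.
run_to_file() {
    out=$1
    shift
    mkdir -p "$(dirname "$out")" || return 1
    if "$@" > "$out.tmp"; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"                 # discard the partial output
        return 1
    fi
}

# Inside the loops above, the redirection would become:
# run_to_file "./fastq/${FILENAME}.fastq" dorado basecaller sup --emit-fastq "$FILE"
```

Any fastq file present in ./fastq afterwards then comes from a run that completed.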
dorado summary barcode07.bam > barcode07_seq_info.tsv
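The summary file is a tab-separated table with a header line, so quick statistics can be pulled out with awk. The sketch below looks a column up by name and prints the row count and the mean. The function name tsv_column_mean is hypothetical, and the column name sequence_length_template in the example is an assumption; check the header of your own file first (head -n 1 barcode07_seq_info.tsv).

```shell
# Print "<rows><TAB><mean>" for the column named $2 in the
# tab-separated file $1, whose first line is a header.
tsv_column_mean() {
    awk -F '\t' -v col="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { sum += $c; n++ }
        END { if (c && n) printf "%d\t%.1f\n", n, sum / n }
    ' "$1"
}

# Example (column name is an assumption, see above):
# tsv_column_mean barcode07_seq_info.tsv sequence_length_template
```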
Alternatively, you can use the fastqc program
For de-multiplexing, all fastq files need to be concatenated into one file first. Otherwise, each pass of a loop overwrites the barcode files written by the previous pass
dorado demux --kit-name SQK-RBK004 --emit-fastq --output-dir ./barcodes beforeDemuxed.fastq
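The concatenation step before demultiplexing could look like the sketch below. The function name concat_fastq is hypothetical. The output file should sit outside the input directory so that it is not picked up as an input on a second run.

```shell
# Concatenate every .fastq file in directory $1 into the single file $2.
# Fails with a message if the directory holds no .fastq files.
concat_fastq() {
    in_dir=$1
    out=$2
    set -- "$in_dir"/*.fastq             # expand the list of inputs
    if [ ! -e "$1" ]; then
        echo "no .fastq files found in $in_dir" >&2
        return 1
    fi
    cat "$@" > "$out"
}

# Example, producing the input file used by the demux command above:
# concat_fastq ./fastq beforeDemuxed.fastq
```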
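Loop function to check the number of records in each fastq file (a record spans four lines; counting "@" alone would also match quality lines that contain "@")
for FILE in ./*.fastq; do echo "$FILE: $(( $(wc -l < "$FILE") / 4 ))"; done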
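The "@" character is also a valid quality score symbol, so a record count is safer when derived from the line count. The sketch below additionally flags truncated files. It assumes four-line fastq records (no wrapped sequence lines), and the function name fastq_record_count is hypothetical.

```shell
# Print the number of records in the fastq file $1, assuming four
# lines per record. Fails if the line count is not a multiple of 4,
# which usually points to a truncated file.
fastq_record_count() {
    awk -v name="$1" '
        END {
            if (NR % 4 != 0) {
                print name ": " NR " lines, not a multiple of 4" > "/dev/stderr"
                exit 1
            }
            print NR / 4
        }
    ' "$1"
}

# Example over all demultiplexed files:
# for FILE in ./barcodes/*.fastq; do echo "$FILE: $(fastq_record_count "$FILE")"; done
```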
Further reading