Scripts for implementing read until and other examples.

Primary language: Python. License: MIT.

RUscripts-R9

This is a fork of the original RUscripts, a deprecated system that uses Python 2.7 and targets R7 nanopore data. We extended it to support Python 3.6+, R9.4 nanopore data, and the S/BLOW5 file format. Note that only the offline RUscripts component has been extended.

Setting up dependencies and RUscripts-R9

git clone https://github.com/beebdev/RUscripts-R9
cd RUscripts-R9
python3 -m venv env
source env/bin/activate
pip3 install --upgrade pip
pip3 install numpy==1.18.0 pyslow5 biopython==1.69 scikit-learn==0.20.0 scipy==1.4.0 six==1.16.0 Cython
python3 setup.py install

Running offline RUscripts

To run the updated, software-based offline RUscripts on the example COVID-19 dataset:

python3 OfflineReadUntil.py -f dataset/fasta/nCoV-2019.reference.fasta -t MN908947.3:10000-15000 -p 4 -m models/r9.4_450bps.nucleotide.6mer.template.model -w dataset/ncov-testset/slow5 -o RUgOUT -L 3000 > result.paf
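
The command above redirects the mappings to result.paf. As a minimal sketch of how that file could be inspected, the snippet below assumes the output follows the standard 12-column, tab-separated PAF layout and that unmapped reads carry `*` as the target name; neither assumption is confirmed by this README. The names `PafRecord`, `parse_region`, `parse_paf_line`, `overlaps`, and `count_on_target` are hypothetical helpers, not part of RUscripts-R9. Coordinates are compared as plain half-open intervals with no 0-based/1-based conversion.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class PafRecord(NamedTuple):
    query_name: str
    target_name: str
    target_start: int
    target_end: int


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split a 'name:start-end' string (the form passed to -t) into its parts."""
    name, _, span = region.rpartition(":")
    start, _, end = span.partition("-")
    return name, int(start), int(end)


def parse_paf_line(line: str) -> Optional[PafRecord]:
    """Return the mapping in one PAF line, or None if the read is unmapped."""
    fields = line.rstrip("\n").split("\t")
    # Standard PAF has 12 mandatory columns; column 6 is the target name.
    if len(fields) < 12 or fields[5] == "*":
        return None
    return PafRecord(fields[0], fields[5], int(fields[7]), int(fields[8]))


def overlaps(rec: PafRecord, region: Tuple[str, int, int]) -> bool:
    """True if the mapping is on the region's sequence and the intervals intersect."""
    name, start, end = region
    return rec.target_name == name and rec.target_start < end and rec.target_end > start


def count_on_target(lines: Iterable[str], region_text: str) -> int:
    """Count mapped records that overlap the given target region."""
    region = parse_region(region_text)
    records = (parse_paf_line(line) for line in lines)
    return sum(1 for rec in records if rec is not None and overlaps(rec, region))
```

For example, `count_on_target(open("result.paf"), "MN908947.3:10000-15000")` would report how many reads mapped inside the region given to -t in the command above.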

Testing mapping accuracy

With the venv activated, install UNCALLED with:

pip3 install git+https://github.com/skovaka/UNCALLED.git

Now compare the PAF output generated by offline RUscripts against a truth set produced by aligning with Minimap2 (available at dataset/ncov-testset/batch_0.paf), using UNCALLED pafstats:

uncalled pafstats -r dataset/ncov-testset/batch_0.paf -n 5000 result.paf
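
To illustrate what comparing mappings against a truth set involves, here is a rough sketch. It is not UNCALLED's pafstats implementation, and its categories and overlap rule are assumptions chosen for illustration. Each read is represented by a hypothetical `(target, start, end)` tuple, or `None` when unmapped, and `classify` and `summarize` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# (target name, start, end), or None if the read is unmapped
Mapping = Optional[Tuple[str, int, int]]


def classify(result: Mapping, truth: Mapping) -> str:
    """Label one read by comparing its mapping with the truth mapping."""
    if result is None:
        return "TN" if truth is None else "FN"
    if truth is None:
        return "FP"  # mapped, although the truth set has no location for it
    same_target = result[0] == truth[0]
    intervals_overlap = result[1] < truth[2] and result[2] > truth[1]
    return "TP" if same_target and intervals_overlap else "FP"


def summarize(results: Dict[str, Mapping], truths: Dict[str, Mapping]) -> Counter:
    """Count TP/FP/FN/TN over every read present in the truth set."""
    counts = Counter()
    for read_id, truth in truths.items():
        # Reads missing from the results are treated as unmapped.
        counts[classify(results.get(read_id), truth)] += 1
    return counts
```

Under these assumptions, a read counts as a true positive only when it lands on the same reference sequence as the truth alignment and the two intervals intersect; a stricter or looser rule would change the counts.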