Several folks are interested, trying to centralize the info.
conda update -n base -c defaults conda
conda create -n bonito python=3.8 pip
conda activate --stack bonito
pip install --extra-index-url https://download.pytorch.org/whl/cu116 ont-bonito
pip install -r bonito/requirements.txt
Change cu116 to your CUDA version (run nvidia-smi to find it, it’ll be on the top right)
bonito download --models --show
bonito download --models --all
You can cancel it at the training data (I think?). My gpugpu machine has very little hard drive space so I did.
bonito basecaller dna_r9.4.1_e8_sup@v3.3 ~/stonefly/all_fast5/ --batchsize 384 --reference index.mmi --save-ctc --recursive --device "cuda:0" --alignment-threads 16 > basecalled-default-model/basecalls.bam
bonito basecaller dna_r9.4.1_e8_sup@v3.3 ~/stonefly/all_fast5/ --batchsize 384 --reference index.mmi --save-ctc --recursive --device "cuda:0" --alignment-threads 16 | samtools view -S -b - > basecalls.bam
Note
|
It has to be in the order as above, or the ctc will not save! basecaller model filepath THEN options. |
Default batch size is 32 (I’m 90% certain). Best to try and increase it. When it crashes from memory, best to killall bonito from another shell. I can get batchsize of 384 and it cuts a little over half an hour off on my dataset (51Gb of fast5 files). And we can work with that number on the following steps.
This led to this error:
> reading fast5 outputting aligned sam loading model dna_r9.4.1_e8_sup@v3.3 loading reference Exception in thread Thread-27: Traceback (most recent call last): File "/home/josephguhlin/.asdf/installs/python/anaconda3-2021.05/envs/bonito/lib/python3.8/threading.py", line 932, in _ bootstrap_inner self.run() File "/home/josephguhlin/.asdf/installs/python/anaconda3-2021.05/envs/bonito/lib/python3.8/site-packages/bonito/io.py", line 563, in run np.save(os.path.join(output_directory, "chunks.npy"), chunks) File "<__array_function__ internals>", line 5, in save File "/home/josephguhlin/.asdf/installs/python/anaconda3-2021.05/envs/bonito/lib/python3.8/site-packages/numpy/lib/npyio .py", line 525, in save file_ctx = open(file, "wb") FileNotFoundError: [Errno 2] No such file or directory: '/proc/2487219/fd/chunks.npy' > completed reads: 2393454 duration: 1:59:37 samples per second 3.3E+06 > done
The problem was samtools. Bonito tries to be smart about the output directory while using pipes, but this breaks using pipes for anything other than file output.
Update
Crashing from memory or shoddy connection to the server. Copying a subset over and trying again.
This is the type of log I’m getting for training, so a very slight improvement. Now trying without using a pretrained model.
[366841/366841]: 100%|######################################################| [1:40:59, loss=0.0226]
[epoch 1] directory=./aptera_tuned loss=0.0231 mean_acc=99.336% median_acc=99.344%
[366841/366841]: 100%|######################################################| [1:40:46, loss=0.0213]
[epoch 2] directory=./aptera_tuned loss=0.0224 mean_acc=99.349% median_acc=99.358%
[366841/366841]: 100%|######################################################| [1:40:53, loss=0.0211]
[epoch 3] directory=./aptera_tuned loss=0.0220 mean_acc=99.355% median_acc=99.364%
[366841/366841]: 100%|######################################################| [1:40:50, loss=0.0209]
[epoch 4] directory=./aptera_tuned loss=0.0217 mean_acc=99.359% median_acc=99.369%
[366841/366841]: 100%|######################################################| [1:40:38, loss=0.0207]
[epoch 5] directory=./aptera_tuned loss=0.0215 mean_acc=99.362% median_acc=99.371%
[366841/366841]: 100%|######################################################| [1:40:26, loss=0.0196]
[epoch 6] directory=./aptera_tuned loss=0.0213 mean_acc=99.366% median_acc=99.372%
[366841/366841]: 100%|######################################################| [1:40:39, loss=0.0195]
[epoch 7] directory=./aptera_tuned loss=0.0212 mean_acc=99.368% median_acc=99.376%
[366841/366841]: 100%|######################################################| [1:40:55, loss=0.0186]
[epoch 8] directory=./aptera_tuned loss=0.0211 mean_acc=99.369% median_acc=99.374%
[366841/366841]: 100%|######################################################| [1:40:20, loss=0.0189]
[epoch 9] directory=./aptera_tuned loss=0.0211 mean_acc=99.371% median_acc=99.375%
[366841/366841]: 100%|######################################################| [1:40:05, loss=0.0182]
[epoch 10] directory=./aptera_tuned loss=0.0210 mean_acc=99.373% median_acc=99.379%
[366841/366841]: 100%|######################################################| [1:39:59, loss=0.0182]
[epoch 11] directory=./aptera_tuned loss=0.0210 mean_acc=99.371% median_acc=99.378%
[63540/366841]: 17%|#########8 | [17:20, loss=0.0179]
josephguhlin in joseph-gpugpu in nanopore-basecaller-training on main [?] via 🅒 bonito took 3s
❯ bash evaluate_trained.bash (bonito)
* loading data
[validation set not found: splitting training set]
* loading model 11
* calling
* mean 99.41%
* median 99.42%
* time 1.02
* samples/s 9.78E+06
josephguhlin in joseph-gpugpu in nanopore-basecaller-training on main [?] via 🅒 bonito took 4s
❯ bash evaluate_default.bash (bonito)
* loading data
[validation set not found: splitting training set]
* loading model 1
* calling
* mean 99.19%
* median 99.18%
* time 1.03
* samples/s 9.72E+06
josephguhlin in joseph-gpugpu in nanopore-basecaller-training on main [?] via 🅒 bonito took 4s
So some small improvement