mkandziora/PhylUp

example_different_levels problems

josephwb opened this issue · 2 comments

Welp, I am having trouble with this one too. Each of the ./data/localblast_*.config files pointed to the same database ("/media/blubb/schmuh/local_blast_db_new/"), so I changed them all to point to my own. The output below warns that I am using the same blast folder across runs, but I think this is okay here since it is the same locus?

$ time python example_different_levels.py 
Workflow runs with 8 threads.
REMEMBER TO UPDATE THE NCBI DATABASES REGULARLY! Looks like you last updated it 4 days ago.
Run the file 'update_databases.py' from the data folder to automatically update. Note, that you are accessing a US government website to do so.
Translate input names to ncbi taxonomy...
Build table with information about sequences and taxa.
Initialize NODES and NAMES!!
Load sequences...
Begin update: 2021-04-09 10:07:09.592309.

Clean the input data: data/tiny_test_example/test.tre, data/tiny_test_example/test.fas.
Format mrca: {18794} - Senecio
Blast round number 0. Max. rounds set to 10000.
Find new sequences using the BLAST database.
Blast 1 out of 5, subsample size is 1.
Warning: [blastn] Number of threads was reduced to 4 to match the number of available CPUs
Blast 2 out of 5, subsample size is 1.
Warning: [blastn] Number of threads was reduced to 4 to match the number of available CPUs
Blast 3 out of 5, subsample size is 1.
Warning: [blastn] Number of threads was reduced to 4 to match the number of available CPUs
Blast 4 out of 5, subsample size is 1.
Warning: [blastn] Number of threads was reduced to 4 to match the number of available CPUs
Blast 5 out of 5, subsample size is 1.
Warning: [blastn] Number of threads was reduced to 4 to match the number of available CPUs
Length of new seqs before filtering: 52
Length of new seqs after filtering: 0
Workflow runs with 8 threads.
You are using the same blast folder across runs (/home/josephwb/Downloads/PhylUp/data/blast) - be careful. Make sure it is the same locus and that you did not change your blast settings.
REMEMBER TO UPDATE THE NCBI DATABASES REGULARLY! Looks like you last updated it 4 days ago.
Run the file 'update_databases.py' from the data folder to automatically update. Note, that you are accessing a US government website to do so.
Traceback (most recent call last):
  File "example_different_levels.py", line 27, in <module>
    test = phyl_up.PhylogeneticUpdater(id_to_spn, seqaln, mattype, trfn, schema_trf, conf, ignore_acc_list=ignore_acc)
  File "/home/josephwb/Downloads/PhylUp/PhylUp/phyl_up.py", line 53, in __init__
    self.aln = DnaCharacterMatrix.get(path=self.aln_fn, schema=self.aln_schema)
  File "/home/josephwb/.local/lib/python3.6/site-packages/dendropy/datamodel/charmatrixmodel.py", line 606, in get
    return cls._get_from(**kwargs)
  File "/home/josephwb/.local/lib/python3.6/site-packages/dendropy/datamodel/basemodel.py", line 156, in _get_from
    return cls.get_from_path(src=src, schema=schema, **kwargs)
  File "/home/josephwb/.local/lib/python3.6/site-packages/dendropy/datamodel/basemodel.py", line 216, in get_from_path
    with open(src, "r", newline=None) as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'tests/output/different_level/updt_aln.fasta'

Any idea what is going on? The file tests/output/different_level/updt_aln.fasta does indeed exist.

Ah, I first thought this would be trickier. Sorry, while cleaning up files I made a very stupid error: the first round saves its output to tests/output/test_different_level, while the second round looks for its input in tests/output/different_level.
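
For anyone hitting a similar mismatch between rounds, a check like the following can surface it before the traceback from dendropy. This is only a minimal sketch and not part of PhylUp: `require_input` is a hypothetical helper, and the only names taken from this thread are the two directories and `updt_aln.fasta`.

```python
from pathlib import Path


def require_input(path, produced_by=None):
    """Return `path` as a Path if the file exists, else raise a descriptive error.

    Hypothetical helper, not part of PhylUp. `produced_by` is the working
    directory an earlier round wrote to; it is only used for the error message.
    """
    p = Path(path)
    if p.is_file():
        return p
    hint = ""
    if produced_by is not None:
        out_dir = Path(produced_by)
        # List what the earlier round actually wrote, so a directory
        # mismatch between rounds is obvious at a glance.
        found = sorted(f.name for f in out_dir.iterdir()) if out_dir.is_dir() else []
        hint = f" The earlier round wrote to '{out_dir}' (contents: {found})."
    raise FileNotFoundError(f"Expected input file '{p}' is missing.{hint}")
```

Usage in a two-round script could look like this, assuming the second round takes the updated alignment from the first round's working directory:

```python
first_workdir = Path("tests/output/test_different_level")
# Derive the second-round input from the same variable instead of retyping the path.
seqaln = require_input(first_workdir / "updt_aln.fasta", produced_by=first_workdir)
```

Deriving the second-round path from the first round's working directory variable, rather than typing it twice, avoids this kind of drift in the first place.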

I updated example_different_levels.py to use the correct working directories. Could you check? When I run the file, I get 162 sequences, which first need to be blasted before I can confirm the fix.

Great, thanks.