ParkinsonLab/MetaPro

ga_merge_fasta.py not on docker image

Closed this issue · 6 comments

MetaPro.py dies with
/mydata/testout/GA_BWA/data/jobs/contigs_0_chocophlan_h3_chunk_bwa_pp not found. kill the pipe. restart this stage
This appears to be due to a missing script:

2021-11-04 19:20:53.449761 GA BWA merge leftover reads contigs_0
python3: can't open file '/pipeline/Scripts/ga_merge_fasta.py': [Errno 2] No such file or directory
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/pipeline/MetaPro_commands.py", line 118, in create_and_launch
    sp.check_output(["sh", shell_script_full_path])#, stderr = sp.STDOUT)
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['sh', '/mydata/testout/GA_BWA/BWA_pp_contigs_0_chocophlan_h3_chunk.sh']' returned non-zero exit status 2.

And indeed:

>ls /pipeline/Scripts/ga*
/pipeline/Scripts/ga_BLAT_generic_v2.py
/pipeline/Scripts/ga_BWA_generic_v2.py
/pipeline/Scripts/ga_Diamond_generic_v2.py
/pipeline/Scripts/ga_Final_merge_v4.py
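
A missing helper script like this could be caught before any stage is launched. The following is a minimal sketch, not part of MetaPro: `find_missing_scripts` is a hypothetical helper, and the list of required names is illustrative, taken only from the log and listing above.

```python
import os


def find_missing_scripts(script_dir, required_names):
    """Return the required script names that are not regular files in script_dir."""
    return [
        name
        for name in required_names
        if not os.path.isfile(os.path.join(script_dir, name))
    ]


if __name__ == "__main__":
    # Illustrative list only; the real pipeline needs more scripts than these.
    required = ["ga_BWA_generic_v2.py", "ga_merge_fasta.py"]
    missing = find_missing_scripts("/pipeline/Scripts", required)
    if missing:
        print("missing scripts:", ", ".join(missing))
```

Run inside the container, a check along these lines would report `ga_merge_fasta.py` up front instead of failing partway through the GA_BWA stage.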

That's a goof on my end.
Thanks! Please pull the latest develop image.

I did, but now I get a new error:

Traceback (most recent call last):
  File "/pipeline/MetaPro.py", line 2731, in <module>
    main(config_file, pair_1, pair_2, single, contig, output_folder, num_threads, args_pack, tutorial_mode)
  File "/pipeline/MetaPro.py", line 802, in main
    command_list = commands.create_BWA_pp_command_v2(GA_BWA_label, assemble_contigs_label, ref_tag, ref_path, full_sample_path, marker_file)
  File "/pipeline/MetaPro_commands.py", line 2181, in create_BWA_pp_command_v2
    if("chunk" in ref_file):
NameError: name 'ref_file' is not defined

Which is true, ref_file is not passed to this function:

    def create_BWA_pp_command_v2(self, stage_name, dependency_stage_name, ref_tag, ref_path, query_file, marker_file):
        sample_root_name = os.path.basename(query_file)
        sample_root_name = os.path.splitext(sample_root_name)[0]


        #meant to be called on the split-file version.  PP script will not merge gene maps.
        subfolder       = os.path.join(self.Output_Path, stage_name)
        data_folder     = os.path.join(subfolder, "data")
        bwa_folder      = os.path.join(data_folder, "1_bwa")
        split_folder    = os.path.join(data_folder, "0_read_split")
        pp_folder       = os.path.join(data_folder, "2_bwa_pp")
        final_folder    = os.path.join(subfolder, "final_results")
        dep_loc         = os.path.join(self.Output_Path, dependency_stage_name, "final_results")
        jobs_folder     = os.path.join(data_folder, "jobs")

        self.make_folder(subfolder)
        self.make_folder(data_folder)
        self.make_folder(bwa_folder)
        self.make_folder(final_folder)
        self.make_folder(jobs_folder)
        self.make_folder(pp_folder)

        reads_in    = query_file
        bwa_in      = os.path.join(bwa_folder, sample_root_name + "_" + ref_tag + ".sam")
        reads_out = ""
        if("chunk" in ref_file):

Whoops, used the wrong variable name.
Please try now. (It was supposed to have been ref_path, to sense whether the version of chocophlan is the smaller, whole version or the new, bigger, updated version that had to be chunked in order to fit on our hardware.)
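
The intended check can be sketched as follows. This is an illustration under stated assumptions, not the actual MetaPro code: `is_chunked_reference` is a hypothetical helper, the example paths are made up, and the only thing taken from the thread is that the word "chunk" in `ref_path` marks the chunked chocophlan.

```python
def is_chunked_reference(ref_path):
    """Return True if ref_path looks like the chunked reference.

    Assumption: the chunked chocophlan carries "chunk" somewhere in its path,
    which is the test the pipeline applies once ref_file is replaced by ref_path.
    """
    return "chunk" in ref_path


if __name__ == "__main__":
    # Both paths are hypothetical examples.
    for path in ["/db/chocophlan_h3_chunk.fasta", "/db/chocophlan.fasta"]:
        print(path, is_chunked_reference(path))
```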

Same problem. The docker image is an hour old, so I did get the new one.
Can you do an actual test run yourself? Saves me some debugging.
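
Short of a full test run, this class of error (a name that is never defined) can often be spotted without executing the function. Below is a rough sketch using the standard `dis` module. `unresolved_globals` and `create_command` are hypothetical names for illustration, and the check is deliberately simple: it looks only at the function's own bytecode, not at nested functions, and it would wrongly flag globals that are defined later at runtime.

```python
import builtins
import dis


def unresolved_globals(func):
    """Return names func loads as globals that are neither module globals nor builtins."""
    known = func.__globals__
    missing = []
    for instr in dis.get_instructions(func):
        if instr.opname in ("LOAD_GLOBAL", "LOAD_NAME"):
            name = instr.argval
            if name not in known and not hasattr(builtins, name) and name not in missing:
                missing.append(name)
    return missing


def create_command(ref_tag, ref_path):
    # Hypothetical stand-in for the real method, with the same mistake as in
    # the traceback: ref_file is used but never defined.
    if "chunk" in ref_file:
        return ref_tag + "_chunked"
    return ref_tag


if __name__ == "__main__":
    print(unresolved_globals(create_command))
```

A linter such as pyflakes reports the same kind of undefined name and is the more thorough option if it is available in the build environment.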

Should be OK now. It was one location with the wrong variable, in the code that allows for backward compatibility with chocophlan.

patched ages ago.