mikolmogorov/Flye

Flye 2.9.4 fails if running inside WDL due to long Cromwell path

asteindorff opened this issue · 2 comments

Hi,

We have been running Flye v2.9 in our workflows with no issues, but v2.9.4 fails when run inside a WDL workflow, apparently because Cromwell creates long paths (it works fine when run standalone). It seems that something changed in the multiprocessing module. Do you suggest any workaround for this issue?

[2024-06-28 18:22:15] INFO: Starting Flye 2.9.4-b1799
[2024-06-28 18:22:15] INFO: >>>STAGE: configure
[2024-06-28 18:22:15] INFO: Configuring run
[2024-06-28 18:22:15] INFO: Total read length: 57824045
[2024-06-28 18:22:15] INFO: Input genome size: 100000
[2024-06-28 18:22:15] INFO: Estimated coverage: 578
[2024-06-28 18:22:15] INFO: Reads N50/N90: 7019 / 4503
[2024-06-28 18:22:15] INFO: Minimum overlap set to 5000
[2024-06-28 18:22:15] INFO: Using longest 100x reads for contig assembly
[2024-06-28 18:22:15] INFO: >>>STAGE: assembly
[2024-06-28 18:22:15] INFO: Assembling disjointigs
[2024-06-28 18:22:15] INFO: Reading sequences
[2024-06-28 18:22:15] INFO: Building minimizer index
[2024-06-28 18:22:15] INFO: Pre-calculating index storage
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-06-28 18:22:15] INFO: Filling index
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-06-28 18:22:20] INFO: Extending reads
[2024-06-28 18:24:06] INFO: Overlap-based coverage: 183
[2024-06-28 18:24:06] INFO: Median overlap divergence: 0.000220289
0% 70% 100%
[2024-06-28 18:24:07] INFO: Assembled 1 disjointigs
[2024-06-28 18:24:07] INFO: Generating sequence
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2024-06-28 18:24:07] INFO: Filtering contained disjointigs
0% 100%
[2024-06-28 18:24:07] INFO: Contained seqs: 0
[2024-06-28 18:24:07] INFO: >>>STAGE: consensus
[2024-06-28 18:24:07] INFO: Running Minimap2
[2024-06-28 18:24:08] INFO: Computing consensus
Process SyncManager-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/multiprocessing/managers.py", line 591, in _run_server
    server = cls._Server(registry, address, authkey, serializer)
  File "/usr/lib/python3.10/multiprocessing/managers.py", line 156, in __init__
    self.listener = Listener(address=address, backlog=16)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 448, in __init__
    self._listener = SocketListener(address, family, backlog)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 591, in __init__
    self._socket.bind(address)
OSError: AF_UNIX path too long
Traceback (most recent call last):
  File "/usr/local/bin/flye", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/flye/main.py", line 758, in main
    _run(args)
  File "/usr/local/lib/python3.10/dist-packages/flye/main.py", line 495, in _run
    jobs[i].run()
  File "/usr/local/lib/python3.10/dist-packages/flye/main.py", line 284, in run
    consensus_fasta = cons.get_consensus(out_alignment, self.in_contigs,
  File "/usr/local/lib/python3.10/dist-packages/flye/polishing/consensus.py", line 71, in get_consensus
    mp_manager = multiprocessing.Manager()
  File "/usr/lib/python3.10/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/usr/lib/python3.10/multiprocessing/managers.py", line 566, in start
    self._address = reader.recv()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
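
A note on the error above: `multiprocessing.Manager()` listens on a Unix domain socket that Python creates under the directory returned by `tempfile.gettempdir()`, which honours the `TMPDIR` environment variable. Unix socket paths have a small fixed length limit, so a deeply nested temp directory can trigger `OSError: AF_UNIX path too long`. The sketch below is only a diagnostic aid, not a fix confirmed in this thread. It assumes the Linux limit of 108 bytes (macOS and the BSDs use 104), and the helper name `socket_path_fits` and the sample socket name are hypothetical, chosen to roughly match the `pymp-…/listener-…` names Python generates.

```python
import os
import tempfile

# Assumption: sun_path is 108 bytes on Linux, including the trailing NUL.
SUN_PATH_MAX = 108

# Hypothetical stand-in for the names multiprocessing generates:
# a "pymp-XXXXXXXX" directory containing a "listener-XXXXXXXX" socket.
SAMPLE_SOCKET_NAME = os.path.join("pymp-abcdefgh", "listener-abcdefgh")


def socket_path_fits(tmp_dir, limit=SUN_PATH_MAX):
    """Return True if a manager socket created under tmp_dir would fit."""
    candidate = os.path.join(tmp_dir, SAMPLE_SOCKET_NAME)
    # The encoded path plus its NUL terminator must fit within the limit.
    return len(os.fsencode(candidate)) < limit


# Check the temp directory this process would actually use.
current_tmp = tempfile.gettempdir()
print(current_tmp, "->", "ok" if socket_path_fits(current_tmp) else "too long")
```

If the check reports `too long` inside the task, pointing `TMPDIR` at a short directory (for example `export TMPDIR=/tmp` in the WDL command block before calling `flye`) should shorten the socket path. Whether that is acceptable depends on how the backend handles temporary files, so treat it as an assumption to verify.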

Thanks! That solved the problem.