Weeks-UNC/shapemapper2

Component "Interleaver" (sample:Modified) failed


To whom it may concern,

I ran the "run_example.sh" test and got the following error message on a Linux cluster:

ERROR: Component "Interleaver" (sample:Modified) failed, giving the following error message:
==========================================================================================
/home/li.li-umw/Programs/shapemapper2/internals/python/pyshapemap/../../bin/interleave_fastq.py:7: DeprecationWarning: 'U' mode is deprecated
f = open(filename, "rU")
Traceback (most recent call last):
File "/home/li.li-umw/Programs/shapemapper2/internals/python/pyshapemap/../../bin/interleave_fastq.py", line 21, in
o.write('\n'.join(r1+r2) + '\n')
BrokenPipeError: [Errno 32] Broken pipe

Could you help me with this? Thank you!

Best,
Li

I also ran into this problem.
Before this error appeared in the log, the "QualityTrimmer" component generated an empty "*_output.fastq.gz" file.

Hello,

I was able to recreate this specific error locally. The cause was that there was not enough virtual memory available to initialize the Java VM, due to a stringent virtual memory limit on the head node of our Linux cluster.

When the run_example.sh script was submitted as a job to a node without a virtual memory limit, it ran without issue.

To confirm that the error arises from insufficient memory, you can append --serial to the final line of run_example.sh, i.e.:
"--denatured --folder example_data/TPPdenat"
becomes
"--denatured --folder example_data/TPPdenat --serial"

Then run this updated .sh file:
./run_example.sh

Then run
cat shapemapper_temp/example-results/Modified/Merger/*.fastq

If you see the phrase "Could not reserve enough space for ... object heap", that is an indication that there isn't enough virtual memory for shapemapper to run properly.
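
The same check can be scripted. Below is a minimal sketch, assuming the files sit at the path used in the cat command above; the helper names mentions_heap_failure and find_heap_failures are hypothetical and not part of shapemapper. Because the middle of the JVM message is elided above, the pattern accepts any text between its two fixed parts.

```python
import glob
import re

# The JVM message quoted above has an elided middle ("..."), so match any
# text between its two fixed parts instead of guessing the exact wording.
HEAP_FAILURE = re.compile(r"Could not reserve enough space for .*object heap")


def mentions_heap_failure(text):
    """Return True if the text contains the JVM heap reservation failure."""
    return HEAP_FAILURE.search(text) is not None


def find_heap_failures(pattern):
    """Return the sorted paths matching the glob pattern that contain the message."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a file holds stray bytes.
        with open(path, errors="replace") as handle:
            if mentions_heap_failure(handle.read()):
                hits.append(path)
    return hits


if __name__ == "__main__":
    # Same path as the cat command above.
    for hit in find_heap_failures("shapemapper_temp/example-results/Modified/Merger/*.fastq"):
        print(hit)
```

Any path it prints contains the message; no output means the phrase was not found in those files.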

This can be further investigated by running:
ulimit -a | grep "virtual memory"

If the row ends in a number instead of 'unlimited', a virtual memory limit is in effect.
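
The same limit can be read from Python, which may be handy inside a job script. This is a minimal sketch for Linux, assuming that the value reported by ulimit -v corresponds to RLIMIT_AS (the address-space limit); describe_limit is a hypothetical helper name, not part of shapemapper.

```python
import resource


def describe_limit(limit_bytes):
    """Format an RLIMIT_AS value the way `ulimit -v` reports it (KiB or 'unlimited')."""
    if limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit_bytes // 1024} KiB"


if __name__ == "__main__":
    # RLIMIT_AS is the address-space (virtual memory) limit, in bytes.
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    print("soft limit:", describe_limit(soft))
    print("hard limit:", describe_limit(hard))
```

The soft limit is the one that is enforced; the hard limit is the ceiling up to which an unprivileged user can raise it.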

If you confirm that a virtual memory limit is preventing shapemapper from running, you can either remove the limit (ulimit -v unlimited) or submit the script as a job to a node without a virtual memory restriction (if running on a cluster).

That said, if this turns out to be a separate issue unrelated to virtual memory, feel free to follow up in this thread.

Thank you,
Lucas

Hi Lucaskearns,

Thanks for your response.
I was not running Shapemapper2 in a VM, but rather on a server with ≥ 200GB of memory. I did not limit the memory usage.

Additionally, I ran Shapemapper2 directly on my samples, and the error was not tied to specific samples; it occurred only intermittently.

Hello again,

I see. Thanks for your clarification.

I apologize for not being clear when I drafted my initial response.

By Java VM, I am referring to the Java virtual machine:
https://en.wikipedia.org/wiki/Java_virtual_machine

This is not a VM you run shapemapper within, but rather something the Java programming language uses to run its bytecode.

Essentially, one of the dependencies shapemapper uses is written in Java. When this dependency runs, Java initializes the JVM regardless of whether you are using a VM. This JVM reserves a ton of virtual memory.

Virtual memory is distinct from the traditional notion of RAM / memory:
https://en.wikipedia.org/wiki/Virtual_memory

Historically, when I have observed this error message, it has been due to a restriction on virtual memory that prevented the JVM from being created, and therefore prevented the Java-based component of shapemapper from running.
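
As an illustration only: this thread does not confirm why a JVM start-up failure would surface as a BrokenPipeError in the Interleaver. One plausible mechanism, stated here as an assumption rather than a verified diagnosis of shapemapper, is that a process reading from a pipe exits early, so the next write into that pipe fails. The sketch below reproduces that general behavior with a hypothetical stand-in child process; write_to_dead_reader is an invented name, and nothing here calls shapemapper code.

```python
import errno
import subprocess
import sys


def write_to_dead_reader():
    """Write FASTQ-like records to a child that exits without reading them.

    Returns the errno of the resulting BrokenPipeError, or None if no error
    was raised.
    """
    # Hypothetical stand-in for a downstream tool that fails during start-up.
    child = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.exit(1)"],
        stdin=subprocess.PIPE,
    )
    child.wait()  # once the child is gone, nothing is reading the pipe
    record = b"@read\nACGT\n+\nIIII\n"
    try:
        for _ in range(100):
            child.stdin.write(record * 1000)
            child.stdin.flush()
    except BrokenPipeError as error:
        return error.errno
    finally:
        try:
            child.stdin.close()
        except BrokenPipeError:
            pass  # closing may flush leftover data, which fails the same way
    return None


if __name__ == "__main__":
    print(write_to_dead_reader() == errno.EPIPE)
```

On Linux, errno.EPIPE is 32, matching the "[Errno 32] Broken pipe" in the traceback. In this sketch the writer is only where the error shows up; the process that actually failed is the reader.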

That being said, it is possible this is a separate issue that just so happens to be producing the same error message.

However, the fact that this error does not occur with specific samples but only occasionally further indicates to me that this is likely a server-architecture-related issue. Shapemapper should run approximately the same way each time, as it is not a stochastic algorithm. Consequently, for it to alternately fail and succeed on the same samples makes me think that fluctuations in the computing environment (your server) are the root of the error. I would recommend speaking with your sys admin when one of these jobs fails and seeing whether they can tell you if anything went wrong from a resource / server perspective.

All that being said, if I determine anything more about the nature of this bug, I will be sure to update this thread.

Best,
Lucas

Lucas, thank you for looking into this. I found that if I used the latest version of SHAPEMapper (v2.2.0) and requested 16G of memory on my node, the problem disappeared. I can now run SHAPEMapper2 smoothly.

Best,
Li