awohns/unified_genealogy_paper

issue with convert.py

Hello, I am trying to use your pipeline. Everything went well until the convert.py step, which failed with the following error:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "convert.py", line 155, in make_sampledata
    with tsinfer.SampleData(
  File "/home/admin1/anaconda3/envs/ugenea/lib/python3.8/site-packages/tsinfer/formats.py", line 939, in __init__
    super().__init__(**kwargs)
  File "/home/admin1/anaconda3/envs/ugenea/lib/python3.8/site-packages/tsinfer/formats.py", line 394, in __init__
    store = self._new_lmdb_store(max_file_size)
  File "/home/admin1/anaconda3/envs/ugenea/lib/python3.8/site-packages/tsinfer/formats.py", line 470, in _new_lmdb_store
    return zarr.LMDBStore(self.path, subdir=False, map_size=map_size)
  File "/home/admin1/anaconda3/envs/ugenea/lib/python3.8/site-packages/zarr/storage.py", line 2194, in __init__
    self.db = lmdb.open(path, **kwargs)
lmdb.DiskError: /mnt/d/unified_genealogy_paper/all-data/tgp_chr22.samples1: No space left on device

I was running this on Windows 10 under WSL2 (Ubuntu 20.04), on a machine with 32 GB of memory.

Could you please suggest how to fix this issue?

Thank you very much.

After some debugging, I figured out that the issue lies in storage.py of the zarr package. For some reason, the **kwargs passed to lmdb.open do not work with their default values. Things worked once I manually changed the call to lmdb.open(path, map_size = 10995116278, subdir = False, lock = True).
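For anyone hitting the same error, here is a minimal sketch of one way to pick a map_size without hard-coding a number. It assumes the DiskError comes from the requested map size being larger than what the target device can provide, which this thread does not confirm. The helper name choose_map_size and its headroom parameter are made up for illustration and are not part of zarr, lmdb or tsinfer.

```python
import shutil
from pathlib import Path


def choose_map_size(target, requested, headroom=0.9):
    """Return a map size in bytes that fits on the device holding `target`.

    Hypothetical helper for illustration only: it caps `requested` at a
    fraction (`headroom`) of the free space reported for the device.
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")

    # The data file may not exist yet, so walk up to the nearest
    # existing directory before asking for disk usage.
    probe = Path(target).resolve()
    while not probe.exists():
        probe = probe.parent

    free_bytes = shutil.disk_usage(probe).free
    return min(requested, int(free_bytes * headroom))


if __name__ == "__main__":
    # Example path taken from the error message in this issue.
    size = choose_map_size(
        "/mnt/d/unified_genealogy_paper/all-data/tgp_chr22.samples1",
        requested=2**40,
    )
    print(f"map_size to try: {size} bytes")
```

The result could then be used as the map_size argument in the lmdb.open call quoted above. The traceback also shows tsinfer passing max_file_size into _new_lmdb_store, so it may be possible to set the size from the tsinfer.SampleData call in convert.py instead of editing zarr. I have not checked that against the tsinfer version used here.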

Hi, sorry for the slow response here. Well done for figuring it out. Have you reported this upstream to zarr?