Competing Java VM instance in jnius between patapsco/pyserini and pyTerrier
Both pyserini and pyTerrier use jnius to integrate with their Java code. But jnius does not allow spawning multiple VMs.
So after running patapsco once, initializing pyTerrier in the same process fails:
import pyterrier as pt
if not pt.started():
    pt.init(tqdm='notebook')
We get:
ValueError: VM is already running, can't set classpath/options; VM started at File "/Users/eyang/miniconda3/envs/patapsco/lib/python3.8/runpy.py", line 194, in _run_module_as_main
And vice versa. I'm not that familiar with jnius, but is there a way to reuse the running VM instead of starting a new one, so that both can run?
There is probably a way to hack this. It will require adding the path to the jar file to jnius. I'll take a look.
Can you try the branch 23-set-classpath-early?
You'll need to import pyterrier and call init() before calling run() on Runner. Once Patapsco runs once, the VM is loaded and the pyTerrier jar cannot be added.
Now it complains that the VM is already running as soon as I import patapsco.
It looks like this is caused by the PSQ patch:
import pyterrier as pt
if not pt.started():
    pt.init(tqdm='notebook')
import copy
import random
from pathlib import Path
import pandas as pd
import patapsco
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/var/folders/f7/741f9g1x03dfyjzqfpgr_w800000gp/T/ipykernel_89121/3537804280.py in <module>
5 import pandas as pd
6
----> 7 import patapsco
~/Documents/Repositories/patapsco/patapsco/__init__.py in <module>
10
11 # TODO remove
---> 12 from .psq_setup import configure_classpath_psq
~/Documents/Repositories/patapsco/patapsco/psq_setup.py in <module>
33 pyserini.setup.configure_classpath = skip_setting_classpath
34
---> 35 configure_classpath_psq()
~/Documents/Repositories/patapsco/patapsco/psq_setup.py in configure_classpath_psq()
17
18 latest = max(paths, key=os.path.getctime)
---> 19 jnius_config.add_classpath(latest)
20 psq_path = (Path(__file__).parent / 'resources' / 'jars').glob('psq*.jar')
21 if not psq_path:
~/miniconda3/envs/patapsco/lib/python3.8/site-packages/jnius_config.py in add_classpath(*path)
55 Replaces any existing classpath, overriding the CLASSPATH environment variable.
56 """
---> 57 check_vm_running()
58 global classpath
59 if classpath is None:
~/miniconda3/envs/patapsco/lib/python3.8/site-packages/jnius_config.py in check_vm_running()
18 """Raises a ValueError if the VM is already running."""
19 if vm_running:
---> 20 raise ValueError("VM is already running, can't set classpath/options; VM started at" + vm_started_at)
21
22
ValueError: VM is already running, can't set classpath/options; VM started at File "/Users/eyang/miniconda3/envs/patapsco/lib/python3.8/runpy.py", line 194, in _run_module_as_main
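The traceback shows the mechanism: add_classpath() calls check_vm_running(), which raises once the VM is up. Below is a toy model of that guard to illustrate why the order matters. ToyJvmConfig, eager_library and the jar names are hypothetical stand-ins, not the real jnius_config code.

```python
# Toy model of the guard seen in the traceback. This is NOT the real
# jnius_config module; every name here is hypothetical.
class ToyJvmConfig:
    def __init__(self):
        self.classpath = []
        self.vm_running = False

    def add_classpath(self, *paths):
        # Mirrors the check in the traceback: no classpath changes after start.
        if self.vm_running:
            raise ValueError("VM is already running, can't set classpath/options")
        self.classpath.extend(paths)

    def start_vm(self):
        # The classpath is frozen at the moment the VM starts.
        self.vm_running = True
        return tuple(self.classpath)


def eager_library(config, jar):
    """A library that registers its jar and starts the VM right away."""
    config.add_classpath(jar)
    config.start_vm()


# Case 1: each library starts the VM on its own, so the second one fails.
config = ToyJvmConfig()
eager_library(config, "a.jar")
try:
    eager_library(config, "b.jar")
    outcome = "ok"
except ValueError as err:
    outcome = f"failed: {err}"

# Case 2: both jars are registered before the single VM start, so both load.
shared = ToyJvmConfig()
shared.add_classpath("a.jar")
shared.add_classpath("b.jar")
frozen_classpath = shared.start_vm()

print(outcome)
print(frozen_classpath)
```

In this model the only order that works is the one where every jar is on the classpath before the first VM start.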
Import patapsco first and then call init() on pyterrier.
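A minimal sketch of that order, assuming the 23-set-classpath-early branch of patapsco is installed alongside pyterrier. The wrapper function init_in_working_order is hypothetical; it skips the calls when either package is missing so the snippet can run anywhere.

```python
import importlib.util


def init_in_working_order():
    """Import patapsco first, then initialize pyTerrier (hypothetical helper)."""
    missing = [name for name in ("patapsco", "pyterrier")
               if importlib.util.find_spec(name) is None]
    if missing:
        return "skipped, missing: " + ", ".join(missing)

    # Importing patapsco first lets it set the classpath before any VM exists.
    import patapsco  # noqa: F401
    import pyterrier as pt

    # Same calls as in the report above; this is what starts the VM.
    if not pt.started():
        pt.init(tqdm='notebook')
    return "initialized"


status = init_in_working_order()
print(status)
```

After this, run() on Runner should use the VM that is already running rather than try to configure a new one.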
Cool it works :)