Cache behaviour
thfrkielikone opened this issue · 3 comments
This is not much of a breaking bug, but I'd like to ask whether this is intended behaviour. If I have opusfilter installed as root and then run it as a non-root user, I get the following behaviour:
WARNING:opusfilter.opusfilter:Output directory not specified. Writing files to current directory.
INFO:opusfilter.opusfilter:Running step 1: opus_read
The following files are available for downloading:
Traceback (most recent call last):
File "/usr/local/bin/opusfilter", line 31, in <module>
of.execute_steps(overwrite=args.overwrite, last=args.last)
File "/usr/local/lib/python3.12/site-packages/opusfilter/opusfilter.py", line 224, in execute_steps
self._run_step(step, num + 1, overwrite)
File "/usr/local/lib/python3.12/site-packages/opusfilter/opusfilter.py", line 289, in _run_step
self.step_functions[step['type']](parameters, overwrite=overwrite)
File "/usr/local/lib/python3.12/site-packages/opusfilter/opusfilter.py", line 327, in read_from_opus
opus_reader = OpusRead(
^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/opustools/opus_read.py", line 196, in __init__
moses_names = self.of_handler.open_moses_files()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/opustools/opus_file_handler.py", line 45, in open_moses_files
self.download_files()
File "/usr/local/lib/python3.12/site-packages/opustools/opus_file_handler.py", line 33, in download_files
og = OpusGet(**arguments)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/opustools/opus_get.py", line 40, in __init__
with open(DB_FILE, 'wb') as outfile:
^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.12/site-packages/opustools/opusdata.db'
I can provide the .yml, but I don't think it is relevant here. My gripe is that it tries to create the opusdata.db file in a system location. Would something like a dot-directory (~/.local, ~/.cache, ...) be better here? Would that break some use case?
I am running this in a docker+singularity setup where, because of the container nature of the thing, installing as root makes sense (but running as root doesn't, because the singularity env doesn't allow it). I understand that the equivalent location when using a venv would neatly be inside the venv, and that is what I am going to try next for my own purposes. Still, would it be neater if the db file were stored somewhere else?
This is also apparently an issue in OpusTools. I don't know exactly why that database file is opened in write mode, maybe @miau1 can comment on this? Quickly looking at the code, opus_get seems to have some DB related options, but opus_read (used by OpusFilter) does not.
If the file needs to be writable, lib doesn't sound like a proper place for it even inside a venv. A customizable location with a sensible default like ~/.OpusTools/opusdata.db would be better.
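To illustrate the suggestion, here is a minimal sketch of resolving a user-writable database path with an optional override. The function name `resolve_db_path` and its parameters are hypothetical and not part of the OpusTools API; only the default `~/.OpusTools/opusdata.db` comes from this thread.

```python
from pathlib import Path


def resolve_db_path(custom_path=None, home=None):
    """Return a user-writable path for opusdata.db.

    Hypothetical helper, not OpusTools code. `custom_path` overrides the
    default; `home` exists only so the default can be redirected in tests.
    """
    if custom_path is not None:
        db_path = Path(custom_path).expanduser()
    else:
        base = Path(home) if home is not None else Path.home()
        db_path = base / ".OpusTools" / "opusdata.db"
    # Create the parent directory so a later open(db_path, "wb") can succeed
    # even when the package itself lives in a read-only system location.
    db_path.parent.mkdir(parents=True, exist_ok=True)
    return db_path
```

With a default like this, a root-installed package never needs write access to site-packages, and a non-root user in a container only needs a writable home directory.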
The database is opened in write mode to uncompress it the first time opus_get is used. But actually the database is needed only when using the --local_db option, so now the file is uncompressed only the first time the --local_db option is used. Additionally, the default location of the db file is now ~/.OpusTools/opusdata.db. Both of these changes are in the latest version of OpusTools, also on PyPI.
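The "uncompress only on first use" behaviour described above can be sketched roughly as follows. This is not the actual OpusTools implementation: the helper name `ensure_local_db` is hypothetical, and gzip is an assumption, since the thread does not say which compression format the bundled database uses.

```python
import gzip
import shutil
from pathlib import Path


def ensure_local_db(compressed_path, db_path):
    """Decompress the bundled database to db_path only if it is missing.

    Hypothetical helper; assumes the bundled file is gzip-compressed.
    """
    db_path = Path(db_path)
    if db_path.exists():
        # Already uncompressed by an earlier call: nothing is written.
        return db_path
    db_path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(compressed_path, "rb") as src, open(db_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return db_path
```

Calling such a helper only from the code path that handles the local database means a run that never touches it performs no writes at all.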
Thank you very much for solving this small gripe.