Ninjaclasher/dmoj-docker

Connecting to the judge on a second server

Ali-Toosi opened this issue · 4 comments

Hi!
This is an excellent wrapper for the site and I'm surprised that they haven't yet asked you to have this as their official docker image.
I am having an issue, though, when running the judge and the site on two different servers. They connect successfully, but immediately after connecting, the judge raises this error:

ERROR 2021-05-30 08:36:21,958 53 monitor Failed to start problem monitor.
Traceback (most recent call last):
  File "/judge/dmoj/monitor.py", line 97, in start
    self._monitor.start()
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/api.py", line 256, in start
    emitter.start()
  File "/usr/local/lib/python3.9/dist-packages/watchdog/utils/__init__.py", line 93, in start
    self.on_thread_start()
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/inotify.py", line 118, in on_thread_start
    self._inotify = InotifyBuffer(path, self.watch.is_recursive)
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
    self._inotify = Inotify(path, recursive)
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/inotify_c.py", line 169, in __init__
    self._add_watch(path, event_mask)
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/inotify_c.py", line 385, in _add_watch
    Inotify._raise_error()
  File "/usr/local/lib/python3.9/dist-packages/watchdog/observers/inotify_c.py", line 405, in _raise_error
    raise OSError(err, os.strerror(err))
FileNotFoundError: [Errno 2] No such file or directory
Warning: failed to start problem monitor!

I am not quite sure what is causing this, but my guess was that some directories in docker-compose are not bound correctly. My first attempt was binding ./problems to /mnt/problems instead of /problems on the bridged service in compose, but that didn't help.

What I can figure from the error message is that it's trying to look up /mnt/problems on the site server to update them on the judge server whenever they are changed through the UI. So it has a right to complain about it not being available, because we have dockerized that part and none of the compose services use that directory. But I'm not sure which one I should change to fix this. Would you be able to help with this?

Thank you : )

Hello, thanks for taking interest in this project! The reason this isn't part of the official docker images is that the main DMOJ site is not dockerized, so the docker images won't be actively maintained.

As for your error, it looks like purely a judge issue. The dockerized site shouldn't have anything to do with it. If you're using the official judge images, make sure you're bind-mounting the correct directories from the judge server into the container (the site is completely separate). If you want the judge and the site to share one problems directory, you'll need to use an external service such as NFS.
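For anyone hitting the same traceback: the `FileNotFoundError` is raised while the inotify watch is being added, which suggests that the directory being watched does not exist inside the judge container. A minimal sketch for checking this from inside the container follows. The helper name `missing_problem_dirs` is hypothetical, and `/problems` is only the path mentioned in this thread; substitute whatever directories your own judge configuration lists.

```python
import os


def missing_problem_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


if __name__ == "__main__":
    # Assumption: /problems is the problem directory named in the judge config.
    # Replace it with the paths from your own configuration.
    for path in missing_problem_dirs(["/problems"]):
        print(f"not found inside the container: {path}")
```

If a path is reported here, the bind mount on the judge server is the first thing to check.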

If you could post your judge YAML configuration and the command you're using to start up the judge container, that would help with debugging, but right now, I'm 99% sure the issue is that your bind mount on the judge server is wrong.

As I was explaining the configurations and pasting them here one by one, I found the issue... rubber ducked 😅 But now that I have you here, have you previously run them on two different servers? What would you say is the easiest way to handle the need for two copies of the problems on the two servers? I did see your mention of NFS, but since I haven't used it before, I wanted to hear your thoughts in more detail before proceeding.

Yes, I've previously run a judge and the site on different servers. Usually, I'll go for something like WireGuard and NFS over the internal network. One thing you'll want to be careful about is setting up manual pinging of the judges to update the problem list, since an NFS mount doesn't support inotify (see this script). Alternatively, you could opt for something like Syncthing, in which case you wouldn't need the above script.
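Since inotify events are not delivered for an NFS mount, one way to detect changes is to poll the directory and compare listings. The sketch below is not the script linked above; it only illustrates the polling idea using the standard library. The names `snapshot`, `diff_snapshots`, and `poll`, and the `on_change` callback (which would be where the judges get notified), are all hypothetical.

```python
import os
import time


def snapshot(root):
    """Return the set of immediate subdirectory names under `root`."""
    with os.scandir(root) as entries:
        return {entry.name for entry in entries if entry.is_dir()}


def diff_snapshots(old, new):
    """Return (added, removed) directory names as sorted lists."""
    return sorted(new - old), sorted(old - new)


def poll(root, on_change, interval=30.0):
    """Call on_change(added, removed) whenever the listing of `root` changes.

    Assumption: `on_change` is a placeholder for whatever tells the judges
    to refresh their problem list. Runs until interrupted.
    """
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        added, removed = diff_snapshots(previous, current)
        if added or removed:
            on_change(added, removed)
        previous = current
```

This only notices problems being added or removed, not edits inside an existing problem directory; comparing modification times as well would be needed for that.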

Forgot to close this! Thank you for your help. I haven't yet got it running but it's not an issue of this repo anymore. I'll close.