bug: shell file is not mapped into docker volume while using "command provider"
r00t1900 opened this issue · 4 comments
Env
- tunasync: 0.8.0
- tunasynctl: 0.8.0
- tunasync-scripts: master@7817785
- system: debian 10
- arch: amd64
- docker: 20.10.9
- tunathu/bandersnatch: latest
Description
tunasync cannot run a custom shell script at the configured path. Output of tunasync worker -c worker.conf -v --debug:
[22-01-01 11:06:18][DEBUG][runner.go:53] volume: /tmp/tunasync/pypi:/tmp/tunasync/pypi
[22-01-01 11:06:18][DEBUG][runner.go:127] Command start: [docker run --rm -a STDOUT -a STDERR --name tunasync-job-pypi -w /tmp/tunasync/pypi -u 0:0 -v /tmp/tunasync/log/tunasync/pypi:/tmp/tunasync/log/tunasync/pypi -v /tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log:/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log -v /tmp/tunasync/pypi:/tmp/tunasync/pypi -e TUNASYNC_MIRROR_NAME=pypi -e TUNASYNC_WORKING_DIR=/tmp/tunasync/pypi -e TUNASYNC_UPSTREAM_URL=https://pypi.python.org/ -e TUNASYNC_LOG_DIR=/tmp/tunasync/log/tunasync/pypi -e TUNASYNC_LOG_FILE=/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log tunathu/bandersnatch:latest /home/scripts/pypi.sh]
[22-01-01 11:06:18][DEBUG][cmd_provider.go:145] set isRunning to true: pypi
[22-01-01 11:06:18][DEBUG][base_provider.go:168] calling Wait: pypi
[22-01-01 11:06:18][DEBUG][job.go:169] provider started
[22-01-01 11:06:18][DEBUG][worker.go:469] reporting on manager url: http://localhost:12345/workers/test_worker/schedules
[22-01-01 11:06:18][DEBUG][worker.go:448] reporting on manager url: http://localhost:12345/workers/test_worker/jobs/pypi
[22-01-01 11:06:18][DEBUG][worker.go:469] reporting on manager url: http://localhost:12345/workers/test_worker/schedules
[22-01-01 11:06:18][DEBUG][base_provider.go:165] set isRunning to false: pypi
[22-01-01 11:06:18][DEBUG][job.go:180] syncing done
[22-01-01 11:06:18][WARNIN][job.go:213] failed syncing pypi: exit status 127
[22-01-01 11:06:18][DEBUG][job.go:215] post-fail hooks
/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log:
root@tuna-docker-supported:~# cat /tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_20.log.fail
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/home/scripts/pypi.sh": stat /home/scripts/pypi.sh: no such file or directory: unknown.
time="2022-01-01T11:20:33+08:00" level=error msg="error waiting for container: context canceled"
Analysis
From this debug output, I noticed that the docker command does not map pypi.sh into the container's filesystem, which is probably the cause of the "no such file or directory" error.
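The check described in the analysis can be sketched as a small shell helper. This is only an illustration: is_path_mapped is a hypothetical function, not part of tunasync or docker, and it assumes plain host:container[:mode] volume specs. A path that is not mapped may still exist if it was built into the image.

```shell
# Hypothetical helper (not part of tunasync or docker): succeed if the
# container path $1 is covered by one of the host:container[:mode] volume
# specs given as the remaining arguments.
is_path_mapped() {
    local target=$1
    shift
    local spec dest
    for spec in "$@"; do
        # Field 2 of "host:container[:mode]" is the container-side path.
        dest=$(printf '%s' "$spec" | cut -d: -f2)
        dest=${dest%/}
        case "$target" in
            "$dest"|"$dest"/*) return 0 ;;
        esac
    done
    return 1
}

# Volume specs taken from the failing command in the debug log.
if is_path_mapped /home/scripts/pypi.sh \
    /tmp/tunasync/log/tunasync/pypi:/tmp/tunasync/log/tunasync/pypi \
    /tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log:/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log \
    /tmp/tunasync/pypi:/tmp/tunasync/pypi; then
    echo "mapped"
else
    echo "not mapped"   # this branch runs: no spec covers /home/scripts
fi
```

None of the three mappings generated by tunasync covers /home/scripts, which matches the error reported by docker.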
Solution
I appended -v /home/scripts/pypi.sh:/home/scripts/pypi.sh to the docker command and ran it manually, and it works:
# The first -v argument below is the added volume mapping.
docker run --rm -a STDOUT -a STDERR --name tunasync-job-pypi -w /tmp/tunasync/pypi -u 0:0 \
-v /home/scripts/pypi.sh:/home/scripts/pypi.sh \
-v /tmp/tunasync/log/tunasync/pypi:/tmp/tunasync/log/tunasync/pypi \
-v /tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log:/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log \
-v /tmp/tunasync/pypi:/tmp/tunasync/pypi \
-e TUNASYNC_MIRROR_NAME=pypi \
-e TUNASYNC_WORKING_DIR=/tmp/tunasync/pypi \
-e TUNASYNC_UPSTREAM_URL=https://pypi.python.org/ \
-e TUNASYNC_LOG_DIR=/tmp/tunasync/log/tunasync/pypi \
-e TUNASYNC_LOG_FILE=/tmp/tunasync/log/tunasync/pypi/pypi_2022-01-01_11_06.log \
tunathu/bandersnatch:latest /home/scripts/pypi.sh
command output:
Syncing to /tmp/tunasync/pypi
2022-01-01 04:06:26,421 INFO: Selected storage backend: filesystem (configuration.py:128)
2022-01-01 04:06:26,421 INFO: Selected compare method: stat (configuration.py:174)
2022-01-01 04:06:26,740 INFO: Initialized project plugin allowlist_project, filtering ['tf-nightly-cpu'] (allowlist_name.py:31)
2022-01-01 04:06:26,744 INFO: Initialized project plugin blocklist_project, filtering [] (blocklist_name.py:27)
2022-01-01 04:06:26,800 INFO: Status file /tmp/tunasync/pypi/status missing. Starting over. (mirror.py:601)
2022-01-01 04:06:26,800 INFO: Syncing with https://pypi.python.org/. (mirror.py:56)
2022-01-01 04:06:26,800 INFO: Current mirror serial: 0 (mirror.py:267)
2022-01-01 04:06:26,800 INFO: Syncing all packages. (mirror.py:282)
2022-01-01 04:06:43,845 INFO: Package 'tf-nightly-cpu' is allowlisted (allowlist_name.py:88)
2022-01-01 04:06:43,955 INFO: Trying to reach serial: 12451048 (mirror.py:299)
2022-01-01 04:06:43,955 INFO: 1 packages to sync. (mirror.py:301)
2022-01-01 04:06:43,978 INFO: No metadata filters are enabled. Skipping metadata filtering (mirror.py:75)
2022-01-01 04:06:43,978 INFO: No release filters are enabled. Skipping release filtering (mirror.py:77)
2022-01-01 04:06:43,978 INFO: No release file filters are enabled. Skipping release file filtering (mirror.py:79)
2022-01-01 04:06:43,981 INFO: Fetching metadata for package: tf-nightly-cpu (serial 12447857) (package.py:57)
2022-01-01 04:06:44,648 INFO: Downloading: https://files.pythonhosted.org/packages/46/2a/07af15a0d8ca3f75a53621dab60f92f72d7046c511dbeeee303cb947b187/tf_nightly_cpu-2.7.0.dev20210701-cp36-cp36m-macosx_10_14_x86_64.whl (mirror.py:933)
Further
- Why do we need to add this mapping manually? How does the current mirror site run? Is this a bug or not?
- Can we just change the upstream from https://pypi.org to https://pypi.tuna.tsinghua.edu.cn? We tried this to speed up mirroring, but received this error:
File "/usr/local/lib/python3.9/site-packages/bandersnatch/master.py", line 216, in rpc
return await method()
File "/usr/local/lib/python3.9/site-packages/aiohttp_xmlrpc/client.py", line 121, in __remote_call
return self._parse_response((await response.read()), method_name)
File "/usr/local/lib/python3.9/site-packages/aiohttp_xmlrpc/client.py", line 82, in _parse_response
response = etree.fromstring(body, parser)
File "src/lxml/etree.pyx", line 3252, in lxml.etree.fromstring
File "src/lxml/parser.pxi", line 1912, in lxml.etree._parseMemoryDocument
File "src/lxml/parser.pxi", line 1800, in lxml.etree._parseDoc
File "src/lxml/parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
File "<string>", line 7
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: meta line 5 and head, line 7, column 8
worker.conf:
[global]
name = "test_worker"
log_dir = "/tmp/tunasync/log/tunasync/{{.Name}}"
mirror_dir = "/tmp/tunasync"
concurrent = 10
interval = 1
[docker]
enable = true
[manager]
api_base = "http://localhost:12345"
token = ""
ca_cert = ""
[cgroup]
enable = false
base_path = "/sys/fs/cgroup"
group = "tunasync"
[server]
hostname = "localhost"
listen_addr = "127.0.0.1"
listen_port = 6000
ssl_cert = ""
ssl_key = ""
[[mirrors]]
name = "pypi"
provider = "command"
upstream = "https://pypi.tuna.tsinghua.edu.cn/"
command = "/home/scripts/pypi.sh"
docker_image = "tunathu/bandersnatch:latest"
interval = 5
manager.conf:
debug = false
[server]
addr = "127.0.0.1"
port = 12345
ssl_cert = ""
ssl_key = ""
[files]
db_type = "bolt"
db_file = "/tmp/tunasync/manager.db"
ca_cert = ""
/home/scripts/pypi.sh:
#!/bin/bash
set -e
BANDERSNATCH=${BANDERSNATCH:-"/usr/local/bin/bandersnatch"}
TUNASYNC_UPSTREAM=${TUNASYNC_UPSTREAM_URL:-"https://pypi.tuna.tsinghua.edu.cn/"}
CONF="/tmp/bandersnatch.conf"
INIT=${INIT:-"0"}
if [ ! -d "$TUNASYNC_WORKING_DIR" ]; then
    mkdir -p $TUNASYNC_WORKING_DIR
    INIT="1"
fi
echo "Syncing to $TUNASYNC_WORKING_DIR"
if [[ $INIT == "0" ]]; then
    (
    cat << EOF
[mirror]
directory = ${TUNASYNC_WORKING_DIR}
master = ${TUNASYNC_UPSTREAM}
json = true
timeout = 300
workers = 5
hash-index = false
stop-on-error = false
delete-packages = true
compare-method = stat
[plugins]
enabled =
    blocklist_project
    allowlist_project
[allowlist]
packages =
    tf-nightly-cpu
EOF
    for i in $PYPI_EXCLUDE; do
        echo "    $i"
    done
    ) > $CONF
    exec $BANDERSNATCH -c $CONF mirror
else
    cat > $CONF << EOF
[mirror]
directory = ${TUNASYNC_WORKING_DIR}
master = ${TUNASYNC_UPSTREAM}
json = true
timeout = 15
workers = 10
hash-index = false
stop-on-error = false
delete-packages = false
EOF
    exec $BANDERSNATCH -c $CONF mirror
fi
Thanks for viewing.
Your analysis is correct. This is intended behavior rather than a bug: tunasync cannot know how the mapping should be set up. The script configured in the command field is executed inside the docker container, so it must either be built into the image or mapped in from another location. The mapping can be declared once in the [docker] section, so it does not have to be repeated for each mirror. For example:
[docker]
volumes = [
"/path/to/tunasync-scripts:/home/scripts:ro",
]
[[mirrors]]
name = "foo"
provider = "command"
upstream = "xxxxx"
command = "/home/scripts/foo.sh"
docker_image = "foo_image:latest"
docker_volumes = [
"/path/to/additional_volume1:/path/to/mountpoint:ro",
"/path/to/additional_volume2:/path/to/mountpoint2:ro"
]
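To confirm that a mapping like the one above makes the script visible inside the container, one option is to list the file from a throwaway container. The sketch below is only an illustration: check_script_in_container and the DRY_RUN variable are hypothetical names, the paths and image are the placeholders from the example config, and it assumes the image ships an ls binary.

```shell
# Hypothetical helper: with DRY_RUN=1, print the docker command that lists a
# script inside a container started with the given volume mapping; otherwise
# run it. Assumes the image contains an `ls` binary.
check_script_in_container() {
    local volume=$1 image=$2 script=$3
    # Reuse the function's positional parameters to hold the command line.
    set -- docker run --rm -v "$volume" --entrypoint ls "$image" -l "$script"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder paths and image name from the example config above.
DRY_RUN=1 check_script_in_container \
    /path/to/tunasync-scripts:/home/scripts:ro foo_image:latest /home/scripts/foo.sh
```

Without DRY_RUN=1, a "No such file or directory" from ls would suggest that the mapping or the command path is wrong.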
Bandersnatch relies on the XML-RPC interface provided by the official pypi.org, so it cannot sync the PyPI repository from an alternative source. The XMLSyntaxError above is consistent with that: the mirror presumably answered the RPC request with an HTML page. However, the latest bandersnatch release adds a new config option, download-mirror, which fetches package metadata from the RPC interface on pypi.org and the actual package files from an alternative source.
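As a rough sketch of how download-mirror could be used from a script like pypi.sh, the function below writes a bandersnatch config that keeps pypi.org as master and points download-mirror at an alternative source. write_bandersnatch_conf is a hypothetical name, the remaining options are copied from the script in the question, and using the TUNA mirror URL as the download source is an assumption that should be checked against the bandersnatch documentation for the installed version.

```shell
# Hypothetical helper: write a bandersnatch config that takes metadata from
# pypi.org (master) and package files from an alternative download mirror.
#   $1 = config file to write
#   $2 = mirror working directory
#   $3 = download mirror URL (assumed to serve the same /packages/ layout)
write_bandersnatch_conf() {
    local conf=$1 workdir=$2 download_mirror=$3
    cat > "$conf" << EOF
[mirror]
directory = ${workdir}
master = https://pypi.org
download-mirror = ${download_mirror}
json = true
timeout = 300
workers = 5
hash-index = false
stop-on-error = false
delete-packages = true
compare-method = stat
EOF
}

# Example call (not executed here):
#   write_bandersnatch_conf /tmp/bandersnatch.conf "$TUNASYNC_WORKING_DIR" \
#       https://pypi.tuna.tsinghua.edu.cn/
```

With this layout the upstream in worker.conf would stay on the official index, and only the download source changes.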
Thank you for replying. This really helps a lot, bravo!
Thank you, your reasoning and steps are both right. Problem solved :)