test_ssh_copy failure on riscv64 and ppc64el
nileshpatra opened this issue · 4 comments
Describe the bug
test_ssh_copy fails on the ppc64el and riscv64 architectures.
To Reproduce
Simply run the kitty test suite on those architectures.
Screenshots
Log:
======================================================================
ERROR: test_ssh_copy (kitty_tests.ssh.SSHKitten.test_ssh_copy) (sh='python3')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/<<PKGBUILDDIR>>/kitty/launcher/../../kitty_tests/ssh.py", line 95, in test_ssh_copy
self.check_bootstrap(
conf = 'copy simple-file\ncopy s1\ncopy --symlink-strategy=keep-path s2\ncopy --dest=a/sfa simple-file\ncopy --glob g.*\ncopy --exclude **/w.* --exclude **/r d1\n'
contents = {'.local/share/kitty-ssh-kitten/kitty/version', 'g.2', 's2', 'a/sfa', 'g.1', '.terminfo/kitty.terminfo', 's1', '.local/share/kitty-ssh-kitten/kitty/bin/kitty', 'd1/d2/x', 'd1/y', '.local/share/kitty-ssh-kitten/kitty/bin/kitten', 'simple-file'}
f = <_io.TextIOWrapper name='/tmp/tmp8afyvvrl/s2' mode='r' encoding='utf-8'>
local_home = '/tmp/tmp0lkc_ah_'
remote_home = '/tmp/tmp5zr3e6vj'
self = <kitty_tests.ssh.SSHKitten testMethod=test_ssh_copy>
sh = 'python3'
simple_data = 'rkjlhfwf9whoaa'
tname = '.terminfo'
touch = <function SSHKitten.test_ssh_copy.<locals>.touch at 0x3f822f6480>
w = 's2'
File "/<<PKGBUILDDIR>>/kitty/launcher/../../kitty_tests/ssh.py", line 261, in check_bootstrap
pty.wait_till(check_untar_or_fail, timeout=60)
SHELL_INTEGRATION_VALUE = ''
check_untar_or_fail = <function SSHKitten.check_bootstrap.<locals>.check_untar_or_fail at 0x3f805dbb00>
conf = 'copy simple-file\ncopy s1\ncopy --symlink-strategy=keep-path s2\ncopy --dest=a/sfa simple-file\ncopy --glob g.*\ncopy --exclude **/w.* --exclude **/r d1\n\nshell_integration disabled\ninterpreter python3'
env = {'PATH': '/<<PKGBUILDDIR>>/kitty_tests/kitty/launcher:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games', 'HOME': '/tmp/tmp5zr3e6vj', 'TERM': 'xterm-kitty', 'TERMINFO': '/<<PKGBUILDDIR>>/terminfo', 'KITTY_SHELL_INTEGRATION': 'enabled', 'KITTY_INSTALLATION_DIR': '/<<PKGBUILDDIR>>', 'BASH_SILENCE_DEPRECATION_WARNING': '1', 'PYTHONDONTWRITEBYTECODE': '1', 'WEZTERM_SHELL_SKIP_ALL': '1', 'USER': 'buildd'}
home = '/tmp/tmp0lkc_ah_'
home_dir = '/tmp/tmp5zr3e6vj'
launcher = 'sh'
login_shell = ''
pre_data = ''
pty = <kitty_tests.PTY object at 0x3f820c0290>
self = <kitty_tests.ssh.SSHKitten testMethod=test_ssh_copy>
sh = 'python3'
test_script = 'print("UNTAR_DONE", flush=True); os.execlp("sh", "sh", "-c", \'env; exit 0\')'
File "/<<PKGBUILDDIR>>/kitty/launcher/../../kitty_tests/__init__.py", line 368, in wait_till
raise TimeoutError(f'The condition was not met. Screen contents: \n {repr(self.screen_contents())}')
end_time = 614877.703239922
q = <function SSHKitten.check_bootstrap.<locals>.check_untar_or_fail at 0x3f805dbb00>
self = <kitty_tests.PTY object at 0x3f820c0290>
timeout = 60
TimeoutError: The condition was not met. Screen contents:
''
----------------------------------------------------------------------
Ran 144 tests in 155.519s
Environment details
Observed on Debian GNU/Linux. Build log: https://buildd.debian.org/status/fetch.php?pkg=kitty&arch=riscv64&ver=0.34.1-1&stamp=1714553712&raw=0
Additional context
This seems to be a regression in 0.34.1. The same test was passing in the 0.33.1 release. Old log here
I am afraid I don't know anything about those archs and don't really care
about them, but patches are welcome. The test failure indicates that the
python process is hanging while running bootstrap.py.
I did the following:
- Checked this on a riscv64 machine myself and it fails only sometimes, i.e. the test is flaky. test_ssh_copy takes a long time or gets stuck.
- Asked a RISC-V porter, who told me it may be due to the arch being slow, hence the timeout issues.
After increasing the timeout for check_untar_or_fail to 180, I did not get the failure in around 10 runs of the tests. It does look more or less like a timeout problem, but I am not 100% certain.
On retrying the same build (no code changes) on our build machines 5 times, I did get a successful build, so the test seems flaky here (possibly due to the timeout).
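To make the timeout idea concrete, here is a minimal sketch of a polling helper whose timeout is scaled on slow architectures. It is a standalone stand-in, not kitty's actual test code: `SLOW_ARCH_FACTOR`, `scaled_timeout` and the factor values are hypothetical, and `wait_till` here only mimics the behaviour visible in the traceback (poll a condition, raise TimeoutError when time runs out).

```python
import platform
import time

# Hypothetical multipliers for slow build machines; the values are illustrative.
# Note that platform.machine() reports 'ppc64le' on Debian's ppc64el port.
SLOW_ARCH_FACTOR = {'riscv64': 3.0, 'ppc64le': 3.0}


def scaled_timeout(base, machine=None):
    """Return base multiplied by the factor for a known-slow architecture."""
    machine = machine or platform.machine()
    return base * SLOW_ARCH_FACTOR.get(machine, 1.0)


def wait_till(condition, timeout=60.0, interval=0.01):
    """Poll condition() until it is truthy or timeout seconds have elapsed."""
    end_time = time.monotonic() + timeout
    while time.monotonic() < end_time:
        if condition():
            return
        time.sleep(interval)
    raise TimeoutError(f'The condition was not met within {timeout} seconds')
```

With a factor of 3, a base timeout of 60 becomes the 180 I tried on riscv64, while other architectures keep the current value.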
What do you think?
I am fine with increasing the timeout, though untarring a small file
should never take more than a few seconds.
No wait, that error came up again when I triggered it 5 more times :(
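Since a handful of manual reruns is not enough to tell a real fix from luck, a small loop like the one below could put a number on the failure rate. This is only a sketch: `count_failures` is a hypothetical helper, and `EXAMPLE_CMD` is a placeholder that has to be replaced with whatever command runs the failing test in your build environment.

```python
import subprocess
import sys

# Placeholder command that always succeeds; substitute the real test invocation.
EXAMPLE_CMD = [sys.executable, '-c', 'pass']


def count_failures(cmd, runs=10, per_run_timeout=600):
    """Run cmd `runs` times and return how many runs failed or timed out."""
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=per_run_timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            failed = result.returncode != 0
        except subprocess.TimeoutExpired:
            # A hung run counts as a failure too.
            failed = True
        failures += failed
    return failures
```

Comparing the count for the current timeout against the count for the increased one, over the same number of runs on the same machine, would show whether the longer timeout actually helps or only makes the hang rarer.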