LibVNC/x11vnc

Video dead loop (server side issue)

djfd opened this issue · 9 comments

djfd commented

Affected versions: all x64 from there

Environment: ArchLinux x64, gnome, xorg, gdm, mutter, all up to date
Hardware:

MSI-1221 (PR200)
BIOS: A1221IMS V1.51
CPU: Intel(R) Core(TM)2 Duo CPU T7700  @ 2.40GHz
GPU: Mobile GM965/GL960 Integrated Graphics Controller

Basically, all VNC sessions behave as follows (FYI, I have tried all possible server parameters):
when the server starts it takes a screen snapshot; a little later, when a client connects, it takes another snapshot, and that is all, nothing more happens. The server then sends these two snapshots to the client in an endless loop.

To check this I made a simple debugging patch (hashing the snap buffer to see whether it actually changes):

diff -aur a/scan.c b/scan.c
--- a/scan.c	2015-11-15 03:49:21.000000000 +1000
+++ b/scan.c	2017-10-05 05:39:49.043697103 +1000
@@ -46,7 +46,7 @@
 #include "screen.h"
 #include "macosx.h"
 #include "userinput.h"
-
+#include <openssl/md5.h>
 /*
  * routines for scanning and reading the X11 display for changes, and
  * for doing all the tile work (shm, etc).
@@ -2796,6 +2796,8 @@
 	dtime0(&dt);
 	X_LOCK;
 
+        unsigned char digest[33]; digest[32] = 0;
+
 	/* screen may be too big for 1 shm area, so broken into fs_factor */
 	for (i=0; i < fs_factor; i++) {
 		XRANDR_SET_TRAP_RET(-1, "copy_snap-set");
@@ -2804,6 +2806,16 @@
 
 		memcpy(fbp, snaprect->data, (size_t) block_size);
 
+                MD5(fbp, block_size, digest);
+                for(int _dt = 31, _ds = 15; _ds >= 0 ; --_ds) {
+                    unsigned char _cin = digest[_ds],
+                                  _nibble = _cin & 0x0F;
+                    digest[_dt--] = _nibble + (_nibble > 9 ? ('a'-10) : '0');
+                    _nibble = (_cin >> 4) & 0x0F;
+                    digest[_dt--] = _nibble + (_nibble > 9 ? ('a'-10) : '0');
+                }
+                rfbLog("copy_snap(%d): %.32s\n", fs_factor, digest);
+
 		y += dpy_y / fs_factor;
 		fbp += block_size;
 	}
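
The hex-encoding step of the patch can also be sketched on its own, for anyone who wants to reuse the hashing idea elsewhere. This is only a minimal sketch, not code from x11vnc: `bytes_to_hex` is a hypothetical helper name, and it writes into a separate output buffer rather than expanding the digest in place as the patch does.

```c
#include <stddef.h>

/* Hypothetical helper, not part of x11vnc: write the lowercase hex form of
 * `len` raw bytes into `out`, which must hold at least 2*len + 1 chars. */
void bytes_to_hex(const unsigned char *in, size_t len, char *out)
{
    static const char hexdigits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = hexdigits[in[i] >> 4];   /* high nibble first */
        out[2 * i + 1] = hexdigits[in[i] & 0x0F]; /* then low nibble */
    }
    out[2 * len] = '\0';
}
```

For a 16-byte MD5 digest the output buffer needs 33 chars, which matches the `digest[33]` size used in the patch.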

After a long enough run I filtered the results:

$ cat wrong.log |grep 'copy_snap(1)'|cut -f4 -d\ |sort -u
2a422a2f0642ed3567acf1987b5ac1cb
4e2abcb76b10ab14ff1b728caf0f2740
bfe34c8d50cb61afa12ec3c7256de42f

That shows the server actually serves only 2 pictures to a client. Then I killed the VNC server and, after a few minutes or so, ran it again. That gave one more MD5 checksum while keeping the first three intact (this is very strange; it looks as if something is cached somewhere, since I would not expect to get the same initial screen on a subsequent server run):

$ cat wrong.log |grep 'copy_snap(1)'|cut -f4 -d\ |sort -u
2a422a2f0642ed3567acf1987b5ac1cb
4e2abcb76b10ab14ff1b728caf0f2740
bfe34c8d50cb61afa12ec3c7256de42f
e9e5eb4961197d6801ffc5fe1bb91edb
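
The `sort -u` filtering above amounts to counting distinct digests: if that count stays tiny over a long run, the snap buffer is not really changing. A minimal C sketch of the same check, assuming the digests have already been parsed out of the log into an array of strings (`count_distinct` is a hypothetical helper, not an x11vnc function):

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sorts `digests` in place and returns how many
 * distinct strings it contains (what `sort -u | wc -l` would report). */
size_t count_distinct(const char **digests, size_t n)
{
    size_t i, distinct;

    if (n == 0)
        return 0;
    qsort(digests, n, sizeof *digests, cmp_str);
    distinct = 1;
    for (i = 1; i < n; i++)
        if (strcmp(digests[i], digests[i - 1]) != 0)
            distinct++;   /* a new value starts here in sorted order */
    return distinct;
}
```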

For the test, the server is run with
sudo -u "#$uid" x11vnc -display "$dpy" -localhost -nopw -auth "$auth_file" -snapfb

But any other parameter combination on vanilla x11vnc gives the same endless loop.

Mouse and keyboard events work fine. I can open apps, type commands, etc., with both machines sitting next to each other.

Another interesting thing: when the screen lock curtain is up on the target machine, clicking a mouse button in the VNC client shows the desktop, while the target's own screen still shows the curtain; I need to slide the mouse up to remove it.

I also tried with locking disabled; nothing changes, the same two pictures endlessly.

I have logs and a video showing the issue. Please let me know what other information or testing I can provide to help debug and fix this.

bk138 commented

Hi!
First of all, sorry for the long delay - been busy with $$$-work ;-)

Can you provide the exact versions of Xorg, GNOME and (maybe) Wayland?

Can you also pls try https://www.reddit.com/r/archlinux/comments/4lro2k/how_do_i_make_sure_im_actually_running_wayland/

djfd commented

Hi,

Just in case, I checked things as per the reddit.com thread: it is a pure Xorg install on the target machine (Wayland is disabled).

# pacman -Qs |egrep 'xorg-server\s|/gdm|gnome-(shell|session|desktop)\s|mutter'
local/gdm 3.26.2.1-1 (gnome)
local/gnome-desktop 1:3.26.2-1
local/gnome-session 3.26.1-2 (gnome)
local/gnome-shell 3.26.2+9+ga3736d3a3-1 (gnome)
local/mutter 3.26.2+31+gbf91e2b4c-1 (gnome)
local/xorg-server 1.19.5-1 (xorg)

Also, I have a few similar installs where x11vnc works just fine, with no issues at all. The only difference is that they all use built-in Intel HD graphics, while this one has a separate GPU.

I think it uses shared system memory; at least I cannot see any dedicated memory chips around the GPU. Could that be the origin of the issue? Say, the graphics driver reports some memory address, then somehow switches the memory region while x11vnc continues to use the older value? Just an idea, maybe stupid, but who knows...

Or it could be some kind of OpenGL issue...

Thanks )

djfd commented

Hi,

I ran one more test, using GNOME's default VNC server, vino. It works after a fashion: the screen flickers (in the VNC viewer), but at least I can see changes. The current vino-server version is

vino-3.22.0+7+g74dd40f-1

However, if I run x11vnc (with vino-server off at the time), kill it, and restart vino-server, then vino falls into the same endless loop of showing the same two (or three) pictures, just as x11vnc does.

Hope this information helps.

Also, when dragging windows (both machines are side by side, for visual feedback), I can sometimes see the moving window's thin frame in the client. I am not sure, however, whether that is handled on the server or the client side.

Could it be a client issue (e.g. wrong timestamping or such)? I use Remmina, but can try another client if needed.

djfd commented

An update.

Finally I was able to run vanilla x11vnc by adding this X11 configuration snippet:

#/etc/X11/xorg.conf.d/20-intel.conf
Section "Device"
  Identifier "Mobile GM965/GL960 Integrated Graphics Controller"
  Driver "intel"
  Option "AccelMethod" "UXA"
EndSection

It also fixed vino's behaviour. Vino was not working very well either, but at least it was working (it inserted the same initial frame every second or so, but it also showed actual changes).

So the default "SNA" acceleration method (see man intel) is likely somehow incompatible with both vino and x11vnc...

Not sure whether we need to file a bug report with the Intel driver team too, or whether it is an x11vnc issue.

bk138 commented

You can also try to set back to SNA and play with the multitude of x11vnc settings regarding XComposite, snapfb and the others I can't remember.

bk138 commented

@djfd have you tried running with x.org's modesetting driver instead of intel?

djfd commented

Hmm, good question

I simply do not remember, but I will try. Right now that machine is in use; I will try once it is free.
I only remember that I made one extra change in the config. I planned to post it here but forgot. A bit later I will look at what the current configuration is and add it here too.

djfd commented

Well, I checked the configuration. Right now there is the kernel parameter

i915.modeset=1

and 20-intel.conf under /etc/X11/xorg.conf.d

Section "Device"
  Identifier "Mobile GM965/GL960 Integrated Graphics Controller"
  Driver "intel"
#  Option "AccelMethod" "UXA"
  Option "AccelMethod" "SNA"
  Option "DRI" "2"
EndSection

Xorg modesetting has not been tried yet. Do you still need me to try it?

bk138 commented

> Xorg modesetting has not been tried yet. Do you still need me to try it?

Yes please, see https://jlk.fjfi.cvut.cz/arch/manpages/man/modesetting.4