xelerance/xl2tpd

l2tp does not work with 4.15 mainline kernels

amatsoukas opened this issue · 43 comments

Recently we updated our server kernel to mainline 4.15.4 because of the Meltdown and Spectre vulnerabilities, only to find that L2TP over IPsec connections now fail with this error:

xl2tpd[1203]: message repeated 3 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]

We have tested all 4.15 kernels with the same results. Reverting to kernel 4.14.15 resolves the issue.

Our production runs on CentOS 7 with the VPN server running in a Docker container. However, for testing purposes (and to find out whether Docker was the issue), we installed the same configuration on an Ubuntu 14.04 server with the mainline 4.15.5 kernel and got the same error.

I would also like to add that I tried compiling the latest xl2tpd version, with the same results.

Here is the full log from a connection attempt:

Feb 23 16:55:18 testvpn xl2tpd[1203]: Written by Mark Spencer, Copyright (C) 1998, Adtran, Inc.
Feb 23 16:55:18 testvpn xl2tpd[1203]: Forked by Scott Balmos and David Stipp, (C) 2001
Feb 23 16:55:18 testvpn xl2tpd[1203]: Inherited by Jeff McAdams, (C) 2002
Feb 23 16:55:18 testvpn xl2tpd[1203]: Forked again by Xelerance (www.xelerance.com) (C) 2006
Feb 23 16:55:18 testvpn xl2tpd[1203]: Listening on IP address 0.0.0.0, port 1701
Feb 23 16:55:18 testvpn kernel: [ 7.941925] init: plymouth-upstart-bridge main process ended, respawning
Feb 23 16:55:20 testvpn ntpdate[595]: Can't find host ntp.ubuntu.com: Name or service not known (-2)
Feb 23 16:55:20 testvpn ntpdate[595]: no servers can be used, exiting
Feb 23 16:55:26 testvpn ntpdate[1280]: step time server 91.189.89.199 offset -0.041934 sec
Feb 23 16:55:38 testvpn ntpdate[1282]: adjust time server 91.189.89.199 offset -0.000051 sec
Feb 23 16:57:38 testvpn charon: 03[NET] received packet: from 80.107.132.224[500] to 82.196.1.88[500] (788 bytes)
Feb 23 16:57:38 testvpn charon: 03[ENC] parsed ID_PROT request 0 [ SA V V V V V V V V V V V V ]
Feb 23 16:57:38 testvpn charon: 03[IKE] received NAT-T (RFC 3947) vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-08 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-07 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-06 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-05 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-04 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-03 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-02 vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received draft-ietf-ipsec-nat-t-ike-02\n vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received FRAGMENTATION vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] received DPD vendor ID
Feb 23 16:57:38 testvpn charon: 03[IKE] 80.107.132.224 is initiating a Main Mode IKE_SA
Feb 23 16:57:38 testvpn charon: 03[ENC] generating ID_PROT response 0 [ SA V V V ]
Feb 23 16:57:38 testvpn charon: 03[NET] sending packet: from 82.196.1.88[500] to 80.107.132.224[500] (136 bytes)
Feb 23 16:57:38 testvpn charon: 12[NET] received packet: from 80.107.132.224[500] to 82.196.1.88[500] (380 bytes)
Feb 23 16:57:38 testvpn charon: 12[ENC] parsed ID_PROT request 0 [ KE No NAT-D NAT-D ]
Feb 23 16:57:38 testvpn charon: 12[IKE] remote host is behind NAT
Feb 23 16:57:38 testvpn charon: 12[ENC] generating ID_PROT response 0 [ KE No NAT-D NAT-D ]
Feb 23 16:57:38 testvpn charon: 12[NET] sending packet: from 82.196.1.88[500] to 80.107.132.224[500] (396 bytes)
Feb 23 16:57:39 testvpn charon: 13[NET] received packet: from 80.107.132.224[4500] to 82.196.1.88[4500] (108 bytes)
Feb 23 16:57:39 testvpn charon: 13[ENC] parsed ID_PROT request 0 [ ID HASH N(INITIAL_CONTACT) ]
Feb 23 16:57:39 testvpn charon: 13[CFG] looking for pre-shared key peer configs matching 82.196.1.88...80.107.132.224[192.168.1.181]
Feb 23 16:57:39 testvpn charon: 13[CFG] selected peer config "l2tp"
Feb 23 16:57:39 testvpn charon: 13[IKE] IKE_SA l2tp[1] established between 82.196.1.88[82.196.1.88]...80.107.132.224[192.168.1.181]
Feb 23 16:57:39 testvpn charon: 13[IKE] scheduling reauthentication in 27726s
Feb 23 16:57:39 testvpn charon: 13[IKE] maximum IKE_SA lifetime 28266s
Feb 23 16:57:39 testvpn charon: 13[ENC] generating ID_PROT response 0 [ ID HASH ]
Feb 23 16:57:39 testvpn charon: 13[NET] sending packet: from 82.196.1.88[4500] to 80.107.132.224[4500] (92 bytes)
Feb 23 16:57:39 testvpn charon: 15[NET] received packet: from 80.107.132.224[4500] to 82.196.1.88[4500] (332 bytes)
Feb 23 16:57:39 testvpn charon: 15[ENC] parsed QUICK_MODE request 2220260207 [ HASH SA No ID ID NAT-OA NAT-OA ]
Feb 23 16:57:39 testvpn charon: 15[ENC] generating QUICK_MODE response 2220260207 [ HASH SA No ID ID NAT-OA NAT-OA ]
Feb 23 16:57:39 testvpn charon: 15[NET] sending packet: from 82.196.1.88[4500] to 80.107.132.224[4500] (204 bytes)
Feb 23 16:57:39 testvpn charon: 02[NET] received packet: from 80.107.132.224[4500] to 82.196.1.88[4500] (76 bytes)
Feb 23 16:57:39 testvpn charon: 02[ENC] parsed QUICK_MODE request 2220260207 [ HASH ]
Feb 23 16:57:39 testvpn charon: 02[IKE] CHILD_SA l2tp{1} established with SPIs ccbd4355_i 0074d98d_o and TS 82.196.1.88/32[udp/l2f] === 80.107.132.224/32[udp/56435]
Feb 23 16:57:40 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:42 testvpn xl2tpd[1203]: message repeated 2 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]
Feb 23 16:57:42 testvpn xl2tpd[1203]: control_finish: Peer requested tunnel 40 twice, ignoring second one.
Feb 23 16:57:42 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:44 testvpn xl2tpd[1203]: message repeated 2 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]
Feb 23 16:57:45 testvpn xl2tpd[1203]: Maximum retries exceeded for tunnel 33048. Closing.
Feb 23 16:57:45 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:45 testvpn xl2tpd[1203]: Connection 40 closed to 80.107.132.224, port 56435 (Timeout)
Feb 23 16:57:46 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:46 testvpn xl2tpd[1203]: control_finish: Peer requested tunnel 40 twice, ignoring second one.
Feb 23 16:57:46 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:49 testvpn xl2tpd[1203]: message repeated 3 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]
Feb 23 16:57:50 testvpn xl2tpd[1203]: control_finish: Peer requested tunnel 40 twice, ignoring second one.
Feb 23 16:57:50 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:50 testvpn xl2tpd[1203]: Unable to deliver closing message for tunnel 33048. Destroying anyway.
Feb 23 16:57:54 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:57 testvpn xl2tpd[1203]: message repeated 3 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]
Feb 23 16:57:58 testvpn xl2tpd[1203]: control_finish: Peer requested tunnel 40 twice, ignoring second one.
Feb 23 16:57:58 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:58 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:59 testvpn xl2tpd[1203]: Maximum retries exceeded for tunnel 54296. Closing.
Feb 23 16:57:59 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:57:59 testvpn xl2tpd[1203]: Connection 40 closed to 80.107.132.224, port 56435 (Timeout)
Feb 23 16:57:59 testvpn charon: 09[NET] received packet: from 80.107.132.224[4500] to 82.196.1.88[4500] (92 bytes)
Feb 23 16:57:59 testvpn charon: 09[ENC] parsed INFORMATIONAL_V1 request 2241479364 [ HASH D ]
Feb 23 16:57:59 testvpn charon: 09[IKE] received DELETE for ESP CHILD_SA with SPI 0074d98d
Feb 23 16:57:59 testvpn charon: 09[IKE] closing CHILD_SA l2tp{1} with SPIs ccbd4355_i (576 bytes) 0074d98d_o (0 bytes) and TS 82.196.1.88/32[udp/l2f] === 80.107.132.224/32[udp/56435]
Feb 23 16:57:59 testvpn charon: 09[NET] received packet: from 80.107.132.224[4500] to 82.196.1.88[4500] (108 bytes)
Feb 23 16:57:59 testvpn charon: 09[ENC] parsed INFORMATIONAL_V1 request 2825066879 [ HASH D ]
Feb 23 16:57:59 testvpn charon: 09[IKE] received DELETE for IKE_SA l2tp[1]
Feb 23 16:57:59 testvpn charon: 09[IKE] deleting IKE_SA l2tp[1] between 82.196.1.88[82.196.1.88]...80.107.132.224[192.168.1.181]
Feb 23 16:58:00 testvpn xl2tpd[1203]: udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device
Feb 23 16:58:03 testvpn xl2tpd[1203]: message repeated 3 times: [ udp_xmit failed to 80.107.132.224:56435 with err=-1:No such device]
Feb 23 16:58:04 testvpn xl2tpd[1203]: Unable to deliver closing message for tunnel 54296. Destroying anyway.

/etc/ipsec.conf

config setup
cachecrls=yes
uniqueids=yes

conn ipsec
keyexchange=ikev1
authby=xauthpsk
xauth=server
left=%defaultroute
leftsubnet=0.0.0.0/0
right=%any
rightsubnet=10.8.0.0/24
rightsourceip=%dhcp
rightdns=8.8.8.8
auto=add

conn l2tp
authby=secret
pfs=no
auto=add
keyingtries=3
ikelifetime=8h
keylife=1h
type=transport
left=%defaultroute
leftprotoport=17/1701
right=%any
rightprotoport=17/%any
dpddelay=10
dpdtimeout=20
dpdaction=clear

/etc/xl2tpd/xl2tpd.conf

[global]
ipsec saref = yes
saref refinfo = 30

debug avp = no
debug network = no
debug state = no
debug tunnel = no

[lns default]
ip range = 10.8.1.10-10.8.1.230
local ip = 10.8.1.3
refuse pap = yes
refuse chap = yes
require authentication = yes
ppp debug = no
pppoptfile = /etc/ppp/options.xl2tpd
length bit = yes

/etc/ppp/options.xl2tpd

name l2tp
auth
require-mschap-v2
ms-dns 8.8.4.4
ms-dns 8.8.8.8
idle 1800
nodefaultroute
lock
nobsdcomp
novj
novjccomp
nologfd
lcp-echo-interval 5
lcp-echo-failure 5
plugin radius.so
plugin radattr.so

I second that. Since switching to a 4.15.x kernel I am running into the same error with Gentoo and NetworkManager. After booting back into a 4.14.x kernel, everything works again.

I can confirm this bug with kernel 4.15.9-300 (Fedora 27).

I initially had some difficulty reproducing this issue, but I was able to reproduce it on Ubuntu 16.04.4 when I rebooted both sides (the issue didn't occur if my LAC was not rebooted).

Unfortunately I can confirm it is still an issue with kernel-4.16.0 RC6 and not just kernel-4.15.x

I observed the same behaviour with 4.15.8 and dug into it a little. I found that sendmsg() in udp_xmit() returns -1 with errno = ENODEV. in_pktinfo.ipi_ifindex contains some weird value. This value comes from recvmsg() in the network_thread() loop. All other fields look good. When I zeroed ipi_ifindex, packet transmission appeared to work again.
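To illustrate the workaround described above, here is a minimal sketch of zeroing ipi_ifindex in the ancillary data returned by recvmsg() before the same control buffer is handed to sendmsg(). The helper name clear_pktinfo_ifindex is hypothetical and this is not the code from xl2tpd's network.c; it only assumes a Linux struct msghdr whose control buffer may carry an IP_PKTINFO entry.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not from xl2tpd): walk the control messages in
 * `msg` and zero ipi_ifindex in every IP_PKTINFO entry, leaving the
 * addresses untouched. Returns the number of entries changed. */
static int clear_pktinfo_ifindex(struct msghdr *msg)
{
    int changed = 0;
    struct cmsghdr *cmsg;

    assert(msg != NULL);
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo info;

            /* Copy in and out to avoid unaligned access to CMSG_DATA. */
            memcpy(&info, CMSG_DATA(cmsg), sizeof(info));
            info.ipi_ifindex = 0;
            memcpy(CMSG_DATA(cmsg), &info, sizeof(info));
            changed++;
        }
    }
    return changed;
}
```

Calling this on the msghdr filled in by recvmsg() leaves ipi_spec_dst and ipi_addr as received, so only the suspicious interface index is dropped before the reply is sent.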

I also confirm the issue on Arch with kernel 4.15.10.

Expanding on what @jhemzal wrote, on Linux, xl2tpd currently uses the ancillary IP_PKTINFO data received during recvmsg(2) and passes it as-is (including ipi_ifindex) to sendmsg(2). The in_pktinfo structure contains the following members:

struct in_pktinfo {
    unsigned int   ipi_ifindex;  /* Interface index */
    struct in_addr ipi_spec_dst; /* Local address */
    struct in_addr ipi_addr;     /* Header Destination address */
};

The relevant strace output of xl2tpd running on kernel 4.15 is below. Note how ipi_ifindex gets set to an apparently random, large integer, whereas interface indexes are supposed to be small values. I'll also come back to IP_IPSEC_REFINFO later:

[pid  415] socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
[pid  415] setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid  415] setsockopt(3, SOL_SOCKET, SO_NO_CHECK, [1], 4) = 0
[pid  415] bind(3, {sa_family=AF_INET, sin_port=htons(1701), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid  415] setsockopt(3, SOL_IP, IP_IPSEC_REFINFO, "\1\0\0\0\0\0\0\0", 8) = -1 ENOPROTOOPT (Protocol not available)
[pid  415] getpid()                    = 415
[pid  415] write(2, "xl2tpd[415]: setsockopt recvref"..., 61xl2tpd[415]: setsockopt recvref[30]: Protocol not available) = 61
[pid  415] setsockopt(3, SOL_IP, IP_PKTINFO, "\1\0\0\0\0\0\0\0", 8) = 0
...
[pid  415] recvmsg(3, {msg_name={...}, msg_namelen=16, msg_iov=[{...}], msg_iovlen=1,
msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO,
cmsg_data={ipi_ifindex=2474634784, ipi_spec_dst=inet_addr(...), ipi_addr=inet_addr(...)}}],
msg_controllen=32, msg_flags=0}, 0) = 89
...
[pid  415] sendmsg(3, {msg_name={sa_family=AF_INET, sin_port=htons(1701), sin_addr=inet_addr(...)},
msg_namelen=16, msg_iov=[{iov_base=..., iov_len=20}], msg_iovlen=1,
msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO,
cmsg_data={ipi_ifindex=2474634784, ipi_spec_dst=inet_addr(...), ipi_addr=inet_addr(...)}}],
msg_controllen=28, msg_flags=0}, 0) = -1 ENODEV (No such device)

On a kernel where xl2tpd works, the recvmsg() strace output looks something like:

[pid  413] recvmsg(3, {msg_name={...}, msg_namelen=16, msg_iov=[{...}], msg_iovlen=1,
msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO,
cmsg_data={ipi_ifindex=if_nametoindex("ens33"), ipi_spec_dst=inet_addr(...), ipi_addr=inet_addr(...)}}],
msg_controllen=32, msg_flags=0}, 0) = 89
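For anyone comparing their own traces, one way to tell a plausible interface index from a garbage one like the 2474634784 in the failing output is to ask the system to resolve it with if_indextoname(3). The sketch below does that; describe_ifindex is a made-up helper name, not something xl2tpd provides.

```c
#include <assert.h>
#include <net/if.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a short description of `ifindex` into
 * `out`. Returns 1 if the index resolves to an existing interface,
 * 0 otherwise (including index 0, which means "unspecified"). */
static int describe_ifindex(unsigned int ifindex, char *out, size_t outlen)
{
    char name[IF_NAMESIZE];

    assert(out != NULL && outlen > 0);
    if (ifindex != 0 && if_indextoname(ifindex, name) != NULL) {
        snprintf(out, outlen, "%u (%s)", ifindex, name);
        return 1;
    }
    snprintf(out, outlen, "%u (no such interface)", ifindex);
    return 0;
}
```

On a working kernel the index from recvmsg() should resolve to a real interface name (ens33 in the trace above), while a value like the one in the failing trace should not resolve at all.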

Extract from xl2tpd.conf manpage :

ipsecsaref
Use IPsec Security Association tracking. When this is enabled, packets received
by xl2tpd should have two extra fields (refme and refhim) which allows tracking
of multiple clients using the same internal NATed IP address, and allows
tracking of multiple clients behind the same NAT router. This needs to be
supported by the kernel.

Currently, this only works with Openswan KLIPS in "mast" mode. (see http://www.openswan.org/)

Set this to yes and the system will provide proper SAref values in the recvmsg() calls.

Values can be yes or no. The default is no.

xl2tpd uses the ancillary IP_IPSEC_REFINFO data received during recvmsg() calls for SAref values.

Contrary to what the xl2tpd.conf manpage says, xl2tpd ignores ipsecsaref when it is set to no (i.e. the default value) and attempts to obtain ancillary IP_IPSEC_REFINFO and IP_PKTINFO data regardless.

So in summary, there is definitely a kernel bug that shows up when ancillary IP_PKTINFO data is requested, but xl2tpd doesn't need this ancillary data anyway (as most modern kernels won't have been patched for KLIPS).

Pull request #148 contains a fix for this issue. Basically, the code now does the same thing xl2tpd does on non-Linux operating systems, where SAref values are not needed as there is no KLIPS support.
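As a rough illustration of the idea discussed above (only ask the kernel for ancillary data when SAref tracking is actually configured), here is a small sketch. It is not the code from the pull request, which may be structured differently; maybe_enable_pktinfo and the use_saref flag are hypothetical names standing in for however the ipsec saref setting is read.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper, not taken from xl2tpd: request IP_PKTINFO
 * ancillary data on `fd` only when SAref tracking is enabled.
 * Returns 1 if the option was enabled, 0 if it was skipped,
 * and -1 if setsockopt() failed. */
static int maybe_enable_pktinfo(int fd, int use_saref)
{
    int on = 1;

    assert(fd >= 0);
    if (!use_saref)
        return 0; /* no ancillary data requested, so none is echoed back */
    if (setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on)) < 0)
        return -1;
    return 1;
}
```

With the option left off, recvmsg() returns no IP_PKTINFO control message, so there is no ipi_ifindex to pass back into sendmsg() in the first place.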

It would be great if a new xl2tpd release could be made which contains this pull request.

Thanks, @dkosovic, works like a charm. Until it is merged, here is an ebuild for Gentoo users (maybe someone will get here through google, like me).

@dkosovic Thank you very much for the patch. I have merged it into the 1.3.12dev branch.

I have updated the master branch's README.xl2tpd to recommend that users try out the 1.3.12dev branch.

For Fedora and EPEL 7, see Red Hat Bugzilla - Bug# 1562512. New xl2tpd RPMs with the patch have been pushed to the testing repositories.

For RHEL 7 or CentOS 7, temporarily enable EPEL testing repository to update to new xl2tpd RPM :

sudo yum update --enablerepo=epel-testing xl2tpd 

For Fedora 26 or later, temporarily enable testing repository to update to new xl2tpd RPM :

sudo dnf update --enablerepo=updates-testing xl2tpd 

Please vote for the updates to have them pushed out of testing to stable sooner, only needs 3 votes :

@szymonpk thanks for the overlay. Works with 4.15.X. Testing 4.16 today.

Bug report for Ubuntu 18.04 (Bionic Beaver) which will ship with kernel 4.15 later this month :

As Ubuntu 18.04 is no longer accepting newer packages from Debian Sid, I'm guessing that the patch would need to be applied to the existing xl2tpd-1.3.10-1 package.

I've also posted a Debian Sid bug report, but that is a non-issue as @shussain packages it and will no doubt make a new package once he makes a new xl2tpd release.

For the record, xl2tpd-1.3.8 broke also when upgrading my RPi here from Kernel 4.9.51 to 4.14.30.. is it recommended to already upgrade to xl2tpd-1.3.12dev ?

@Gooseman42 if you are seeing the udp_xmit failed ... with err=-1:No such device error like in the first message of this thread, then I would recommend upgrading to xl2tpd-1.3.12dev. Otherwise it is some other kernel issue.

Earlier versions of kernel 4.14 had a xfrm bug which broke IPsec. Perhaps the current bug you are seeing is because they backported something from kernel 4.15.

@dkosovic udp_xmit failed is the very error I see, so I think I should try xl2tpd-1.3.12dev. How do I install it? Sorry for the noob question. I cloned it and tried 'make' but got:
cc -DDEBUG_PPPD -DTRUST_PPPD_TO_DIE -Os -Wall -DSANITY -DLINUX -I./linux/include/ -DUSE_KERNEL -DIP_ALLOCATION -c contrib/pfc.c
contrib/pfc.c:14:23: fatal error: pcap-bpf.h: No such file or directory

# include <pcap-bpf.h>
compilation terminated.

Probably stupid mistake.. sorry for the noise.

Not sure which Linux distro you are running on the RPi, but you would most likely need to install a libpcap-dev or libpcap-devel package.

Not knowing anything about the linux distro you are using, it's probably safest to just copy the xl2tpd that got built on top of the system xl2tpd, e.g. something like :

sudo cp xl2tpd /usr/sbin/xl2tpd

Running Raspbian (Debian for RPi) here. libpcap-dev solved it and copying xl2tpd sorted everything else out. Works now again under Kernel 4.14.30-v7+, thanks!

Can confirm this is working on 4.15.0-2-amd64 #1 SMP Debian 4.15.11-1 (2018-03-20) x86_64 GNU/Linux
Built from source package via dpkg-buildpackage -b -uc -us (you'll need to update the changelog)

c-po commented

Actually I can confirm that this is a bug in Linux 4.14.x, too. I tested it with Linux Kernel 4.14.0 up to 4.14.30 on a Debian Jessie System using xl2tpd 1.3.6.

Porting the change from #148 to the xl2tpd debian-jessie branch resulted in a working system again.

Hello,
How can we clone/get the 1.3.12dev branch?
I tried cloning with -b (1.3.12), but it downloads 1.3.11 instead.

@idarek I believe this is a minor bug in the 1.3.12 branch: l2tp.h still has

#define SERVER_VERSION "xl2tpd-1.3.11"

in its code. So when you pull 1.3.12dev you do get it, but it incorrectly reports itself as 1.3.11.

Thanks.
I have pulled https://github.com/vyos/xl2tpd/tree/cc909bbb20d3e6e216c3d11e0f328ff906f289a8 and it solved my issue with the 4.14 kernel on Raspberry Pi. Now I can wait for the official fix :)

Hi. I have updated the 1.3.12 branch with a 1.3.12rc1 tag and updated l2tp.h. Thanks @Gooseman42 for the quick response to @idarek's question.

Thank you for the update. I have tested 1.3.12 with my 4.14.37-v7+ kernel and the issue with the l2tp connection still exists. It is only fixed in the VyOS fork. Will this be implemented in dev on your official tree? Thanks.

For Ubuntu 18.04 (Bionic Beaver) as mentioned in #147 (comment), I've submitted a bug report here:

But please vote for the bug if it affects you, as more votes will gain more attention and speed up the patch being incorporated into Ubuntu 18.04's xl2tpd-1.3.10 package.

I'm not an Ubuntu user, so not really familiar with the bug reporting process, perhaps there needs to be an xl2tpd SRU ?
https://wiki.ubuntu.com/StableReleaseUpdates

Hey @shussain,
Are there any instructions on how I can install 1.3.12 on Ubuntu 17.10?

In a terminal:
git clone -b 1.3.12 https://github.com/xelerance/xl2tpd.git

However, I used the VyOS fork at a specific commit (git clone -b only accepts branch or tag names, so the commit is checked out after cloning):
git clone https://github.com/vyos/xl2tpd.git
git -C xl2tpd checkout cc909bbb20d3e6e216c3d11e0f328ff906f289a8
because the xelerance version had an issue with my kernel. Try the first, and if it does not work, try the second.

Then:
cd xl2tpd
make (you may need to install additional packages if it fails)

Note where the xl2tpd binary was built, then check where your current package is installed on the system:
whereis xl2tpd
Mine was in /usr/sbin/xl2tpd, so I copied it over:
sudo cp xl2tpd /usr/sbin/xl2tpd

Then just restart xl2tpd and try:
sudo systemctl restart xl2tpd.service

Hey @idarek,

Thanks for your help.

I've installed 1.3.12 version, but now I've got an error:

● xl2tpd.service - LSB: layer 2 tunelling protocol daemon
   Loaded: loaded (/etc/init.d/xl2tpd; generated; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2018-05-05 15:48:50 UTC; 6s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 3634 ExecStart=/etc/init.d/xl2tpd start (code=exited, status=2)

May 05 15:48:50 localhost systemd[1]: Starting LSB: layer 2 tunelling protocol daemon...
May 05 15:48:50 localhost xl2tpd[3634]: Starting xl2tpd: start-stop-daemon: unable to start /usr/sbin/xl2tpd (Permission denied)
May 05 15:48:50 localhost systemd[1]: xl2tpd.service: Control process exited, code=exited status=2
May 05 15:48:50 localhost systemd[1]: Failed to start LSB: layer 2 tunelling protocol daemon.
May 05 15:48:50 localhost systemd[1]: xl2tpd.service: Unit entered failed state.
May 05 15:48:50 localhost systemd[1]: xl2tpd.service: Failed with result 'exit-code'.

I'm confused by the next two lines:

Starting xl2tpd: start-stop-daemon: unable to start /usr/sbin/xl2tpd (Permission denied)
Failed to start LSB: layer 2 tunelling protocol daemon.

I use sudo and it's very weird that I've got Permission denied. Any ideas?

Could you post the output of
ls -l /usr/sbin/xl2tpd

This is mine on the Raspberry Pi:
-rwxr-xr-x 1 pi pi 111660 Apr 27 07:25 xl2tpd

and it is working fine, but maybe it needs to be owned by root:
sudo chown root:root /usr/sbin/xl2tpd
?

$ ls -l /usr/sbin/xl2tpd
total 880
-rw-r--r-- 1 root root  15088 May  5 15:41 aaa.c
-rw-r--r-- 1 root root   1532 May  5 15:41 aaa.h
-rw-r--r-- 1 root root   8040 May  5 15:41 aaa.o
-rw-r--r-- 1 root root  51965 May  5 15:41 avp.c
-rw-r--r-- 1 root root   6018 May  5 15:41 avp.h
-rw-r--r-- 1 root root  43952 May  5 15:41 avp.o
-rw-r--r-- 1 root root   8342 May  5 15:41 avpsend.c
-rw-r--r-- 1 root root   6040 May  5 15:41 avpsend.o
-rw-r--r-- 1 root root     75 May  5 15:41 BUGS
-rw-r--r-- 1 root root  19078 May  5 15:41 call.c
-rw-r--r-- 1 root root   5047 May  5 15:41 call.h
-rw-r--r-- 1 root root   9824 May  5 15:41 call.o
-rw-r--r-- 1 root root  19283 May  5 15:41 CHANGES
-rw-r--r-- 1 root root    450 May  5 15:41 common.h
drwxr-xr-x 2 root root   4096 May  5 15:41 contrib
-rw-r--r-- 1 root root  63000 May  5 15:41 control.c
-rw-r--r-- 1 root root   2282 May  5 15:41 control.h
-rw-r--r-- 1 root root  36368 May  5 15:41 control.o
-rw-r--r-- 1 root root   1942 May  5 15:41 CREDITS
drwxr-xr-x 3 root root   4096 May  5 15:41 debian
drwxr-xr-x 2 root root   4096 May  5 15:41 doc
drwxr-xr-x 2 root root   4096 May  5 15:41 examples
-rw-r--r-- 1 root root  43151 May  5 15:41 file.c
-rw-r--r-- 1 root root   7715 May  5 15:41 file.h
-rw-r--r-- 1 root root  29072 May  5 15:41 file.o
-rw-r--r-- 1 root root    322 May  5 15:41 ipsecmast.h
-rw-r--r-- 1 root root   9130 May  5 15:41 l2tp.h
-rw-r--r-- 1 root root  18092 May  5 15:41 LICENSE
-rw-r--r-- 1 root root   5137 May  5 15:41 Makefile
-rw-r--r-- 1 root root   8724 May  5 15:41 md5.c
-rw-r--r-- 1 root root    620 May  5 15:41 md5.h
-rw-r--r-- 1 root root   3904 May  5 15:41 md5.o
-rw-r--r-- 1 root root   7469 May  5 15:41 misc.c
-rw-r--r-- 1 root root   1819 May  5 15:41 misc.h
-rw-r--r-- 1 root root   8160 May  5 15:41 misc.o
-rw-r--r-- 1 root root  23645 May  5 15:41 network.c
-rw-r--r-- 1 root root  16600 May  5 15:41 network.o
-rw-r--r-- 1 root root   1072 May  5 15:41 osport.h
drwxr-xr-x 5 root root   4096 May  5 15:41 packaging
-rwxr-xr-x 1 root root   8584 May  5 15:41 pfc
-rw-r--r-- 1 root root   2640 May  5 15:41 pfc.o
-rw-r--r-- 1 root root   3248 May  5 15:41 pty.c
-rw-r--r-- 1 root root   4064 May  5 15:41 pty.o
-rw-r--r-- 1 root root   1883 May  5 15:41 README.xl2tpd
-rw-r--r-- 1 root root   3874 May  5 15:41 scheduler.c
-rw-r--r-- 1 root root   1848 May  5 15:41 scheduler.h
-rw-r--r-- 1 root root   3448 May  5 15:41 scheduler.o
drwxr-xr-x 2 root root   4096 May  5 15:41 scripts
-rw-r--r-- 1 root root    757 May  5 15:41 TODO
-rwxr-xr-x 1 root root 127488 May  5 15:41 xl2tpd
-rw-r--r-- 1 root root  54169 May  5 15:41 xl2tpd.c
-rwxr-xr-x 1 root root  19072 May  5 15:41 xl2tpd-control
-rw-r--r-- 1 root root  13995 May  5 15:41 xl2tpd-control.c
-rw-r--r-- 1 root root  48568 May  5 15:41 xl2tpd.o

Try stopping the xl2tpd service:
sudo systemctl stop xl2tpd.service

and then copy the xl2tpd file you built to /usr/sbin, replacing the original one,
or first run
sudo rm /usr/sbin/xl2tpd
and then copy it, to make sure it is copied correctly.

@idarek, thanks for your help again. Sorry, it was my mistake: I had copied the xl2tpd folder instead of the xl2tpd file. The Permission denied error is now gone, but the Failed to start LSB error still appears.

$ sudo systemctl  status xl2tpd.service
● xl2tpd.service - LSB: layer 2 tunelling protocol daemon
   Loaded: loaded (/etc/init.d/xl2tpd; generated; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2018-05-05 16:48:02 UTC; 12s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4116 ExecStart=/etc/init.d/xl2tpd start (code=exited, status=255)

May 05 16:48:02 localhost xl2tpd[4116]: xl2tpd server version xl2tpd-1.3.12rc1
May 05 16:48:02 localhost xl2tpd[4116]: Usage: xl2tpd-control [-c <PATH>] <command> <tunnel name> [<COMMAND OPTIONS>]
May 05 16:48:02 localhost xl2tpd[4116]:     -c        specifies xl2tpd control file
May 05 16:48:02 localhost xl2tpd[4116]:     -d        specify xl2tpd-control to run in debug mode
May 05 16:48:02 localhost xl2tpd[4116]: --help        shows extended help
May 05 16:48:02 localhost xl2tpd[4116]: Available commands: add, connect, disconnect, remove, add-lac, connect-lac, disconnect-lac, remove-lac, add-lns, remove-
May 05 16:48:02 localhost systemd[1]: xl2tpd.service: Control process exited, code=exited status=255
May 05 16:48:02 localhost systemd[1]: Failed to start LSB: layer 2 tunelling protocol daemon.
May 05 16:48:02 localhost systemd[1]: xl2tpd.service: Unit entered failed state.
May 05 16:48:02 localhost systemd[1]: xl2tpd.service: Failed with result 'exit-code'.

Could you try the same as above, but with the xl2tpd-control file,
and restart the service?

On my system I only needed the xl2tpd file and not the -control one, but yours may be different.

P.S. A small correction to the source that I used to fix my issue:
git clone https://github.com/vyos/xl2tpd.git
git -C xl2tpd checkout cc909bbb20d3e6e216c3d11e0f328ff906f289a8

Yeah, I've copied xl2tpd-control as well as xl2tpd, and it doesn't help.

Could you check the logs for errors:
journalctl -u xl2tpd -b

So, I've reinstalled xl2tpd from https://github.com/vyos/xl2tpd.git and it now works for me. @idarek thank you so much for your help!

Is there any work-in-progress trying to find the root cause in the kernel? It seems from the discussion here that the issue also appears on v4.14: v4.14.15 is good and v4.14.30 is bad.

I extracted code from network.c trying to emulate the same kind of interaction as with the above strace session, but I could not reproduce the issue on a qemu malta machine running OpenWrt with patched 4.14.37 kernel.

Here is the code: https://gist.github.com/yousong/303ed53967f7fbe6bbf804fea8a80309

Have released xl2tpd 1.3.12 which contains the fix in the master branch

Closing this ticket since the commit is in the master branch, and the README has been updated to tell users to use 1.3.12

I would like to thank @dkosovic for the patch and @jhemzal for the initial debugging.

Thanks also to @amatsoukas @hagbartx @schakko @madewild for reporting and confirming the issue.

It was amazing how the whole community came together to report, debug and fix the issue. And to help each other out over installing/using the fix.

For Ubuntu users following this thread, both Ubuntu 18.04 (bionic) and 16.04 (xenial) are affected as kernel 4.15.0-29 is the latest kernel update for both.

xl2tpd packages with a backported patch to work around this kernel issue are now in their respective updates repositories: