Can't communicate with kernel SCTP stack program.
asterwyx opened this issue · 10 comments
I've written a simple SCTP client using the socket interface provided by Linux. Part of my code is shown below:
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/sctp.h>

static const uint16_t CONN_PORT = 7780;
static const int BUF_LEN = 4096;
static const char CONN_ADDR[] = "127.0.0.1";

int main(int argc, char *argv[])
{
    int sock_fd;
    int error;
    char msg_buf[BUF_LEN];
    struct sctp_sndrcvinfo info;
    struct sctp_event_subscribe sub;

    /* One-to-one style SCTP socket. */
    sock_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (sock_fd < 0)
    {
        fprintf(stderr, "socket: errno: %d, %s\n", errno, strerror(errno));
        exit(EXIT_FAILURE);
    }

    /* Subscribe to data I/O and association change events. */
    memset(&sub, 0, sizeof(struct sctp_event_subscribe));
    sub.sctp_data_io_event = 1;
    sub.sctp_association_event = 1;
    error = setsockopt(sock_fd, SOL_SCTP, SCTP_EVENTS, (char *)&sub, sizeof(sub));
    if (0 != error)
    {
        fprintf(stderr, "SCTP_EVENTS: error %d\n", error);
    }

    /* Zero the address first so that padding bytes are well defined. */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = inet_addr(CONN_ADDR);
    addr.sin_port = htons(CONN_PORT);
    error = connect(sock_fd, (struct sockaddr *)&addr, sizeof(addr));
    if (0 != error)
    {
        fprintf(stderr, "Can't connect to %s, port %d, errno: %d, %s\n", inet_ntoa(addr.sin_addr), ntohs(addr.sin_port), errno, strerror(errno));
        close(sock_fd);
        exit(EXIT_FAILURE);
    }
    ......
}
I was confused by what happened when I ran the echo_server provided by usrsctp together with this simple SCTP client. The connection failed with:
Can't connect to 127.0.0.1, port 7780, errno: 111, Connection refused
Then I tried to capture the packets to see what happened. I started Wireshark on the loopback interface and got these records:
7406 1526.417963540 127.0.0.1 -> 127.0.0.1 SCTP 122 INIT
7407 1526.417984216 127.0.0.1 -> 127.0.0.1 SCTP 50 ABORT
Why did this happen? I've also written a simple server using the Linux socket API, and I've verified that my server and client can communicate with each other normally: they complete the INIT -> INIT_ACK -> COOKIE_ECHO -> COOKIE_ACK handshake.
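For readers following the captures, the client side of this handshake can be pictured as a small state machine. The sketch below is a simplified model that borrows the state names from RFC 4960. It is not code from the Linux kernel or from usrsctp, the enum and function names are made up for illustration, and timers and retransmissions are left out.

```c
/* Client-side association states during setup (names follow RFC 4960). */
enum assoc_state { CLOSED, COOKIE_WAIT, COOKIE_ECHOED, ESTABLISHED };

/* Chunk types a client can receive while an association is being set up. */
enum chunk_type { INIT_ACK, COOKIE_ACK, ABORT };

/* Return the state that follows `s` after `chunk` has been received.
 * Chunks that are not expected in the current state are ignored. */
enum assoc_state on_chunk(enum assoc_state s, enum chunk_type chunk)
{
    if (chunk == ABORT)
        return CLOSED;          /* the peer refused the association */
    if (s == COOKIE_WAIT && chunk == INIT_ACK)
        return COOKIE_ECHOED;   /* the client now sends COOKIE_ECHO */
    if (s == COOKIE_ECHOED && chunk == COOKIE_ACK)
        return ESTABLISHED;     /* four-way handshake complete */
    return s;
}
```

Starting from COOKIE_WAIT (the state after INIT has been sent), `on_chunk(COOKIE_WAIT, ABORT)` returns CLOSED. That corresponds to the INIT/ABORT exchange in the capture, which connect() reported as "Connection refused".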
You cannot run the kernel stack and the userland stack at the same time using SCTP/IP. The packet containing the ABORT chunk comes from the kernel stack handling the packet containing the INIT chunk.
Either use two different hosts (and no kernel stack on the host running the program with the userland stack) or use UDP encapsulation, but I think the Linux kernel does not support this yet.
Why does the kernel stack handle the packet containing the INIT chunk and reply with a packet containing an ABORT chunk? I tried running the usrsctp programs on two different hosts. This is what I did:
- Run echo_server on a host whose IP address is 192.168.131.1:
  ./echo_server
- Run client on another host whose IP address is 192.168.131.2:
  ./client 192.168.131.1 7780
Meanwhile, I ran tshark on both the interface with IP address 192.168.131.1 and the interface with IP address 192.168.131.2 to monitor the network traffic. I got the following records:
306 14018.662563313 192.168.131.2 -> 192.168.131.1 SCTP 186 INIT
307 14018.662722705 192.168.131.1 -> 192.168.131.2 SCTP 60 ABORT
308 14018.662999216 192.168.32.1 -> 192.168.131.2 SCTP 682 INIT_ACK
What confused me is that 192.168.32.1 belongs to a virtual NIC, and there is no route between it and 192.168.131.2. Why did echo_server switch to a different address to send the INIT_ACK packet, and why did it also send an ABORT packet from the INIT packet's destination address to its source address?
PS: I have changed echo_server's listening port to 7780.
When you have a kernel stack, packets get delivered to that stack. Since the kernel stack has no state for associations handled in userland, it considers their packets to be out of the blue and replies with a packet containing an ABORT chunk.
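The behaviour described here can be illustrated with a toy model. This is only a sketch under simplifying assumptions: each stack is reduced to the list of ports it listens on, and the type and function names are hypothetical. It does not reproduce the lookup code of the Linux kernel or of usrsctp.

```c
#include <stddef.h>
#include <stdint.h>

/* What a stack does with an incoming INIT in this toy model. */
enum verdict { DELIVER, REPLY_ABORT };

/* Hypothetical, simplified view of one SCTP stack: the ports it listens on. */
struct stack {
    const uint16_t *listen_ports;
    size_t n_ports;
};

/* A stack only knows its own listeners. An INIT for any other port is
 * "out of the blue" from its point of view and is answered with an ABORT. */
enum verdict handle_init(const struct stack *st, uint16_t dst_port)
{
    for (size_t i = 0; i < st->n_ports; i++) {
        if (st->listen_ports[i] == dst_port)
            return DELIVER;
    }
    return REPLY_ABORT;
}
```

With a userland stack listening on port 7780 and a kernel stack with no listener on that port, the same INIT yields DELIVER for the first and REPLY_ABORT for the second. That matches a capture in which both an ABORT and an INIT_ACK follow a single INIT.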
Two issues:
- Either there is a second SCTP stack active on the host owning 192.168.131.1, which sends the packet with the ABORT chunk, or there is a middlebox between 192.168.131.2 and 192.168.131.1 that sends the packet with the ABORT chunk.
- The server chooses the first address it thinks it can use. This is a limitation of the userland code: it doesn't know the kernel's routing table...
What does "have a kernel stack" mean? I installed lksctp-tools and lksctp-tools-devel on my server. Does this mean that I have installed a kernel stack? I neither ran any binary after the installation nor inserted any module into the kernel. I guess the Linux kernel supports SCTP by default? In other words, how can I remove the kernel SCTP stack? Simply by uninstalling lksctp-tools and lksctp-tools-devel?
Regarding the two issues:
- As you say, I guess that the Linux kernel SCTP stack received the INIT packet and replied with an ABORT packet. The question then is how I can prove this.
- What does "it thinks it can use" mean? Why does the server choose the source address of the INIT packet as the destination address of the INIT_ACK packet?
Thanks!
If you are using Linux, lsmod lists the kernel modules that are loaded. Does a module with the name sctp show up?
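The same check can be made from a program. The sketch below assumes a Linux host where loaded modules are listed in /proc/modules with the module name as the first field on each line; the function names are made up for illustration. An SCTP stack compiled directly into the kernel rather than built as a module would not appear in this listing.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if a module called `name` appears in a /proc/modules-style
 * listing, where the module name is the first whitespace-separated field. */
bool module_listed(FILE *listing, const char *name)
{
    char line[1024];
    char first[128];

    while (fgets(line, sizeof(line), listing) != NULL) {
        if (sscanf(line, "%127s", first) == 1 && strcmp(first, name) == 0)
            return true;
    }
    return false;
}

/* Convenience wrapper for the running system. Returns false if the file
 * cannot be opened, for example on a non-Linux host. */
bool kernel_module_loaded(const char *name)
{
    FILE *f = fopen("/proc/modules", "r");
    if (f == NULL)
        return false;
    bool found = module_listed(f, name);
    fclose(f);
    return found;
}
```

Calling `kernel_module_loaded("sctp")` then answers the question above. Matching only the first field matters because sctp can also show up in the dependency list of another module, such as libcrc32c.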
Yes, I've just run this command:
lsmod | grep sctp
I got the following output:
sctp 279238 2
libcrc32c 12644 4 xfs,sctp,nf_nat,nf_conntrack
So disabling the kernel SCTP stack means removing the sctp module?
I tried that just now, but got this error:
rmmod: ERROR: Module sctp is in use
I've checked that my kernel client and server weren't running at the time.
I don't have much experience with Linux, but I think you can't unload the sctp module once it is loaded. At least this was true in the past...
Thanks. I've looked up some references and found that removing the sctp module requires the -f option. That didn't work on my server either, so I rebooted it.
Why don't we just use the destination address of the INIT packet as the source address of the INIT_ACK packet? I've looked through the code and found that the INIT_ACK packet's source address is taken from the INIT packet's destination address only in the loopback scope. In all other cases, usrsctp chooses a new source address for the INIT_ACK packet. I don't know why this is better.
My problem still exists: I can't run echo_server and client separately on two servers. They can't even set up an association. I modified the code myself, in usrsctplib\netinet\sctp_output.c, lines 6673 to 6677:
if (stc.loopback_scope) {
    over_addr = (union sctp_sockstore *)dst;
} else {
    over_addr = NULL;
}
I changed this to assign dst to over_addr unconditionally, like this:
over_addr = (union sctp_sockstore *)dst;
I rebuilt everything and tested again. Now the COOKIE_ECHO packet is sent, and the COOKIE_ACK packet is sent too. But no data was exchanged between the two. I saw the COOKIE_ECHO packet being sent repeatedly. It seems that the client didn't recognize the COOKIE_ACK packet and kept retransmitting COOKIE_ECHO until it reached the maximum number of attempts. What might the root cause be? I guess my modification is incomplete. Does this have something to do with the state cookie? Thanks!
The FreeBSD kernel stack uses the IP layer to determine the source address (based on the routing table). The userland stack does not have this functionality.
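For illustration only, here is one way a userland program can ask the kernel which source address its routing table would pick for a given destination. This is a generic sockets technique and an assumption on my part about what could help; it is not something usrsctp does, and the function name is made up. It relies on the fact that connect() on a UDP socket sends no packet and only fixes the socket's local address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel which local IPv4 address it would use to reach `dst`.
 * Returns 0 and writes the dotted-quad address to `out` on success,
 * or -1 on error (bad address, no route, socket failure). */
int source_addr_for(const char *dst, char *out, socklen_t out_len)
{
    struct sockaddr_in remote, local;
    socklen_t local_len = sizeof(local);
    int rc = -1;

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(9);   /* arbitrary port; nothing is sent */
    if (inet_pton(AF_INET, dst, &remote.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() makes the kernel consult its routing table;
     * getsockname() then reveals the local address it selected. */
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &local_len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, out, out_len) != NULL)
        rc = 0;

    close(fd);
    return rc;
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes, for example `source_addr_for("192.168.131.2", buf, sizeof(buf))`, and compare the result with the address the userland stack actually used for its INIT_ACK.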
Regarding the COOKIE-ACK: Can you enable the debug output and get an idea of why it is not accepted? Which IP addresses are used for the handshake?