sorz/sstp-server

Improve performance

sorz opened this issue · 21 comments

sorz commented

Very high CPU usage.
(Bandwidth was around 20 Mbit/s, but CPU usage was over 70% on a Sandy Bridge Xeon.)

outReceived() takes up most of the CPU cycles.

VMprof report. See also #5.

/cc @deba12

sorz commented

Frame (un)escaping, which outReceived() and writeFrame() used to do, has been rewritten as a C extension.
Related commits: 41ae170, d06dc94. VMprof report.

Although I still don't know why those ate so many CPU cycles, it seems to be much better now. 😅

I'm not actually familiar with C.
One segfault occurred during the VMprof tests, but I've failed to reproduce the problem since.
More tests and/or code review may be required.
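For context, the hot path here is PPP's async-HDLC byte stuffing: every frame read from or written to pppd's pty has to be scanned byte by byte. Below is a minimal pure-Python sketch of what the (un)escape step does; it only handles the two mandatory octets (real PPP must also honor the negotiated ACCM for control characters), and the function names are illustrative, not the C extension's actual API:

```python
FLAG = 0x7E     # frame delimiter octet
ESCAPE = 0x7D   # escape octet; the following byte is XOR'ed with 0x20

def escape(frame: bytes) -> bytes:
    """Wrap one PPP frame for the pty: escape FLAG/ESCAPE octets."""
    out = bytearray([FLAG])
    for b in frame:
        if b in (FLAG, ESCAPE):
            out += bytes([ESCAPE, b ^ 0x20])
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)

def unescape(data: bytes) -> bytes:
    """Reverse of escape() for a single flag-delimited frame."""
    out = bytearray()
    esc = False
    for b in data:
        if b == FLAG:
            continue          # frame boundary, not payload
        if esc:
            out.append(b ^ 0x20)
            esc = False
        elif b == ESCAPE:
            esc = True
        else:
            out.append(b)
    return bytes(out)
```

A per-byte Python loop like this allocates and branches for every octet, which is why moving it into C (or PyPy's JIT warming up on it) makes such a large difference.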

Impressive :)
Unfortunately I was only able to test it with the PyPy JIT, but performance is much, much better anyway.
On my test system (AMD A4-4000), before those commits I was able to do 1.5 MB/s with CPU usage around 60%; now I am able to get 1.9 MB/s with CPU usage lower than 40%.
When I get to my office I will test with a Sandy Bridge Xeon to see performance with the CPython interpreter.

I spoke too fast...
There is a memory leak somewhere:
within 10 minutes sstpd eats around 2 GB of RAM :) Maybe that's why you experienced segfaults.

sorz commented

(One of the?) memory leaks fixed in a1faea6.
Thanks :)

sorz commented

Fixed another one in 94eb5b6.

I think there are no more memory leaks :)
Let's focus on performance. I have friends who want to use it in real production
if performance increases two- or threefold ;)

Performance on an Ivy Bridge Xeon under KVM, with a realistic latency of about 50 ms:
Python 2: 1.3 MB/s, with CPU usage about 30-35% and peaks around 45%
PyPy JIT: 1.0 MB/s, with CPU usage about 15-19% and peaks around 22%
The client is Windows 7 under VMware.
I can't understand why the PyPy JIT is unable to do 1.3 MB/s like the Python 2 interpreter?
Anyway, if you have something more to test, I am eager to do it :)

Hi,
I would be very interested in getting SSTP working, and I also think this performance is rather poor ^^. Couldn't we write more of the code in C/C++?

What I was thinking is that someone has to write a kernel module for PPP over SSL (looking at you, sorz :) )
which will handle the data connection, alongside a Python daemon which handles just the control connection.
Then we would have optimal performance.

Maybe I am too optimistic.
Just porting the whole thing to C with epoll would be sufficient for now.

I was going through the code and it didn't seem to be that much, so it shouldn't be a problem to replace most of it with C/C++ code. If there is a good architecture description, it should be easy to rewrite this in C.
I think a kernel module is not necessary because it would not make things much faster.

sorz commented

It sounds great but I have little experience in C. Sorry.
Microsoft's PDF has sufficient details about this protocol and how to implement it.

@davidweisgerber I think you should look at how openl2tp was made:
http://www.openl2tp.org/downloads

pppd plugin
openl2tp daemon

What could be interesting is developing a module for an HTTP(S) server like nginx, because the infrastructure for handling HTTP(S) is already there. I just have to dig into how PPP works on Linux and how it interacts with userland applications.
That's why I would prefer a C/C++ rewrite of sstp-server.

The main problem with servers like nginx is process spawning.
For every connection you have to spawn a pppd process, which will handle almost everything.
There are also some corner cases, such as crypto binding; you would have to build an SSTP frame parser into nginx to handle them :)

@davidweisgerber any progress?

@davidweisgerber try with stunnel in front and sstpd --no-ssl.
I think it is much better :)
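For anyone trying that, a minimal stunnel config for this setup might look like the sketch below; the accept port, certificate path, and the local sstpd port (8000) are assumptions for illustration, not values from this thread:

```ini
; terminate TLS in stunnel, forward plaintext to sstpd --no-ssl
[sstp]
accept  = 443
cert    = /etc/stunnel/sstp.pem
connect = 127.0.0.1:8000
```

This takes the TLS work out of the Python process entirely, so the daemon only has to shuffle plaintext frames.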

@sorz I think that Twisted is too slow for our case :(
Take a look at
https://magic.io/blog/uvloop-blazing-fast-python-networking/

But it requires Python 3.5 (not that bad...)
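For what it's worth, trying uvloop is just an event-loop-policy swap. The sketch below assumes an asyncio-based server (not the current Twisted code) and falls back to the stock loop when uvloop isn't installed:

```python
import asyncio

try:
    import uvloop  # libuv-based drop-in event loop, Python 3.5+ only
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
except ImportError:
    uvloop = None  # stock asyncio event loop is used instead

async def main() -> str:
    # stand-in for the server's real coroutines
    await asyncio.sleep(0)
    return "running on " + ("uvloop" if uvloop else "asyncio")

# new_event_loop() goes through the installed policy, so it picks
# up uvloop automatically when available
loop = asyncio.new_event_loop()
result = loop.run_until_complete(main())
loop.close()
```

Because the policy swap happens once at startup, the rest of the server code stays unchanged.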

sorz commented

Just fixed two bugs that caused random packet corruption.
They affected TCP performance dramatically (from ~3 Mbps to ~50 Mbps after the fix, in my case).
Sorry for these bugs. qwq

@sorz Great, I will test them later today :) and report back with numbers.

@sorz Yep, I think performance is much better; I can easily achieve 5-6 MB/s over Wi-Fi.
With some older versions I had great success with stunnel in front of sstpd; I will try it that way later this week to see what happens.

It turns out that using nginx in front of sstpd is quite easy.
I still haven't tested performance.
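I haven't confirmed how the setup above was done, but one plausible arrangement is to terminate TLS in nginx's stream module and pass the raw connection to sstpd running with --no-ssl; everything below (ports, certificate paths, the 8000 backend) is an assumption for illustration:

```nginx
# nginx.conf fragment: TLS termination in front of sstpd --no-ssl
stream {
    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/sstp.crt;
        ssl_certificate_key /etc/nginx/sstp.key;
        proxy_pass          127.0.0.1:8000;
    }
}
```

Like the stunnel approach, this avoids the per-connection pppd-spawning concern raised earlier, since nginx only proxies bytes and sstpd still manages pppd.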