rexyai/RestRserve

Benchmark issue

Closed this issue · 3 comments

There is a new article in Polish (to translate it, please use Google Translate, which gives correct results) that compares the speed of three API services: Flask (Python), Plumber and RestRserve. A surprising result is that in both tests RestRserve is more than 8 and 15 times slower than Flask and Plumber. What is the reason for this?

Repository with tests
The RestRserve services are in the start_api_2.R file.
The benchmark is in the test_api.py file.

  1. keep-alive is disabled, which means each request
    • opens a new TCP connection
    • in the case of RestRserve, triggers a fork of a child process, which might take 10-100 ms
  2. all requests are sequential

For a proper benchmark, the author should consider using dedicated benchmarking or load-testing tools (ab, JMeter) and understand what he or she is trying to measure.
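To illustrate the keep-alive point, here is a minimal Python sketch, not the code from the article's repository. It times sequential requests against a throwaway local server, once with a fresh TCP connection per request and once over a single reused connection. The `AliveHandler` class, the `/alive` route as served here, and the helper names are assumptions made for this example. The local server does not fork, so this only isolates connection setup cost, not the fork cost RestRserve pays per connection.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AliveHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a trivial /alive endpoint."""

    protocol_version = "HTTP/1.1"   # lets the server keep connections open
    disable_nagle_algorithm = True  # avoid small-packet delays on reused sockets

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def time_new_connections(host, port, n):
    """Mean ms per request when every request opens its own TCP connection."""
    start = time.perf_counter()
    for _ in range(n):
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", "/alive", headers={"Connection": "close"})
        conn.getresponse().read()
        conn.close()
    return (time.perf_counter() - start) / n * 1000


def time_keep_alive(host, port, n):
    """Mean ms per request when all requests share one persistent connection."""
    start = time.perf_counter()
    conn = http.client.HTTPConnection(host, port)
    for _ in range(n):
        conn.request("GET", "/alive")
        conn.getresponse().read()  # drain the body so the socket can be reused
    conn.close()
    return (time.perf_counter() - start) / n * 1000


def run_demo(n=200):
    """Start a local server on a free port and run both timing loops."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), AliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return time_new_connections(host, port, n), time_keep_alive(host, port, n)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    fresh_ms, reused_ms = run_demo()
    print(f"new connection per request: {fresh_ms:.3f} ms/iter")
    print(f"keep-alive (one connection): {reused_ms:.3f} ms/iter")
```

Both loops are still sequential, so this says nothing about throughput under concurrent load; for that, a tool such as ab or JMeter remains the better choice.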

s-u commented

@dselivanov this is hilarious, especially if you look at the code used to measure it :). It reminds me of the Julia sorting benchmarks where they claimed over 30x speedup compared to R, but if you actually write it the way you would do it in R and measure real time for the task, R is faster than Julia. It's amazing how bogus a benchmark you can get if you don't know anything about the subject (well, or are looking for a way to prove your propaganda... ;)).

I played with the benchmark and the main issue is just incredibly crappy R code. When you simplify it to what any sane person would use (https://gist.github.com/s-u/4fda8f97f2fca6b924fadcbe042cc6b8) the iris case is always faster than flask despite the fork overhead:

1000 iterations
## Rserve
/alive, 1000 iterations, http://localhost:8092: 2480 ms, 2.48 ms/iter
/add, 1000 iterations, http://localhost:8092: 2572 ms, 2.57 ms/iter
/iris, 1000 iterations, http://localhost:8092: 11975 ms, 11.98 ms/iter

## Flask
/alive, 1000 iterations, http://localhost:8090: 1961 ms, 1.96 ms/iter
/add, 1000 iterations, http://localhost:8090: 2021 ms, 2.02 ms/iter
/iris, 1000 iterations, http://localhost:8090: 13800 ms, 13.8 ms/iter

As you can see the fork() overhead is minimal - about 0.5ms - negligible compared to the runtime for any real work. (Note that this is without fixing the way the measurements are done - which should be done for a real benchmark as you pointed out.)
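The "about 0.5ms" figure can be checked from the numbers quoted above. The short Python sketch below only restates that arithmetic; the dictionary and function names are made up for the example. Reading the gap on the trivial endpoints as fork overhead is an interpretation, since it also includes any other per-request difference between the two servers.

```python
# Per-request times in ms/iter, copied from the results quoted above.
rserve_ms = {"/alive": 2.48, "/add": 2.57, "/iris": 11.98}
flask_ms = {"/alive": 1.96, "/add": 2.02, "/iris": 13.8}


def overhead_ms(endpoint):
    """Rserve time minus Flask time for one endpoint, rounded to 0.01 ms."""
    return round(rserve_ms[endpoint] - flask_ms[endpoint], 2)


for endpoint in rserve_ms:
    # Positive means Rserve is slower, negative means Rserve is faster.
    print(f"{endpoint}: Rserve - Flask = {overhead_ms(endpoint):+.2f} ms/iter")
```

On the two trivial endpoints the gap is 0.52 and 0.55 ms/iter, while on /iris Rserve comes out 1.82 ms/iter ahead.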