zalopay-oss/jmeter-grpc-request

Feature Request: Option to specify Channel Shutdown timeout

steverawlins-zebra opened this issue · 11 comments

Discussed in #121

Originally posted by steverawlins-zebra June 15, 2022
We have discovered, during high-volume testing (more than 1000 messages per second), CHANNEL SHUTDOWN problems within jmeter-grpc-request. Furthermore, once a CHANNEL SHUTDOWN triggers, the remainder of the test is useless - when we hit a problem, it's "game over" and what follows is nothing but a stream of CHANNEL SHUTDOWNs.

Please add a configuration option that lets us increase the channel-shutdown timeout from the hard-coded 5 seconds to a larger value, such as 60 seconds.

Thank you

P.S. The basis for our idea comes from examining the code in jmeter-grpc-request/src/main/java/vn/zalopay/benchmark/core

    public void shutdownNettyChannel() {
        try {
            if (channel != null) {
                channel.shutdown();
                channel.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            throw new RuntimeException("Caught exception while shutting down channel", e);
        }
    }
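
For illustration, the requested option could look roughly like the sketch below. This is not the plugin's code. The class name ConfigurableShutdown and the property name grpc.channel.shutdown.timeout.seconds are hypothetical, and a java.util.concurrent.ExecutorService stands in for the gRPC channel, since it offers analogous shutdown, awaitTermination, and shutdownNow methods and keeps the example runnable without the gRPC dependency.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConfigurableShutdown {
    // Hypothetical property name; the plugin does not define it.
    public static final String TIMEOUT_PROPERTY = "grpc.channel.shutdown.timeout.seconds";
    // Matches the value that is hard-coded in the snippet above.
    public static final long DEFAULT_TIMEOUT_SECONDS = 5;

    // Reads the timeout from a system property; falls back to the default
    // when the property is missing, not a number, or not positive.
    public static long resolveTimeoutSeconds() {
        String raw = System.getProperty(TIMEOUT_PROPERTY);
        if (raw == null) {
            return DEFAULT_TIMEOUT_SECONDS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_TIMEOUT_SECONDS;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT_SECONDS;
        }
    }

    // Requests a graceful shutdown and waits up to timeoutSeconds.
    // Forces shutdown if the wait expires or the thread is interrupted.
    // Returns true only if termination completed within the timeout.
    public static boolean shutdown(ExecutorService channel, long timeoutSeconds) {
        if (channel == null) {
            return true;
        }
        channel.shutdown();
        try {
            if (channel.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                return true;
            }
            channel.shutdownNow();
            return false;
        } catch (InterruptedException e) {
            channel.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        System.setProperty(TIMEOUT_PROPERTY, "60");
        ExecutorService channel = Executors.newSingleThreadExecutor(); // stand-in channel
        boolean clean = shutdown(channel, resolveTimeoutSeconds());
        System.out.println("terminated cleanly: " + clean);
    }
}
```

In a real sampler the value would more likely come from a field in the sampler's GUI or a JMeter property than from a JVM system property; the system property is used here only to keep the sketch self-contained.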

May I ask whether you are using a for-loop with the gRPC sampler? This issue can currently occur in that setup: once the channel is shut down, it can't send any more requests. Please also provide a sample JMeter test plan that reproduces the issue, for debugging purposes :D

Here is our test plan, scrubbed of secrets and renamed from .jmx to .txt:

ForZalopay.txt

@minhhoangvn Thanks for this feature!
I wanted some information about the channel implementation. Are channels reused, or is a separate channel created for each request or concurrent user?

Hi @alphadeepak, a channel is created for each thread (concurrent user).

Hi @minhhoangvn, we set the channel await-termination time to 20 minutes and still get the channel shutdown issue. Can you suggest anything?

The test plan has 200 users and a duration of 30 minutes. We have observed that channel shutdown is invoked once more than 150 users have ramped up.

Please note that this is the same test plan shared by @steverawlins-zebra.

@Renu-Lamba, currently the sampler invokes shutdownNettyChannel in the threadFinished event, each time a virtual user has completed its job, so the exception surfaces as a warning when the JMeter engine tries to shut down the nettyChannel. You can monitor your service to see whether the performance test achieves its target RPS. Make sure the deadline timeout is greater than the timeout in your gRPC server: if the deadline is reached before the response data arrives from the server, the sampler stops the thread and shuts down the nettyChannel, and the channel can throw the exception "Caught exception while shutting down channel".
Note: the gRPC sampler creates a new nettyChannel each time a thread starts and shuts that channel down when the thread has finished.
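
The per-thread lifecycle described above can be pictured roughly as in the sketch below. It is a simplified illustration under stated assumptions, not the sampler's actual implementation: the class PerThreadChannelSketch and its sample and isClosed methods are hypothetical, and a java.util.concurrent.ExecutorService stands in for the nettyChannel so the example runs without the gRPC dependency. Only the threadStarted and threadFinished hook names and the 5-second wait come from this thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one channel per virtual user, opened when the
// thread starts and closed when the thread finishes.
class PerThreadChannelSketch {
    private ExecutorService channel; // stand-in for the per-thread nettyChannel

    // Called once when the virtual user's thread begins.
    public void threadStarted() {
        channel = Executors.newSingleThreadExecutor();
    }

    // Called for every sample; reuses the channel owned by this thread.
    public String sample(String request) throws Exception {
        return channel.submit(() -> "echo:" + request).get(1, TimeUnit.SECONDS);
    }

    // Called once when the virtual user has completed its job.
    public void threadFinished() throws InterruptedException {
        if (channel != null) {
            channel.shutdown();
            channel.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    // True once the channel for this thread has been shut down.
    public boolean isClosed() {
        return channel != null && channel.isShutdown();
    }

    public static void main(String[] args) throws Exception {
        PerThreadChannelSketch user = new PerThreadChannelSketch();
        user.threadStarted();
        System.out.println(user.sample("a")); // both samples share one channel
        System.out.println(user.sample("b"));
        user.threadFinished();
        System.out.println("closed: " + user.isClosed());
    }
}
```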

@minhhoangvn thanks for sharing the info.

We keep getting the "GRPC Request - 500 : Exception: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked" error when the number of users is above 100; this issue never happens with fewer than 100 users.

Can you please suggest why this is happening? We have set a generous value for the channel shutdown timeout, and the deadline timeout is 20 minutes.

@Renu-Lamba, the error is related to your backend reaching the maximum number of connections it can establish. The gRPC sampler reports status code 500 (the response time is too long because the server can't establish a new connection) and closes the channel to start a new interaction for the current virtual user. It would be best to have a dashboard that monitors your backend service, with key metrics you can use to analyze current active connections, throughput (RPS/TPS), and p99 latency. I think we can close this issue, because this should be out of the scope of the gRPC JMeter sampler.

@minhhoangvn, thanks for all the help and suggestions, much appreciated.

@minhhoangvn I am also facing the same issue. I am using only a single user, and once a request fails with a valid error, all subsequent requests fail with the response "Channel shutdown invoked".
If I stop and restart the script, it works again, and then fails again after the first failure from the server side.
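
The pattern reported here is consistent with a channel that has been shut down and is then reused: a shut-down channel rejects every later call until a new channel is created. The sketch below illustrates that failure mode and one possible guard. It is only an illustration under stated assumptions, not the sampler's code or a confirmed fix: the class ChannelRecreationSketch and all of its methods are hypothetical, and a java.util.concurrent.ExecutorService stands in for the gRPC channel (its RejectedExecutionException plays the role of "Channel shutdown invoked").

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: replace a shut-down channel before sending again.
class ChannelRecreationSketch {
    private ExecutorService channel = Executors.newSingleThreadExecutor();
    private int channelsCreated = 1;

    // Simulates the channel being closed after a failed request.
    public void closeAfterFailure() {
        channel.shutdownNow();
    }

    // No guard: throws RejectedExecutionException once the channel is shut down.
    public String sendWithoutGuard(String request) throws Exception {
        return channel.submit(() -> "ok:" + request).get(1, TimeUnit.SECONDS);
    }

    // Guard: create a fresh channel if the current one was shut down.
    public String sendWithGuard(String request) throws Exception {
        if (channel.isShutdown()) {
            channel = Executors.newSingleThreadExecutor();
            channelsCreated++;
        }
        return sendWithoutGuard(request);
    }

    public int channelsCreated() {
        return channelsCreated;
    }

    // Releases the current channel at the end of the run.
    public void close() {
        channel.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        ChannelRecreationSketch sampler = new ChannelRecreationSketch();
        System.out.println(sampler.sendWithGuard("1"));
        sampler.closeAfterFailure();
        try {
            sampler.sendWithoutGuard("2");
        } catch (RejectedExecutionException e) {
            System.out.println("rejected: channel already shut down");
        }
        System.out.println(sampler.sendWithGuard("3")); // works on a new channel
        sampler.close();
    }
}
```

Whether the sampler should recreate the channel like this, or keep the existing channel open after a failed request, is a design decision for the maintainers.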