spring-attic/spring-cloud-gcp

Huge amount of ScheduledFutureTask in memory in case pubsub.googleapis.com isn't available

AcridFantasy opened this issue · 7 comments

Describe the bug
If pubsub.googleapis.com is unavailable, millions of io.grpc.netty.shaded.io.netty.util.concurrent.ScheduledFutureTask objects are instantiated. They consume a large amount of memory and can cause an OutOfMemoryError and a service outage. The issue reproduces within 10 minutes of application start; the only prerequisite is that pubsub.googleapis.com is unreachable. We originally found it in our Docker environment, but I was able to reproduce it by simply adding this line to my /etc/hosts file:
127.0.0.2 pubsub.googleapis.com
Also, in our case, even a full GC doesn't clean up many of these objects. For some reason the sample project behaves differently, but it still ends in an OutOfMemoryError after some time.
This case is quite important for us because it can cause a service outage (due to OOM) for all our service instances using PubSub. Fortunately, DNS failures are rare, but they can still happen and affect a lot of our users.
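To confirm that the /etc/hosts override is actually in effect before starting the application, a small check like the following may help. This is only a sketch: the class and method names (`HostsCheck`, `resolvesToLoopback`) are made up for illustration and are not part of the sample project. It assumes the JVM uses the system resolver, which consults the hosts file by default.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the attached sample project.
final class HostsCheck {

    /** Returns true if the host resolves to a loopback address (127.0.0.0/8 or ::1). */
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            // The name could not be resolved at all, so it is not the hosts override.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "pubsub.googleapis.com";
        System.out.println(host + " resolves to loopback: " + resolvesToLoopback(host));
    }
}
```

If the check reports `false` after the hosts file was edited, the lookup is probably being answered from somewhere else (for example a cached result), and the reproduction may not trigger.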

Screenshots from VisualVM and some extra information are attached.
heap_histo
mem_history
per_thread_allocation
memory_issue_gcp.nps.zip
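The heap growth shown in the attachments can also be observed without a profiler by logging heap usage from inside the affected JVM. The sketch below uses the standard `MemoryMXBean`. The class name `HeapWatch`, the sample count, and the interval are illustrative assumptions, not something taken from the sample project.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for watching heap growth; names are illustrative.
final class HeapWatch {

    /** Currently used heap in MiB, as reported by the JVM's MemoryMXBean. */
    static long usedHeapMiB() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / (1024L * 1024L);
    }

    /** Prints the used heap `samples` times, sleeping `intervalMillis` between samples. */
    static void watch(int samples, long intervalMillis) {
        for (int i = 0; i < samples; i++) {
            System.out.println("used heap (MiB): " + usedHeapMiB());
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop sampling.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

Calling `HeapWatch.watch(60, 10_000)` from a daemon thread in the application under test would print one sample every 10 seconds for roughly the 10-minute window mentioned above. A steadily rising value that does not drop after garbage collection would be consistent with the behaviour described in this report.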

Sample
I attached a zip archive with a simple project that reproduces the issue. Import it into your IDE and run the main method in com.example.gcp_report.Main.kt. Please don't forget to update your hosts file as described above before running it.
gcp_report.zip

Please let me know if there are any questions.

Do you experience the same issue with the 2.x version?
See: https://github.com/GoogleCloudPlatform/spring-cloud-gcp

No, we haven't tried yet, should we?

Yes. The 1.x version is outside of the support period now.

Ok, I'll try during working time.

If it reproduces, should I re-create the ticket in the repo you mentioned?

Thanks for the information.

GoogleCloudPlatform/spring-cloud-gcp#614 - the same issue reproduces with version 2.0.4 and has been logged in the new repo.
If there are no plans to fix this issue for the version I used here, then this issue can be closed.

Thanks

I'll close this one in favor of GoogleCloudPlatform/spring-cloud-gcp#614