Data pod is restarting with fatal error: java.lang.NoSuchMethodError: 'java.util.List java.util.stream.Stream.toList()' [BUG]
skumarp7 opened this issue · 3 comments
Hi,
I'm using OpenSearch version 2.9.
We have enabled cross-cluster replication with security enabled, and we are trying to delete indices from the follower cluster after stopping replication for each index. The data pod restarts with the error below. The sequence is:
1. Install the leader and follower as Helm charts in two different k8s clusters (1 data, 1 cluster_manager).
2. Indices matching the pattern "test*" are replicated from leader to follower on a regular basis using the autofollow API.
3. On the follower, we try to delete the replicated indices:
   a) Stop the replication using the stop API.
   b) Delete the indices using curator, configured to delete all indices with the prefix test.
4. We see the error below when it tries to delete the index, and the data pod restarts.
{"type":"log","level":"INFO","time": "2023-09-20T11:04:32.271Z","logger":"o.o.r.t.i.IndexReplicationTask","marker":"[sa-indexsearch-data-0] [test-2023.09.20] ","log":{"message":"In restoring state for test-2023.09.20"}}
{"type":"log","level":"INFO","time": "2023-09-20T11:04:32.294Z","logger":"o.o.r.t.i.IndexReplicationTask","marker":"[sa-indexsearch-data-0] [test-2023.09.20] ","log":{"message":"Verifying task details - currentTask={isAssigned=true,executorNode=_IVFFZnnQv6HCTJ3hIAS1w}"}}
{"type":"log","level":"INFO","time": "2023-09-20T11:04:32.296Z","logger":"o.o.r.t.i.IndexReplicationTask","marker":"[sa-indexsearch-data-0] [test-2023.09.20] ","log":{"message":"Replication stopped before restore could finish, so removing partial restore.."}}
{"type":"log","level":"INFO","time": "2023-09-20T11:04:32.305Z","logger":"o.o.r.s.RemoteClusterRetentionLeaseHelper","marker":"[sa-indexsearch-data-0] ","log":{"message":"Removed retention lease with id - replication:sanjay-sa:V4SSQstZRvO_JQUYbKwADg:[test-2023.09.20][0]"}}
{"type":"log","level":"INFO","systemid":"BSSC-1234","system":"BSSC","time": "2023-09-20T11:04:32.305Z","logger":"o.o.r.t.i.IndexReplicationTask","timezone":"UTC","marker":"[sa-indexsearch-data-0] [test-2023.09.20] ","log":{"message":"Deleting the index test-2023.09.20"}}
{"type":"log","level":"ERROR","time": "2023-09-20T11:04:32.369Z","logger":"o.o.b.OpenSearchUncaughtExceptionHandler",,"marker":"[sa-indexsearch-data-0] ","log":{"message":"fatal error in thread [opensearch[sa-indexsearch-data-0][replication_follower][T#1]], exiting"}}
java.lang.NoSuchMethodError: 'java.util.List java.util.stream.Stream.toList()'
    at org.opensearch.replication.task.index.IndexReplicationTask.doesValidIndexExists(IndexReplicationTask.kt:894) ~[opensearch-cross-cluster-replication-2.9.0.0.jar:2.9.0.0]
    at org.opensearch.replication.task.index.IndexReplicationTask.waitForRestore(IndexReplicationTask.kt:860) ~[opensearch-cross-cluster-replication-2.9.0.0.jar:2.9.0.0]
    at org.opensearch.replication.task.index.IndexReplicationTask.execute$suspendImpl(IndexReplicationTask.kt:189) ~[opensearch-cross-cluster-replication-2.9.0.0.jar:2.9.0.0]
    at org.opensearch.replication.task.index.IndexReplicationTask$execute$1.invokeSuspend(IndexReplicationTask.kt) ~[opensearch-cross-cluster-replication-2.9.0.0.jar:2.9.0.0]
    at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-1.6.0.jar:1.6.0-release-798(1.6.0)]
    at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:849) [opensearch-2.9.0.jar:2.9.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
fatal error in thread [opensearch[sa-indexsearch-data-0][replication_follower][T#1]], exiting
java.lang.NoSuchMethodError: 'java.util.List java.util.stream.Stream.toList()'
    at org.opensearch.replication.task.index.IndexReplicationTask.doesValidIndexExists(IndexReplicationTask.kt:894)
    at org.opensearch.replication.task.index.IndexReplicationTask.waitForRestore(IndexReplicationTask.kt:860)
    at org.opensearch.replication.task.index.IndexReplicationTask.execute$suspendImpl(IndexReplicationTask.kt:189)
    at org.opensearch.replication.task.index.IndexReplicationTask$execute$1.invokeSuspend(IndexReplicationTask.kt)
    at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
    at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:849)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
The error above occurred while deleting indices: the pod restarted abruptly during deletion of the test-2023.09.20 index, after a few other test* indices had already been deleted successfully.
The expectation is that the pod should not abruptly stop and restart with a NoSuchMethodError.
What is the root cause of this scenario, and what is its impact? What happens after this restart? Will there be any issue with replication?
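For reference, the "stop replication" step in the sequence above could be issued from Java roughly as follows. This is only a sketch: the base URL is a placeholder, authentication and TLS settings for a security-enabled cluster are omitted, and the class and method names are hypothetical. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class StopReplicationRequest {
    // Builds (but does not send) a stop-replication request for one follower index.
    // followerBaseUrl is a placeholder such as "https://localhost:9200" (assumption).
    static HttpRequest build(String followerBaseUrl, String index) {
        return HttpRequest.newBuilder()
                .uri(URI.create(followerBaseUrl + "/_plugins/_replication/" + index + "/_stop"))
                .header("Content-Type", "application/json")
                // The stop API takes an empty JSON object as its body.
                .POST(HttpRequest.BodyPublishers.ofString("{}"))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = build("https://localhost:9200", "test-2023.09.20");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

Sending it would additionally need a java.net.http.HttpClient configured with the cluster's credentials and certificates, which depend on the deployment.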
Hi @skumarp7, can you confirm whether you are compiling the plugin locally?
If yes, can you provide the JDK version that you are using?
Hi @monusingh-1,
I'm installing the plugin on top of the OpenSearch-delivered RPM as is (opensearch-2.7.0).
JDK version: JDK 11
Based on the error message, it does look like an issue with the JDK, where it is not able to resolve java.util.stream.Stream.toList(). Can you verify again that JDK 11 is getting picked up?
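One way to check which JVM is actually in use is a small standalone program run with the same java binary that starts the node. Stream.toList() was added to the JDK in Java 16, so a Java 11 runtime would be expected to report it as missing. The class and method names below are hypothetical and the snippet is only a diagnostic sketch, not part of the plugin.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamToListCheck {
    // True when the running JVM's Stream interface declares toList() (present from Java 16).
    static boolean hasStreamToList() {
        try {
            Stream.class.getMethod("toList");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Collects a stream into a list using an API that also exists on older JDKs (Java 8+).
    static List<String> collectCompat(Stream<String> s) {
        return s.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("Stream.toList() available = " + hasStreamToList());
        System.out.println(collectCompat(Stream.of("a", "b")));
    }
}
```

If the check reports the method as unavailable, code compiled against a newer JDK that calls Stream.toList() would fail on that JVM with the same NoSuchMethodError seen in the log.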