apache/accumulo

Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForFateOperation


What is the root cause?
Hello,

We are using Accumulo version 2.0.1 in a Kubernetes (K8s) environment. The setup has 2 master pods, 2 namenode pods, 2 datanode pods, 5 tserver pods, and 3 ZooKeeper pods.

I am trying to import a table from another on-prem instance of Accumulo. I copied the data files for the table out of the remote Accumulo/HDFS instance using distcp.
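For reference, the copy was roughly along these lines (the namenode hostnames and paths here are hypothetical, not the actual ones used):

hadoop distcp hdfs://onprem-nn:8020/accumulo-export/sometable hdfs://k8s-nn:8020/data/sometable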

I moved the ZIP file exportMetadata.zip out of this folder; however, the behavior is the same regardless. The copied directory contains the RFiles:

-rw-r--r-- 1 accumulo accumulo  905595479 Feb 15 13:44 A001dd51.rf
-rw-r--r-- 1 accumulo accumulo 1004729909 Feb 15 13:45 A001d7g7.rf
-rw-r--r-- 1 accumulo accumulo  701917993 Feb 15 13:45 A001dar1.rf
-rw-r--r-- 1 accumulo accumulo  662897779 Feb 15 13:45 A001dcwm.rf
...

I am trying to run the following commands:

root@accumulo> createtable sometable
...
root@accumulo sometable> importdirectory <absolute directory path> true

I have granted the root user all permissions on the table, but I still see the exception stack trace below.
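For reference, the grants were along the lines of the following shell commands (reconstructed, not verbatim; bulk import specifically requires the Table.BULK_IMPORT permission):

root@accumulo> grant Table.READ -t sometable -u root
root@accumulo> grant Table.WRITE -t sometable -u root
root@accumulo> grant Table.BULK_IMPORT -t sometable -u root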

...

2024-02-26 15:39:32,563 DEBUG [pool-10-thread-8] [org.apache.accumulo.core.file.rfile.bcfile.Compression]: Returned a decompressor: 1007612942
2024-02-26 15:39:32,564 DEBUG [pool-10-thread-8] [org.apache.hadoop.io.compress.CodecPool]: Got recycled decompressor
2024-02-26 15:39:32,564 DEBUG [pool-10-thread-8] [org.apache.accumulo.core.file.rfile.bcfile.Compression]: Got a decompressor: 1007612942
2024-02-26 15:39:32,564 DEBUG [pool-10-thread-8] [org.apache.accumulo.core.file.rfile.bcfile.Compression]: Returned a decompressor: 222286175
2024-02-26 15:39:32,564 DEBUG [pool-10-thread-8] [org.apache.accumulo.core.file.rfile.bcfile.Compression]: Returned a decompressor: 1346961194
2024-02-26 15:39:32,564 DEBUG [pool-10-thread-8] [org.apache.accumulo.core.file.rfile.bcfile.Compression]: Returned a decompressor: 1007612942
2024-02-26 15:39:33,078 ERROR [shell] [org.apache.accumulo.shell.Shell]: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
2024-02-26 15:39:33,078 DEBUG [shell] [org.apache.accumulo.shell.Shell]: org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:388)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:342)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doBulkFateOperation(TableOperationsImpl.java:329)
        at org.apache.accumulo.core.clientImpl.bulk.BulkImport.load(BulkImport.java:142)
        at org.apache.accumulo.shell.commands.ImportDirectoryCommand.execute(ImportDirectoryCommand.java:52)
        at org.apache.accumulo.shell.Shell.execCommand(Shell.java:771)
        at org.apache.accumulo.shell.Shell.start(Shell.java:602)
        at org.apache.accumulo.shell.Shell.execute(Shell.java:517)
        at org.apache.accumulo.start.Main.lambda$execKeyword$0(Main.java:129)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForFateOperation
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
        at org.apache.accumulo.core.master.thrift.FateService$Client.recv_waitForFateOperation(FateService.java:155)
        at org.apache.accumulo.core.master.thrift.FateService$Client.waitForFateOperation(FateService.java:140)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.waitForFateOperation(TableOperationsImpl.java:292)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:358)
        ... 9 more

I do not see any related entries in the ZooKeeper, namenode, or tserver logs. So what could be the problem?

Note that listing the Accumulo instance, volumes, and tservers all works fine, and creating a table works fine.

Thanks in advance.

Are the import directory and the import files somewhere in HDFS under the Accumulo namespace, maybe something like /accumulo/infiles? The import does an HDFS rename from the import directory to a directory under /accumulo/tables/[TID]/xxxx; it does not copy the files. That's why they need to be in the same HDFS namespace (a quick check is sketched below).
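A quick way to check (generic commands; the Accumulo volume location comes from instance.volumes in your configuration): both the import directory and the Accumulo volume must resolve to the same HDFS scheme and authority, because a rename cannot cross filesystems.

hdfs getconf -confKey fs.defaultFS     # namespace of the local HDFS
hdfs dfs -ls /accumulo                 # the Accumulo volume root
hdfs dfs -ls /accumulo/infiles         # the staging dir, inside the same namespace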

No, they are on a separate filesystem. What if I provide the directory path as file://<absolute directory path>?

I see the same issue when I use the importtable command; I presume the underlying problem is the same.

Yes, it would be the same problem. The root cause is that any bulk import does an HDFS move, which is a metadata-only operation. No file copy is performed, so the files must be in a namespace that includes /accumulo so that the move can succeed.
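A sketch of the fix, with hypothetical namenode hostnames and staging paths: first copy the RFiles into the same HDFS namespace that holds the Accumulo volume, then point importdirectory at that staging directory, so the final move is a cheap metadata rename.

hadoop distcp hdfs://other-nn:8020/data/sometable hdfs://accumulo-nn:8020/accumulo/infiles/sometable

root@accumulo sometable> importdirectory /accumulo/infiles/sometable true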

That worked, thanks!