minhash_spark.py [UNABLE_TO_INFER_SCHEMA]
Closed this issue · 3 comments
When I run minhash_spark.py on a Spark cluster, I occasionally hit an [UNABLE_TO_INFER_SCHEMA] error, as shown in the figure below. I'm not sure whether it's a data problem, since the workers need to copy data across machines. Files that fail will run normally after being re-transferred, but the error can reappear after a while. Could moving or reading the files be affecting Spark? I have now set up an NFS server so that every worker reads identical files, but the problem still occurs. Can you help me figure out where the problem lies?
Thanks for sharing all the details. Could you verify that your checkpoint location is writable by Spark?
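One way to check this before launching the job (a minimal sketch; the path and helper name are illustrative assumptions, not taken from minhash_spark.py) is to probe the checkpoint directory with an actual write, then point Spark at the same location:

```python
import os
import tempfile

def checkpoint_dir_writable(path: str) -> bool:
    """Probe whether a checkpoint directory can be created and written to."""
    try:
        os.makedirs(path, exist_ok=True)
        # Do a real write: os.access() can be misleading on NFS mounts.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False

# Hypothetical path for illustration; substitute your NFS-mounted directory.
ckpt = "/tmp/spark_checkpoints"
print(checkpoint_dir_writable(ckpt))

# If the probe succeeds, configure Spark to use the same directory, e.g.:
# spark.sparkContext.setCheckpointDir(ckpt)
```

On a cluster, the directory must be writable from every worker node (not just the driver), which is why a shared filesystem or HDFS path is usually used here.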
Based on the conversation in the linked issue, it does not seem there is anything I can do to "solve" it beyond checking the checkpoint write access. The distributed, iterative algorithm is the default, and it is the whole reason I chose this approach in the first place, to speed things up.
I will add this issue to a QA section in case anyone encounters the same problem in the future.