Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist:

Question

TinNGuyenntt opened this issue 2 years ago · 3 comments

Hi, I am new to Hadoop. When I run your code, I get the error below. The Word Count example from the MapReduce tutorial (https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Partitioner) runs normally, but your code fails when I run the command hadoop jar Bank_Transfers.jar Bank_Transfers. Please help me understand what is wrong. Thanks, and have a good day.

2023-03-18 00:45:31,378 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /127.0.0.1:8032
2023-03-18 00:45:31,626 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2023-03-18 00:45:31,671 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/bigdata/.staging/job_1679072275142_0004
2023-03-18 00:45:31,946 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/bigdata/.staging/job_1679072275142_0004
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/bigdata/bank_dataset
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:340)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:279)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:404)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1571)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1568)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1568)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1589)
	at Bank_Transfers.main(Bank_Transfers.java:113)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.io.IOException: Input path does not exist: hdfs://localhost:9000/user/bigdata/bank_dataset
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
	... 19 more

Answer 1 · 2023-03-17T19:04:42.000Z

Hello.

It appears that your setup cannot find your input data in HDFS.

"Input path does not exist: hdfs://localhost:9000/user/bigdata/bank_dataset"

The job is looking for a "bank_dataset" path under /user/bigdata in HDFS, but the only argument in the command you mention is "Bank_Transfers".

Try locating your input file in HDFS manually, then check how the code uses its input path argument.
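
To make the path in the error message easier to read, here is a minimal sketch of how a relative input path ends up qualified against the user's HDFS home directory. It uses only java.net.URI, not the Hadoop API. The class name InputPathSketch and the helper qualify are hypothetical, and it is an assumption that the job refers to the input as the relative path "bank_dataset" (the driver source is not shown in this thread). Note also that in hadoop jar Bank_Transfers.jar Bank_Transfers the second token is most likely read as the main class name, since the stack trace shows Bank_Transfers.main, so no input path is actually being passed.

```java
import java.net.URI;

public class InputPathSketch {
    // Hypothetical helper (not part of Hadoop): mimics how a relative input
    // path is qualified against the HDFS home directory, assumed to be
    // <defaultFs>/user/<user>/.
    public static URI qualify(String defaultFs, String user, String inputPath) {
        URI home = URI.create(defaultFs + "/user/" + user + "/");
        // resolve() appends a relative path to the base and keeps an
        // absolute path (one starting with "/") rooted at the filesystem.
        return home.resolve(inputPath);
    }

    public static void main(String[] args) {
        // Assumption: the job falls back to the relative path "bank_dataset"
        // when no input argument is given on the command line.
        String input = args.length > 0 ? args[0] : "bank_dataset";
        System.out.println(qualify("hdfs://localhost:9000", "bigdata", input));
    }
}
```

Run with no arguments, this prints hdfs://localhost:9000/user/bigdata/bank_dataset, the same location named in the exception. If the dataset lives elsewhere in HDFS, either copy it to that location or give the job an absolute path.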

Answer 2 · 2023-03-17T19:22:45.000Z

Hi,

I now understand how to fix this problem. Thanks for your help.

Have a good day.

Answer 3 · 2023-03-18T20:23:53.000Z

Glad I could help.