spring-attic/spring-hadoop-samples

Windows Client Support?


Does Spring Hadoop support running from a Windows client? I assume it does, since I see Windows-specific batch files in the MapReduce example.

When I build and run on a Windows client, connecting to my cluster, it fails. First it reports that it can't load the native libraries; then it submits the job, which fails after that.

11:40:41,919  INFO t.support.ClassPathXmlApplicationContext: 510 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@659297ab: startup date [Tue Feb 11 11:40:41 EST 2014]; root of context hierarchy
11:40:42,176  INFO eans.factory.xml.XmlBeanDefinitionReader: 315 - Loading XML bean definitions from class path resource [META-INF/spring/application-context.xml]
11:40:42,895  INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hadoop.properties]
11:40:42,922  INFO ctory.support.DefaultListableBeanFactory: 596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
11:40:43,166  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:44,706 ERROR             org.apache.hadoop.util.Shell: 303 - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.conf.Configuration.getTrimmedStrings(Configuration.java:1546)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:519)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:466)
    at org.springframework.data.hadoop.mapreduce.JobFactoryBean.afterPropertiesSet(JobFactoryBean.java:208)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1547)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1485)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:608)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
11:40:45,142  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,245  INFO ramework.data.hadoop.mapreduce.JobRunner: 192 - Starting job [wordcountJob]
11:40:45,302  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,971  WARN org.apache.hadoop.mapreduce.JobSubmitter: 258 - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
11:40:46,080  INFO doop.mapreduce.lib.input.FileInputFormat: 287 - Total input paths to process : 1
11:40:46,422  INFO org.apache.hadoop.mapreduce.JobSubmitter: 394 - number of splits:1
11:40:46,441  INFO he.hadoop.conf.Configuration.deprecation: 840 - user.name is deprecated. Instead, use mapreduce.job.user.name
11:40:46,442  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.name is deprecated. Instead, use mapreduce.job.name
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
11:40:46,820  INFO org.apache.hadoop.mapreduce.JobSubmitter: 477 - Submitting tokens for job: job_1391711633872_0022
11:40:47,127  INFO      org.apache.hadoop.mapred.YARNRunner: 368 - Job jar is not present. Not adding any jar to the list of resources.
11:40:47,225  INFO doop.yarn.client.api.impl.YarnClientImpl: 174 - Submitted application application_1391711633872_0022 to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:47,291  INFO          org.apache.hadoop.mapreduce.Job:1272 - The url to track the job: http://http://hd-dn-01.grcrtp.local:8088/proxy/application_1391711633872_0022/
11:40:47,292  INFO          org.apache.hadoop.mapreduce.Job:1317 - Running job: job_1391711633872_0022
11:40:50,330  INFO          org.apache.hadoop.mapreduce.Job:1338 - Job job_1391711633872_0022 running in uber mode : false
11:40:50,332  INFO          org.apache.hadoop.mapreduce.Job:1345 -  map 0% reduce 0%
11:40:50,356  INFO          org.apache.hadoop.mapreduce.Job:1358 - Job job_1391711633872_0022 failed with state FAILED due to: Application application_1391711633872_0022 failed 2 times due to AM Container for appattempt_1391711633872_0022_000002 exited with  exitCode: 1 due to: Exception from container-launch: 
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)


.Failing this attempt.. Failing the application.
11:40:50,434  INFO          org.apache.hadoop.mapreduce.Job:1363 - Counters: 0
11:40:50,470  INFO ramework.data.hadoop.mapreduce.JobRunner: 202 - Completed job [wordcountJob]
11:40:50,507  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:50,590  INFO ctory.support.DefaultListableBeanFactory: 444 - Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
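For what it's worth, the winutils error at the top of this log is a client-side issue: Hadoop's Shell class resolves winutils.exe relative to the hadoop.home.dir system property (falling back to the HADOOP_HOME environment variable), and on a plain Windows client neither is usually set, which is where "Could not locate executable null\bin\winutils.exe" comes from. A minimal sketch of the workaround, assuming winutils.exe has been unpacked under a hypothetical C:\hadoop\bin directory:

```java
public class WinutilsFix {

    // Hadoop's Shell class builds the path %hadoop.home.dir%\bin\winutils.exe,
    // so the property must point at the parent of the bin directory that
    // actually contains winutils.exe.
    public static void configureHadoopHome(String hadoopHome) {
        if (System.getProperty("hadoop.home.dir") == null) {
            System.setProperty("hadoop.home.dir", hadoopHome);
        }
    }

    public static void main(String[] args) {
        // "C:\\hadoop" is a placeholder path, not something from the sample.
        configureHadoopHome("C:\\hadoop");
        // ...then start the Spring context as the sample normally does, e.g.:
        // new ClassPathXmlApplicationContext("META-INF/spring/application-context.xml");
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```

Setting HADOOP_HOME as an environment variable before launching works just as well; the system property is only convenient when running from an IDE.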

How are you building and running the example? The generated batch file is built via the Maven Appassembler plug-in. The batch files don't seem to work right if you have a deep directory structure - I got an error about the command being too long.

I also see another error - Could not resolve placeholder 'app.home' - so I'm not sure how well these generated batch files actually work, or whether anyone has run these examples successfully on Windows.

Is your Hadoop cluster on Windows as well?

I just updated the samples to use $basedir instead of $app.home, since the generated batch file for Windows doesn't set the app.home system property. I ran the wordcount sample successfully on Windows.

I had edited the batch file so the classpath was defined as:

set CLASSPATH="%BASEDIR%"\etc;"%REPO%"\*

This allowed all the files to be on the classpath without making the command too long.

I found related open bugs:
https://issues.apache.org/jira/browse/YARN-1298
https://issues.apache.org/jira/browse/MAPREDUCE-4052
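The "/bin/bash: line 0: fg: no job control" failure in the AM container is exactly what those two bugs describe: a Windows client generates a Windows-style container launch command (with %VAR% expansions and Windows classpath separators) that the Linux NodeManager cannot execute. In Hadoop releases where MAPREDUCE-4052 is fixed (2.4.0 and later), the client can be told to emit platform-independent launch commands via the mapreduce.app-submission.cross-platform property. A sketch of how that might look in the sample's Spring Hadoop XML configuration - the ${hd.fs} and ${hd.rm} placeholders are illustrative, assuming the usual hadoop.properties setup, and a new enough Hadoop client is required:

```xml
<!-- Properties inside <hdp:configuration> are parsed in java.util.Properties
     format and merged into the Hadoop Configuration. -->
<hdp:configuration>
    fs.defaultFS=${hd.fs}
    yarn.resourcemanager.address=${hd.rm}
    mapreduce.app-submission.cross-platform=true
</hdp:configuration>
```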

My Hadoop cluster is Linux-based. I am trying to go from a Windows client to a Linux cluster.

Nice find on that bug. My test ran fine since I was running against a Windows-based Hadoop cluster; I haven't tried going from a Windows client to a Linux cluster.

Hi, I have installed Hadoop 2.7.4 on Windows 7. I tried to run the Spring Hadoop MapReduce wordcount sample, but sh ./target/appassembler/bin/wordcount cannot be run on Windows.

When I tried to run the wordcount class as a standalone class I get the following exception:

log4j:WARN No appenders could be found for logger (org.springframework.context.support.ClassPathXmlApplicationContext).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'wordcountJob' defined in null: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"
    at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:211)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer.processProperties(PropertySourcesPlaceholderConfigurer.java:180)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer.postProcessBeanFactory(PropertySourcesPlaceholderConfigurer.java:155)
    at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:265)
    at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:162)
    at org.springframework.context.support.AbstractApplicationContext.invokeBeanFactoryPostProcessors(AbstractApplicationContext.java:606)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:462)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"
    at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
    at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
    at org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:204)
    at org.springframework.core.env.AbstractPropertyResolver.resolveRequiredPlaceholders(AbstractPropertyResolver.java:178)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer$2.resolveStringValue(PropertySourcesPlaceholderConfigurer.java:175)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveStringValue(BeanDefinitionVisitor.java:282)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:209)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitList(BeanDefinitionVisitor.java:228)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:192)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitPropertyValues(BeanDefinitionVisitor.java:141)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitBeanDefinition(BeanDefinitionVisitor.java:82)
    at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:208)
    ... 10 more

How can I run this program? Please advise.

Figured this out by providing the complete path.
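For anyone else hitting the 'app.repo' placeholder error above: the Appassembler-generated launch scripts pass that value on the java command line (via -Dapp.repo), so when Wordcount is launched directly from an IDE the property is simply missing. One way to supply it yourself; the path here is a placeholder and should point at whatever directory actually contains hadoop-examples-*.jar:

```java
public class StandaloneLauncher {

    public static void main(String[] args) {
        // Equivalent to launching with: java -Dapp.repo=<repo dir> ...
        // "C:\\dev\\spring-hadoop-samples\\lib" is a hypothetical location.
        System.setProperty("app.repo", "C:\\dev\\spring-hadoop-samples\\lib");
        // ...then create the context as the sample's main method does, e.g.:
        // new ClassPathXmlApplicationContext("META-INF/spring/application-context.xml");
        System.out.println(System.getProperty("app.repo"));
    }
}
```

Alternatively, add -Dapp.repo=... to the IDE run configuration's VM arguments, which avoids hardcoding the path in source.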