nathanmarz/dfs-datastores

NPE in VersionedTap.sourceConfInit

Opened this issue · 6 comments

When I create a new versioned tap (using VersionedKeyValSource from Scalding) I get an NPE:

Caused by: java.lang.NullPointerException
at org.apache.hadoop.mapred.FileInputFormat.getPathStrings(FileInputFormat.java:342)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:288)
at com.backtype.cascading.tap.VersionedTap.sourceConfInit(VersionedTap.java:88)
at com.backtype.cascading.tap.VersionedTap.sourceConfInit(VersionedTap.java:19)
at cascading.flow.hadoop.HadoopFlowStep.initFromSources(HadoopFlowStep.java:332)
at cascading.flow.hadoop.HadoopFlowStep.getInitializedConfig(HadoopFlowStep.java:99)
at cascading.flow.hadoop.HadoopFlowStep.createFlowStepJob(HadoopFlowStep.java:201)
at cascading.flow.hadoop.HadoopFlowStep.createFlowStepJob(HadoopFlowStep.java:69)
at cascading.flow.planner.BaseFlowStep.getFlowStepJob(BaseFlowStep.java:680)
at cascading.flow.BaseFlow.initializeNewJobsMap(BaseFlow.java:1148)
at cascading.flow.BaseFlow.initialize(BaseFlow.java:198)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:231)

I can avoid the NPE by doing:
$ hdfs dfs -mkdir /user/foo/1
$ touch 1.version
$ hdfs dfs -copyFromLocal 1.version /user/mikeg/xpm/1.version
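
A rough sketch of why this workaround might help, assuming the store treats files ending in `.version` as markers for completed versions and returns null when none exist. `VersionLookup`, `SUFFIX`, and `latestVersion` are hypothetical names for illustration, not the dfs-datastores API:

```java
import java.util.List;

// Hypothetical model of version discovery; NOT the actual dfs-datastores code.
final class VersionLookup {
    // Assumed marker suffix, taken from the "1.version" file in the workaround.
    static final String SUFFIX = ".version";

    private VersionLookup() {}

    /** Returns the highest numeric version among marker file names, or null if there is none. */
    static Long latestVersion(List<String> fileNames) {
        Long latest = null;
        for (String name : fileNames) {
            if (!name.endsWith(SUFFIX)) {
                continue; // not a completion marker
            }
            String stem = name.substring(0, name.length() - SUFFIX.length());
            try {
                long version = Long.parseLong(stem);
                if (latest == null || version > latest) {
                    latest = version;
                }
            } catch (NumberFormatException ignored) {
                // name ends with the suffix but has no numeric version; skip it
            }
        }
        return latest;
    }
}
```

Under that assumption, an empty store yields null, and a caller that passes the result straight into a path-setting call would hit an NPE; once a `1.version` marker exists, the lookup returns 1.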

@argyris thinks this bug may come from a recent change he committed.

Are you using a relative path?

(In reply to Michael N. Gagnon's message of September 17, 2013, 11:25 AM)

Sam Ritchie, Twitter Inc
703.662.1337
@sritchie

This ended up being a false alarm. However, there is still an issue with NPEs not being the most useful error message. I will send a pull request to add better logging.
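
One way a more useful error could look, as a minimal sketch: check for a missing version path before handing it to the input format, and fail with a message that names the store. `VersionedSourceGuard` and `requireVersionPath` are hypothetical names, and this is not necessarily what the pull request does:

```java
// Hypothetical guard for illustration; NOT the dfs-datastores or Cascading API.
final class VersionedSourceGuard {
    private VersionedSourceGuard() {}

    /**
     * Returns latestVersionPath unchanged when it is non-null.
     * Throws a descriptive exception instead of letting a null path
     * surface later as a bare NullPointerException.
     */
    static String requireVersionPath(String storeRoot, String latestVersionPath) {
        if (latestVersionPath == null) {
            throw new IllegalStateException(
                "No completed version found under '" + storeRoot + "'. "
                + "The store may be uninitialized, or the same path may be used "
                + "as both source and sink before any version was written.");
        }
        return latestVersionPath;
    }
}
```

The caller would wrap whatever lookup produces the version path, so the failure points at the store rather than at a line inside `FileInputFormat`.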

What is the real issue at play here?

(In reply to Argyris Zymnis's message of September 17, 2013, 2:30 PM)

I was trying to read from and write to the same uninitialized source.

Ooooooooooooh

(In reply to Michael N. Gagnon's message of September 17, 2013, 2:44 PM)

Addressed by #40