MrPowers/spark-daria

Strange FileNotFoundException when running printAthenaCreateTable

Closed this issue · 5 comments

I'm getting a strange error. I'm not a regular Scala user, so I may be doing something silly.

First, I start a Spark shell as follows:

spark-shell --packages "org.apache.hadoop:hadoop-aws:2.7.6,mrpowers:spark-daria:0.32.0-s_2.11"

Then I run this code:

scala> val df = spark.read.parquet("s3a://...")
df: org.apache.spark.sql.DataFrame = [... 96 more fields]

scala> import com.github.mrpowers.spark.daria.sql.DataFrameHelpers
import com.github.mrpowers.spark.daria.sql.DataFrameHelpers

scala> DataFrameHelpers.printAthenaCreateTable(
     |     df,
     |     "my.table",
     |     "s3a://..."
     | )
java.io.FileNotFoundException: /Users/powers/Documents/code/my_apps/spark-daria/target/scala-2.11/scoverage-data/scoverage.measurements.1 (No such file or directory)
  at java.io.FileOutputStream.open0(Native Method)
  at java.io.FileOutputStream.open(FileOutputStream.java:270)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
  at java.io.FileWriter.<init>(FileWriter.java:107)
  at scoverage.Invoker$$anonfun$1.apply(Invoker.scala:42)
  at scoverage.Invoker$$anonfun$1.apply(Invoker.scala:42)
  at scala.collection.concurrent.TrieMap.getOrElseUpdate(TrieMap.scala:901)
  at scoverage.Invoker$.invoked(Invoker.scala:42)
  at com.github.mrpowers.spark.daria.sql.DataFrameHelpers$.printAthenaCreateTable(DataFrameHelpers.scala:194)
  ... 53 elided

The reference to /Users/powers/ is strange; it suggests a path from the project author's workstation was accidentally baked into the published package.

Interestingly, I don't see this error if I switch to version 0.31.0, so this seems to be specific to 0.32.0.
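For anyone else hitting this, the downgrade described above amounts to pinning the spark-daria coordinate in the `--packages` string (version numbers are the ones reported in this thread; the hadoop-aws coordinate should match your cluster's Hadoop version):

```shell
# Workaround: pin spark-daria to 0.31.0, which (per this thread)
# was published without the broken scoverage instrumentation
spark-shell --packages "org.apache.hadoop:hadoop-aws:2.7.6,mrpowers:spark-daria:0.31.0-s_2.11"
```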

dcox commented

Thank you, downgrading to 0.31.0 fixed this for me too.

@nchammas @dcox - Thanks for reporting this. This was caused by scoverage and I'm not sure I ever figured out why. I fixed the JAR in commit 41affb1. The latest version is v0.35.0.
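For future readers, a hedged sketch of how this class of bug happens (the exact sbt setup of this repo may differ): scoverage instruments compiled classes so that every statement reports to its `Invoker`, which writes measurements to a directory path captured at build time on the author's machine. If instrumentation is still enabled when the JAR is published, every consumer hits a `FileNotFoundException` for that hardcoded local path, exactly as in the stack trace above. With the sbt-scoverage plugin, keeping coverage runs and publish runs in separate clean builds avoids the leak:

```shell
# Run coverage in its own build...
sbt clean coverage test coverageReport

# ...then publish from a clean, uninstrumented build.
# `coverageOff` disables scoverage instrumentation for later compiles.
sbt clean coverageOff publishLocal
```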

@manuzhang @nvander1 - Let me know if either of you knows how we can add scoverage back to this repo without breaking the published JAR. Thanks!

@MrPowers
Sorry for the very long delay. I've been working more on Spark core/SQL these days.

I tried enabling code coverage and publishing to my local repository. Everything works fine after I import the locally published package.

I think the original issue reported here has since been resolved.