Unable to run MineRules example.
I tried running MineRules.scala with the file MineRules_sampledata.tsv provided in the resource folder. After the file is parsed, I get the exception "Unable to infer schema for Parquet".
This is the complete traceback.
Exception in thread "main" org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:189)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:189)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$getOrInferFileFormatSchema(DataSource.scala:188)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:387)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB$$anonfun$countProjectionQueriesDF$1$$anonfun$apply$7.apply(KBObject.scala:944)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB$$anonfun$countProjectionQueriesDF$1$$anonfun$apply$7.apply(KBObject.scala:906)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB$$anonfun$countProjectionQueriesDF$1.apply(KBObject.scala:906)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB$$anonfun$countProjectionQueriesDF$1.apply(KBObject.scala:905)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB.countProjectionQueriesDF(KBObject.scala:905)
at net.sansa_stack.ml.spark.mining.amieSpark.KBObject$KB.addDanglingAtom(KBObject.scala:1686)
at net.sansa_stack.ml.spark.mining.amieSpark.MineRules$Algorithm.refine(MineRules.scala:214)
at net.sansa_stack.ml.spark.mining.amieSpark.MineRules$Algorithm$$anonfun$ruleMining$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(MineRules.scala:158)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at net.sansa_stack.ml.spark.mining.amieSpark.MineRules$Algorithm$$anonfun$ruleMining$1.apply$mcVI$sp(MineRules.scala:131)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at net.sansa_stack.ml.spark.mining.amieSpark.MineRules$Algorithm.ruleMining(MineRules.scala:73)
at net.sansa_stack.examples.spark.ml.mining.MineRules$.main(MineRules.scala:60)
at net.sansa_stack.examples.spark.ml.mining.MineRules.main(MineRules.scala)
Process finished with exit code 1
How should I provide a schema for the Parquet data, or is there some other mechanism?
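For reference, Spark's general mechanism for supplying a Parquet schema by hand is `DataFrameReader.schema(...)`. The following is only a minimal sketch of that mechanism, not the SANSA code: the object name, the three column names, and the path argument are hypothetical, since the thread does not show what KBObject.scala actually writes or reads.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object ParquetSchemaSketch {
  // Hypothetical layout for triples; the real column names used by
  // KBObject.scala are not shown in this thread.
  val tripleSchema: StructType = StructType(Seq(
    StructField("subject", StringType, nullable = true),
    StructField("predicate", StringType, nullable = true),
    StructField("object", StringType, nullable = true)
  ))

  // Read Parquet with an explicit schema instead of letting Spark infer it.
  def readWithSchema(spark: SparkSession, path: String): DataFrame =
    spark.read.schema(tripleSchema).parquet(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ParquetSchemaSketch")
      .master("local[*]")
      .getOrCreate()

    // args(0) is a placeholder for the Parquet directory to read.
    val df = readWithSchema(spark, args(0))
    df.printSchema()

    spark.stop()
  }
}
```

Note that Spark typically raises this error when the given path contains no Parquet files at all, for example an empty or not yet written directory. In that case an explicit schema suppresses the exception but the resulting DataFrame is empty, so it is worth checking the path before relying on this.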
Hi @saist1993,
Many thanks for trying out the Rule Mining example. Did you try the develop branch or the master branch? The develop branch uses a non-Parquet approach, so you do not need to specify any schema or HDFS configuration.
Could you please give it a try and let us know whether you still face the same issue?
Hi @GezimSejdiu,
Thanks for the response. I was using the master branch when I tried out the Rule Mining example. The examples seem to work on the develop branch. Thanks for resolving the issue.