Error indexing docs
Closed this issue · 2 comments
danielcsant commented
When using DataFrames for indexing docs, an error came up. A field _indexed_at_tdt is automatically added, and we don't have it in our schema. Maybe it is useful when there is no id field, but in that case it should be optional.
We are using branch spark1_3_xAndSolr4.
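For anyone hitting this before a fix lands: the `_indexed_at_tdt` name follows Solr's `*_tdt` dynamic-field naming convention (trie date), so one possible workaround is to declare a matching dynamic field in the collection's schema.xml. This is only a sketch, and it assumes the standard `tdate` field type (TrieDateField) is already defined in the schema, as it is in the Solr 4.x example schema:

```xml
<!-- Workaround sketch: accept the timestamp field that spark-solr adds.
     Assumes a "tdate" field type (solr.TrieDateField) is already declared
     in this schema, as in the stock Solr 4.x example schema.xml. -->
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>
```

With that dynamic rule in place, the automatically added `_indexed_at_tdt` value is indexed instead of being rejected as an unknown field, at the cost of accepting any other `*_tdt` field as well.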
sgomezg commented
Hi, we still have problems with this issue:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://427f4a3f7fb4:8983/solr/smf-agg-hour: ERROR: [doc=0_6_Magic_1442509200000] unknown field '_indexed_at_tdt'
at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:625)
at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:967)
at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:856)
at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:799)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
at com.lucidworks.spark.SolrSupport.sendBatchToSolr(SolrSupport.java:202)
at com.lucidworks.spark.SolrSupport$4.call(SolrSupport.java:186)
at com.lucidworks.spark.SolrSupport$4.call(SolrSupport.java:172)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://427f4a3f7fb4:8983/solr/smf-agg-hour: ERROR: [doc=0_6_Magic_1442509200000] unknown field '_indexed_at_tdt'
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:376)
at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:328)
at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:600)
at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:597)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
... 3 more
We have tested with branch spark1_3_xAndSolr4_optionalDefaultIndex.
sgomezg commented
We had to set the optional param split_default_index, and now it works fine. Thanks!