Stratio/spark-solr

Error indexing docs

Closed this issue · 2 comments

When using DataFrames for indexing docs, an error came up. A field _indexed_at_tdt is automatically added, and we don't have it in our schema. Maybe it is useful when there is no id field, but in that case it should be optional.

We are using branch spark1_3_xAndSolr4.
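For context, here is a minimal sketch of what the connector seems to do when building Solr documents from DataFrame rows. This is not the actual spark-solr code; the helper and its name are hypothetical, and only the _indexed_at_tdt field name comes from the error below:

```scala
import java.util.Date
import org.apache.solr.common.SolrInputDocument
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StructType

// Hypothetical illustration: convert a DataFrame Row into a SolrInputDocument.
// The connector adds an extra "_indexed_at_tdt" timestamp on top of the Row's
// own columns, which fails if the target collection's schema does not declare
// that field.
def rowToDoc(row: Row, schema: StructType): SolrInputDocument = {
  val doc = new SolrInputDocument()
  schema.fieldNames.zipWithIndex.foreach { case (name, i) =>
    doc.addField(name, row.get(i))
  }
  // This automatically added field is what triggers the
  // "unknown field '_indexed_at_tdt'" error in the stack trace below.
  doc.addField("_indexed_at_tdt", new Date())
  doc
}
```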

Hi, we still have problems with this issue:

org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://427f4a3f7fb4:8983/solr/smf-agg-hour: ERROR: [doc=0_6_Magic_1442509200000] unknown field '_indexed_at_tdt'
    at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:625)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:967)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:856)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:799)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
    at com.lucidworks.spark.SolrSupport.sendBatchToSolr(SolrSupport.java:202)
    at com.lucidworks.spark.SolrSupport$4.call(SolrSupport.java:186)
    at com.lucidworks.spark.SolrSupport$4.call(SolrSupport.java:172)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:198)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:806)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1497)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://427f4a3f7fb4:8983/solr/smf-agg-hour: ERROR: [doc=0_6_Magic_1442509200000] unknown field '_indexed_at_tdt'
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:376)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:328)
    at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:600)
    at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:597)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
    ... 3 more

We have tested with branch spark1_3_xAndSolr4_optionalDefaultIndex.

We had to set the optional param split_default_index, and it works fine. Thanks!
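For anyone hitting the same issue, roughly how the param can be passed when saving the DataFrame on Spark 1.3. The data source name and the option keys other than split_default_index are assumptions, not taken from the branch's docs, so check the fork's own README for the exact API:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Sketch only: option keys besides "split_default_index" and the data source
// name are assumed; the value of the flag is also assumed.
def saveToSolr(df: DataFrame, zkHost: String, collection: String): Unit = {
  df.save(
    "solr",                                  // assumed data source name
    SaveMode.Append,
    Map(
      "zkhost"              -> zkHost,       // assumed option key
      "collection"          -> collection,   // assumed option key
      "split_default_index" -> "true"        // param from the comment above; value assumed
    )
  )
}
```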