lucidworks/spark-solr

HttpClient 3.1 classes are imported, instead of HttpClient 4.x equivalents

theoathinas opened this issue · 3 comments

While upgrading a project to a newer version of Hadoop (3.2), we discovered that the spark-solr connector imports two HttpClient classes (NoHttpResponseException and ConnectTimeoutException) from the HttpClient 3.1 package, which Hadoop 2.7 pulls in, instead of the equivalent HttpClient 4.x classes.

This was discovered during a Spark job test run where we index our data into Solr -- one of the hosts of our SolrCloud went down, and a NoHttpResponseException was supposed to be thrown. However, the class couldn't be found, because we had excluded all the hadoop dependencies at runtime, which meant HttpClient 3.1 was not added as a dependency.

Referencing HttpClient 3.1 classes would prevent the spark-solr connector from working with Hadoop 2.8 or later, but changing the references to the 4.x classes now shouldn't affect its integration with Hadoop 2.7.
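
For anyone who wants to check their own runtime classpath, here is a minimal sketch that reports which generation of the two exception classes is available. The class name `HttpClientClasspathCheck` and its helper methods are hypothetical and only for illustration. The package names are the standard ones for commons-httpclient 3.1 (`org.apache.commons.httpclient`) and HttpClient 4.x (`org.apache.http`). It does not show how spark-solr itself references these classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of spark-solr: reports which HttpClient
// exception classes can be resolved on the current classpath.
public class HttpClientClasspathCheck {

    // HttpClient 3.1 (commons-httpclient) names of the two exceptions.
    static final String[] V3_CLASSES = {
        "org.apache.commons.httpclient.NoHttpResponseException",
        "org.apache.commons.httpclient.ConnectTimeoutException"
    };

    // HttpClient 4.x (httpcore / httpclient) equivalents.
    static final String[] V4_CLASSES = {
        "org.apache.http.NoHttpResponseException",
        "org.apache.http.conn.ConnectTimeoutException"
    };

    // Returns true if the class can be located, without running its static initializers.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, HttpClientClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Maps each of the four class names to whether it is present, in a stable order.
    static Map<String, Boolean> report() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : V3_CLASSES) {
            result.put(name, isPresent(name));
        }
        for (String name : V4_CLASSES) {
            result.put(name, isPresent(name));
        }
        return result;
    }

    public static void main(String[] args) {
        report().forEach((name, present) ->
            System.out.println((present ? "found    " : "MISSING  ") + name));
    }
}
```

Running this with the same classpath as the Spark job should show the 3.1 names as missing once the Hadoop 2.7 dependencies are excluded, which matches the failure described above.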

I have a PR for this here: #273

Thank you for your contribution. I have merged the PR.

thanks