Esri/geoportal-server-harvester

Error harvesting via CSW: Connection refused

d-coast opened this issue · 2 comments

Hi everyone,

we have tried harvesting from CSW using the GeoNetwork CSW 2.0.2 profile, but we keep running into the same error:

23-Oct-2019 15:59:21.936 SEVERE [HARVESTING] com.esri.geoportal.harvester.support.ErrorLogger.logError Error processing task: PROCESS:: status: working, title: NAME: IOW Task, PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://kueno.io-warnemuende.de/geonetwork/srv/ger/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetCapabilities, cred-username=, cred-password=, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:GeoNetwork, csw-search-text=], DESTINATIONS: [GPT[gpt-host-url=http://XXXXX/geoportal, cred-username=XXXXX, cred-password=, gpt-index=metadata, gpt-cleanup=false, gpt-accept-xml=true, gpt-accept-json=false, gpt-translate-pdf=true]], INCREMENTAL: false, IGNOREROBOTSTXT: false | Error reading data.
com.esri.geoportal.harvester.api.ex.DataInputException: Error reading data.
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.hasNext(CswBroker.java:189)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$1(DefaultProcessor.java:148)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to kueno.io-warnemuende.de:443 [kueno.io-warnemuende.de/192.124.245.21] failed: Connection refused: connect

We tried changing the URL to https://kueno.io-warnemuende.de/geonetwork/srv/ger/csw, but that didn't work either. The owners of the CSW endpoint assured us that no password is necessary. The only explanation we have left is that our proxy might be blocking all outbound harvesting requests. We were able to harvest our own ArcGIS Portal and a local folder, but nothing external so far.

Does anybody know the source of this error?
Do you have an example CSW configuration we could use to test whether our harvester works for that setup?
Does the harvester have any proxy settings that we could try to change?

Thank you for your help!

I have no problems harvesting that site (URL: https://kueno.io-warnemuende.de/geonetwork/srv/ger/csw, profile: urn:ogc:CSW:2.0.2:HTTP:APISO:GeoNetwork)

Thanks for testing! I now assume that we are having proxy issues with the harvester. Has anyone else experienced proxy issues, or does anyone know whether there are proxy settings that might need to be changed?

Edit:
It was a proxy issue, and it has been resolved.
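
For anyone who lands here with the same "Connection refused" trace: one way to tell a network or proxy problem apart from a harvester problem is to attempt the same connection from the machine that runs the harvester, once directly and once through the proxy. Below is a minimal sketch of that check. The class `ProxyCheck` and its methods are hypothetical names, not part of geoportal-server-harvester. `https.proxyHost` and `https.proxyPort` are the standard JVM networking properties; whether the harvester's HTTP client actually honors them is not confirmed anywhere in this thread, so treat that as an assumption to verify.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

// Hypothetical diagnostic helper; not part of geoportal-server-harvester.
final class ProxyCheck {

    // Builds a Proxy from the standard JVM properties https.proxyHost and
    // https.proxyPort. Returns Proxy.NO_PROXY when no proxy host is set.
    static Proxy proxyFromSystemProperties() {
        String host = System.getProperty("https.proxyHost");
        if (host == null || host.isEmpty()) {
            return Proxy.NO_PROXY;
        }
        // 443 is the documented JVM default for https.proxyPort.
        int port = Integer.parseInt(System.getProperty("https.proxyPort", "443"));
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Tries to open a connection to host:port through the given proxy
    // (or directly for Proxy.NO_PROXY). Returns false on any I/O failure,
    // including "Connection refused" and timeouts.
    static boolean canConnect(String host, int port, Proxy proxy, int timeoutMillis) {
        try (Socket socket = new Socket(proxy)) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            System.err.println("Connect to " + host + ":" + port
                    + " via " + proxy + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java ProxyCheck <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println("direct:    " + canConnect(host, port, Proxy.NO_PROXY, 5000));
        System.out.println("via proxy: " + canConnect(host, port, proxyFromSystemProperties(), 5000));
    }
}
```

Running it as `java -Dhttps.proxyHost=<your-proxy> -Dhttps.proxyPort=<port> ProxyCheck kueno.io-warnemuende.de 443` (proxy values are placeholders) compares both routes. If the direct attempt fails while the proxied one succeeds, the harvester's JVM most likely needs to be told about the proxy; if both fail, the block is further out in the network.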