Esri/geoportal-server-harvester

Harvest from localhost fails

Closed this issue · 3 comments

I have instances of Geoportal Server Harvester and GeoNetwork both running on Tomcat on my local machine. I am trying to harvest the GeoNetwork data with Harvester. I created a CSW broker, pointed it at my GeoNetwork instance, and set up a task, but it will not run. Checking the log, I get this error:

30-Jul-2018 15:44:35.656 SEVERE [HARVESTING] com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$1 Error harvesting of PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=http://localhost:8081/geonetwork/srv/eng/csw, cred-username=admin, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:GeoNetwork], DESTINATIONS: [FOLDER[folder-root-folder=C:\Workspace\G.Young\HubProject\GeoPortal\Harvested, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: false
com.esri.geoportal.harvester.api.ex.DataInputException: Error reading data.
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.hasNext(CswBroker.java:165)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$1(DefaultProcessor.java:150)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.http.client.HttpResponseException: Unauthorized
at com.esri.geoportal.commons.csw.client.impl.Client.findRecords(Client.java:127)
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.hasNext(CswBroker.java:141)
... 2 more

I am able to successfully harvest a CSW from an external node on the web. Does this error have something to do with working with localhost?

Figured this out myself just after posting. The issue was a setting in GeoNetwork: I was running GeoNetwork on port 8081, but the actual configuration file still said 8080. Changing the port in the config fixed the problem. Posting this in case anyone else runs into the same issue.

Hello, may I know where the Harvester writes its log files? I tried to harvest a CSW, and the UI shows "Error reading data."
However, I cannot find the detailed log for the task.

Check your logging.properties, found in ...\harvester\WEB-INF\classes. It includes a line similar to the one below:

org.apache.juli.FileHandler.directory = C:/data/logs

This sets the location of the log output.

You will also see a line, or the end of a line, like .level = ERROR. This sets the level of detail of the logging. Since we're using JULI for logging, see the Tomcat docs for valid levels: https://docs.progress.com/bundle/pas-for-openedge-administration-117/page/Tomcat-logging.html
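If it helps, here is a small Python sketch that pulls those two kinds of settings out of a logging.properties file. It is a simplified reader, not a full Java properties parser (it ignores escapes and line continuations), and the function name and command-line usage are only illustrative. It also flags level values that are not standard java.util.logging level names, since those are the names JULI works with.

```python
import sys

# Standard java.util.logging level names.
VALID_LEVELS = {"OFF", "SEVERE", "WARNING", "INFO", "CONFIG", "FINE", "FINER", "FINEST", "ALL"}


def log_settings(properties_text):
    """Extract '*.directory' and '*.level' entries from logging.properties text."""
    settings = {"directories": {}, "levels": {}, "unknown_levels": {}}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments ('#' or '!' in properties files).
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.endswith(".directory"):
            settings["directories"][key] = value
        elif key.endswith(".level"):
            settings["levels"][key] = value
            if value.upper() not in VALID_LEVELS:
                settings["unknown_levels"][key] = value
    return settings


if __name__ == "__main__":
    # Pass the path to your own logging.properties as the first argument.
    with open(sys.argv[1], encoding="utf-8") as handle:
        found = log_settings(handle.read())
    for key, value in found["directories"].items():
        print("log directory:", key, "=", value)
    for key, value in found["levels"].items():
        print("log level:", key, "=", value)
    for key, value in found["unknown_levels"].items():
        print("not a standard java.util.logging level:", key, "=", value)
```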