Error while trying to run the sample
javiermanzano opened this issue · 8 comments
Hi,
We've been trying to follow the instructions to run the sample locally and we've had this error:
example git:(main) ✗ ./ingest.py
Indexing BNL/L'Union articles
Downloading missing BNL/L'Union issues to data/bnl_lunion
42000/41446concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1346, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1253, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1299, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1248, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1008, in _send_output
self.send(msg)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 948, in send
self.connect()
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 919, in connect
self.sock = self._create_connection(
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 843, in create_connection
raise err
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 831, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/process.py", line 243, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/Users/paroar/solr-ocrhighlighting/example/ingest.py", line 219, in index_documents
resp = request.urlopen(req)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 517, in open
response = self._open(req, data)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 534, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1375, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1349, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 61] Connection refused>
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/paroar/solr-ocrhighlighting/example/./ingest.py", line 244, in <module>
fut.result()
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 438, in result
return self.__get_result()
File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
raise self._exception
urllib.error.URLError: <urlopen error [Errno 61] Connection refused>
➜ example git:(main) ✗
This causes the docker container to crash:
solr_1 | 2021-05-13 11:24:42.877 INFO (qtp322561962-51) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={wt=json&_=1620905074800} status=0 QTime=15
solr_1 | 2021-05-13 11:28:32.426 INFO (searcherExecutor-15-thread-1-processing-x:ocr) [ x:ocr] o.a.s.c.SolrCore [ocr] Registered new searcher autowarm time: 17 ms
solr_1 | 2021-05-13 11:28:32.451 INFO (qtp322561962-25) [ x:ocr] o.a.s.u.p.LogUpdateProcessorFactory [ocr] webapp=/solr path=/update params={softCommit=true}{add=[1533660_1860-11-14-1, 1533660_1860-11-14-2, 1533660_1860-11-14-3, 1533660_1860-11-14-4, 1533660_1860-11-14-5, 1533660_1860-11-14-6, 1533660_1860-11-14-7, 1533660_1860-11-14-8, 1533660_1860-11-14-9, 1533660_1860-11-14-10, ... (1000 adds)],commit=} 0 22908
example_solr_1 exited with code 137
We are running Python 3.9 on macOS, although that shouldn't be a problem as we are running a Docker container.
I appreciate your help :)
I'm also having an issue running the example, on Ubuntu 18.04 with Python 3.6:
./ingest.py
Indexing BNL/L'Union articles
Downloading missing BNL/L'Union issues to data/bnl_lunion
01000/41446Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "./ingest.py", line 219, in index_documents
resp = request.urlopen(req)
File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Server Error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/concurrent/futures/process.py", line 178, in _process_worker
result_queue.put(_ResultItem(call_item.work_id, exception=exc))
File "/usr/lib/python3.6/multiprocessing/queues.py", line 341, in put
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: cannot serialize '_io.BufferedReader' object
Traceback (most recent call last):
File "./ingest.py", line 241, in <module>
futs.append(pool.submit(index_documents, batch))
File "/usr/lib/python3.6/concurrent/futures/process.py", line 452, in submit
raise BrokenProcessPool('A child process terminated '
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
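A note on the traceback above: the HTTP 500 is followed by a second failure, because the worker process cannot pickle the HTTPError (it holds the open response body, a BufferedReader) to send it back to the parent. The following is a minimal sketch of one way to keep the real error visible, assuming the worker can be changed. `picklable_http_error` is a hypothetical helper, not part of ingest.py, and the URL uses an assumed host.

```python
import io
import pickle
from urllib.error import HTTPError


def picklable_http_error(exc: HTTPError) -> RuntimeError:
    """Convert an HTTPError into a plain exception that can cross process boundaries."""
    try:
        # Keep the start of the response body, which usually carries the server's message.
        body = exc.read(2000).decode("utf-8", errors="replace")
    except Exception:
        body = "<response body unavailable>"
    return RuntimeError(f"HTTP {exc.code} from {exc.url}: {body}")


# Stand-in for the error a worker would catch around request.urlopen(req).
# The host and the body text are assumptions made up for this illustration.
original = HTTPError(
    "http://localhost:8983/solr/ocr/update",
    500,
    "Server Error",
    {},
    io.BytesIO(b"core init failure"),
)
converted = picklable_http_error(original)

# A RuntimeError carrying only a string survives the pickle round trip.
restored = pickle.loads(pickle.dumps(converted))
print(restored)
```

In a worker, the same conversion would sit in an `except HTTPError` clause around the request, raising the converted error instead of the original.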
Thank you for reporting and sorry about that! Will be fixed ASAP.
So I just tried to reproduce both of these issues locally but had no luck (or misfortune, if you will): the ingest works without problems and I can query the index with the included web interface.
- Are you both following the instructions in the example/README.md or do you have customizations?
- Can you check (e.g. with curl and/or the browser) if you can access Solr from the environment you're running the ingest.py script in?
- Is there anything suspicious in the Solr log (docker-compose logs)? Especially @mustard123, it'd be great to know what causes the 500 on the Solr side.
- Do you have sufficient disk space available? The example needs at least 16GiB to store the documents and the index.
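For the reachability check, a small Python sketch can stand in for curl and tell "connection refused" apart from "Solr answered with an error". The base URL below is an assumption (port 8983 is the one shown in the Solr logs; the host may differ in your setup), and `check_solr` is a hypothetical helper name. The `/admin/info/system` path is the one that appears in the Solr log earlier in this thread.

```python
from urllib import error, request


def check_solr(base_url: str = "http://localhost:8983/solr", timeout: float = 5.0) -> str:
    """Report whether Solr can be reached from the current environment."""
    url = f"{base_url}/admin/info/system?wt=json"
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except error.HTTPError as exc:
        # The server is up and answering, but with an error status (e.g. 500).
        return f"reachable, but Solr answered HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, connection refused and timeouts.
        return f"unreachable: {exc}"


if __name__ == "__main__":
    print(check_solr())
```

An "unreachable" result would match the ConnectionRefusedError in the first report, while an HTTP 500 result would match the second.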
@jbaiter sorry for the late response and thanks for your follow-up. I have enough disk space (more than 80GB). The docker-compose logs output shows the following:
Start up seems ok I guess:
Starting example_solr_1 ... done
Starting example_iiif-prezi_1 ... done
Starting example_frontend_1 ... done
Attaching to example_iiif-prezi_1, example_frontend_1, example_solr_1
iiif-prezi_1 | [2021-06-05 16:12:45 +0000] [1] [INFO] Goin' Fast @ http://0.0.0.0:8008
frontend_1 | /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
frontend_1 | /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
iiif-prezi_1 | [2021-06-05 16:12:45 +0000] [1] [INFO] Starting worker [1]
solr_1 | Executing /opt/docker-solr/scripts/solr-precreate ocr /opt/core-config
frontend_1 | /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
solr_1 | Executing /opt/docker-solr/scripts/precreate-core ocr /opt/core-config
solr_1 | Core ocr already exists
frontend_1 | 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
solr_1 | Starting Solr
solr_1 | The currently defined JAVA_HOME (/usr/local/openjdk-11) refers to a location
solr_1 | where java was found but jstack was not found. Continuing.
frontend_1 | 10-listen-on-ipv6-by-default.sh: info: /etc/nginx/conf.d/default.conf differs from the packaged version
frontend_1 | /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
frontend_1 | /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
frontend_1 | /docker-entrypoint.sh: Configuration complete; ready for start up
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: using the "epoll" event method
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: nginx/1.21.0
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1)
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: OS: Linux 5.4.0-73-generic
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1024:4096
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker processes
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 31
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 32
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 33
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 34
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 35
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 36
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 37
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 38
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 39
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 40
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 41
frontend_1 | 2021/06/05 16:12:44 [notice] 1#1: start worker process 42
solr_1 | *** [WARN] *** Your open file limit is currently 1024.
solr_1 | It should be set to 65000 to avoid operational disruption.
solr_1 | If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
solr_1 | *** [WARN] *** Your Max Processes Limit is currently 62247.
solr_1 | It should be set to 65000 to avoid operational disruption.
solr_1 | If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 1)
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
solr_1 | OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
solr_1 | Listening for transport dt_socket at address: 1044
solr_1 | 2021-06-05 16:12:46.369 INFO (main) [ ] o.e.j.u.log Logging initialized @776ms to org.eclipse.jetty.util.log.Slf4jLog
solr_1 | 2021-06-05 16:12:46.423 WARN (main) [ ] o.e.j.x.XmlConfiguration Ignored arg:
solr_1 |
solr_1 | solr.jetty
solr_1 |
solr_1 |
solr_1 | 2021-06-05 16:12:46.503 INFO (main) [ ] o.e.j.s.Server jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 11.0.11+9
solr_1 | 2021-06-05 16:12:46.519 INFO (main) [ ] o.e.j.d.p.ScanningAppProvider Deployment monitor [file:///opt/solr-8.7.0/server/contexts/] at interval 0
solr_1 | 2021-06-05 16:12:46.743 INFO (main) [ ] o.e.j.w.StandardDescriptorProcessor NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
solr_1 | 2021-06-05 16:12:46.751 INFO (main) [ ] o.e.j.s.session DefaultSessionIdManager workerName=node0
solr_1 | 2021-06-05 16:12:46.751 INFO (main) [ ] o.e.j.s.session No SessionScavenger set, using defaults
solr_1 | 2021-06-05 16:12:46.752 INFO (main) [ ] o.e.j.s.session node0 Scavenging every 660000ms
solr_1 | 2021-06-05 16:12:46.790 INFO (main) [ ] o.a.s.s.SolrDispatchFilter Using logger factory org.apache.logging.slf4j.Log4jLoggerFactory
solr_1 | 2021-06-05 16:12:46.794 INFO (main) [ ] o.a.s.s.SolrDispatchFilter ___ _ Welcome to Apache Solr™ version 8.7.0
solr_1 | 2021-06-05 16:12:46.794 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / | | | _ Starting in standalone mode on port 8983
solr_1 | 2021-06-05 16:12:46.794 INFO (main) [ ] o.a.s.s.SolrDispatchFilter _ / _ \ | '| Install dir: /opt/solr
solr_1 | 2021-06-05 16:12:46.794 INFO (main) [ ] o.a.s.s.SolrDispatchFilter |/__/|| Start time: 2021-06-05T16:12:46.794677Z
solr_1 | 2021-06-05 16:12:46.813 INFO (main) [ ] o.a.s.c.SolrPaths Using system property solr.solr.home: /var/solr/data
solr_1 | 2021-06-05 16:12:46.817 INFO (main) [ ] o.a.s.c.SolrXmlConfig Loading container configuration from /var/solr/data/solr.xml
solr_1 | 2021-06-05 16:12:46.877 INFO (main) [ ] o.a.s.c.SolrXmlConfig MBean server found: com.sun.jmx.mbeanserver.JmxMBeanServer@4f2410ac, but no JMX reporters were configured - adding default JMX reporter.
solr_1 | 2021-06-05 16:12:47.446 INFO (main) [ ] o.a.s.h.c.HttpShardHandlerFactory Host whitelist initialized: WhitelistHostChecker [whitelistHosts=null, whitelistHostCheckingEnabled=true]
solr_1 | 2021-06-05 16:12:47.548 WARN (main) [ ] o.e.j.u.s.S.config Trusting all certificates configured for Client@732c9b5c[provider=null,keyStore=null,trustStore=null]
solr_1 | 2021-06-05 16:12:47.549 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@732c9b5c[provider=null,keyStore=null,trustStore=null]
solr_1 | 2021-06-05 16:12:47.650 WARN (main) [ ] o.e.j.u.s.S.config Trusting all certificates configured for Client@2e51d054[provider=null,keyStore=null,trustStore=null]
solr_1 | 2021-06-05 16:12:47.650 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@2e51d054[provider=null,keyStore=null,trustStore=null]
solr_1 | 2021-06-05 16:12:47.683 WARN (main) [ ] o.a.s.c.CoreContainer Not all security plugins configured! authentication=disabled authorization=disabled. Solr is only as secure as you make it. Consider configuring authentication/authorization before exposing Solr to users internal or external. See https://s.apache.org/solrsecurity for more info
solr_1 | 2021-06-05 16:12:47.783 INFO (main) [ ] o.a.s.c.TransientSolrCoreCacheDefault Allocating transient cache for 2147483647 transient cores
solr_1 | 2021-06-05 16:12:47.785 INFO (main) [ ] o.a.s.h.a.MetricsHistoryHandler No .system collection, keeping metrics history in memory.
solr_1 | 2021-06-05 16:12:47.838 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.node' (registry 'solr.node') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@4f2410ac
solr_1 | 2021-06-05 16:12:47.838 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jvm' (registry 'solr.jvm') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@4f2410ac
solr_1 | 2021-06-05 16:12:47.843 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jetty' (registry 'solr.jetty') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@4f2410ac
solr_1 | 2021-06-05 16:12:47.863 INFO (main) [ ] o.a.s.c.CorePropertiesLocator Found 1 core definitions underneath /var/solr/data
solr_1 | 2021-06-05 16:12:47.863 INFO (main) [ ] o.a.s.c.CorePropertiesLocator Cores are: [ocr]
solr_1 | 2021-06-05 16:12:47.870 ERROR (coreContainerWorkExecutor-2-thread-1) [ ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup => java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: Unable to create core [ocr]
solr_1 | at java.base/java.util.concurrent.FutureTask.report(Unknown Source)
solr_1 | java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: Unable to create core [ocr]
solr_1 | at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:?]
solr_1 | at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$15(CoreContainer.java:881) ~[?:?]
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
solr_1 | at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
solr_1 | at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
solr_1 | at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218) ~[?:?]
solr_1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
solr_1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
solr_1 | at java.lang.Thread.run(Unknown Source) [?:?]
solr_1 | Caused by: org.apache.solr.common.SolrException: Unable to create core [ocr]
solr_1 | at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1420) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:852) ~[?:?]
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.1.5.jar:4.1.5]
solr_1 | ... 5 more
solr_1 | Caused by: org.apache.solr.common.SolrException: Could not load conf for core ocr: Error loading solr config from /var/solr/data/ocr/conf/solrconfig.xml
solr_1 | at org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:88) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1393) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:852) ~[?:?]
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.1.5.jar:4.1.5]
solr_1 | ... 5 more
solr_1 | Caused by: org.apache.solr.common.SolrException: Error loading solr config from /var/solr/data/ocr/conf/solrconfig.xml
solr_1 | at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:159) ~[?:?]
solr_1 | at org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:111) ~[?:?]
solr_1 | at org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:83) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1393) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:852) ~[?:?]
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.1.5.jar:4.1.5]
solr_1 | ... 5 more
solr_1 | Caused by: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'solrconfig.xml' in classpath or '/var/solr/data/ocr'
solr_1 | at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:388) ~[?:?]
solr_1 | at org.apache.solr.core.XmlConfigFile.(XmlConfigFile.java:124) ~[?:?]
solr_1 | at org.apache.solr.core.SolrConfig.(SolrConfig.java:175) ~[?:?]
solr_1 | at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:151) ~[?:?]
solr_1 | at org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:111) ~[?:?]
solr_1 | at org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:83) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1393) ~[?:?]
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:852) ~[?:?]
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.1.5.jar:4.1.5]
solr_1 | ... 5 more
solr_1 | 2021-06-05 16:12:47.915 INFO (main) [ ] o.e.j.s.h.ContextHandler Started o.e.j.w.WebAppContext@1cd201a8{/solr,file:///opt/solr-8.7.0/server/solr-webapp/webapp/,AVAILABLE}{/opt/solr-8.7.0/server/solr-webapp/webapp}
solr_1 | 2021-06-05 16:12:47.924 INFO (main) [ ] o.e.j.s.AbstractConnector Started ServerConnector@e84a8e1{HTTP/1.1, (http/1.1, h2c)}{0.0.0.0:8983}
solr_1 | 2021-06-05 16:12:47.924 INFO (main) [ ] o.e.j.s.Server Started @2331ms
But as soon as I run ./ingest.py the following gets logged:
solr_1 | 2021-06-05 16:07:35.730 ERROR (qtp532048323-23) [ ] o.a.s.s.HttpSolrCall null:org.apache.solr.core.SolrCoreInitializationException: SolrCore 'ocr' is not available due to init failure: Could not load conf for core ocr: Error loading solr config from /var/solr/data/ocr/conf/solrconfig.xml
solr_1 | at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1898)
solr_1 | at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1871)
solr_1 | at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:258)
solr_1 | at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
solr_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
solr_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
solr_1 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
solr_1 | at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
solr_1 | at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
solr_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
solr_1 | at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
solr_1 | at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
solr_1 | at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
solr_1 | at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
solr_1 | at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
solr_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
solr_1 | at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
solr_1 | at org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
solr_1 | at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
solr_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
solr_1 | at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
solr_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
solr_1 | at org.eclipse.jetty.server.Server.handle(Server.java:500)
solr_1 | at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
solr_1 | at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
solr_1 | at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
solr_1 | at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
solr_1 | at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
solr_1 | at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
solr_1 | at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
solr_1 | at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
solr_1 | at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
solr_1 | at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
solr_1 | at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
solr_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
solr_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
solr_1 | at java.base/java.lang.Thread.run(Unknown Source)
solr_1 | Caused by: org.apache.solr.common.SolrException: Could not load conf for core ocr: Error loading solr config from /var/solr/data/ocr/conf/solrconfig.xml
solr_1 | at org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:88)
solr_1 | at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1393)
solr_1 | at org.apache.solr.core.CoreContainer.lambda$load$14(CoreContainer.java:852)
solr_1 | at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
solr_1 | at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
solr_1 | at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
solr_1 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
solr_1 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
solr_1 | ... 1 more
solr_1 | Caused by: org.apache.solr.common.SolrException: Error loading solr config from /var/solr/data/ocr/conf/solrconfig.xml
solr_1 | at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:159)
solr_1 | at org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:111)
solr_1 | at org.apache.solr.core.ConfigSetService.loadConfigSet(ConfigSetService.java:83)
solr_1 | ... 8 more
solr_1 | Caused by: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'solrconfig.xml' in classpath or '/var/solr/data/ocr'
solr_1 | at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:388)
solr_1 | at org.apache.solr.core.XmlConfigFile.(XmlConfigFile.java:124)
solr_1 | at org.apache.solr.core.SolrConfig.(SolrConfig.java:175)
solr_1 | at org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:151)
solr_1 | ... 10 more
solr_1 |
@javiermanzano hi, I think your issue is not with the OCR highlighting plugin but with your Solr core initialization. The ocr core is not being created/started because /var/solr/data/ocr/conf/solrconfig.xml is not being found (or may be incorrect). How are you mounting the initial configuration (conf) and data inside your Docker container for Solr to create/initialize the core?
Your logs show that it initially "finds" the core, but that may be just the base folder; loading then fails because conf/* is missing. You may also have permission issues (on Linux the uid/gid that owns the mounted directories needs to be 8983, the same number as the default port).
https://github.com/docker-solr/docker-solr#running-solr-with-host-mounted-directories
If you are running this via docker-compose there is info in that GitHub readme too.
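To confirm this kind of failure without reading the full log, Solr's CoreAdmin STATUS response lists cores that failed to load under `initFailures`. Below is a minimal sketch that extracts them. The sample response is hand-written and abbreviated for illustration (not captured from a real server), and `core_init_failures` is a hypothetical helper name.

```python
import json


def core_init_failures(status_response: dict) -> dict:
    """Map core name -> failure message from a CoreAdmin STATUS response."""
    return dict(status_response.get("initFailures", {}))


# Hand-written, abbreviated example body. A real one would come from
# GET <solr-base-url>/admin/cores?action=STATUS&wt=json
sample = json.loads("""
{
  "responseHeader": {"status": 0, "QTime": 1},
  "initFailures": {
    "ocr": "org.apache.solr.common.SolrException: Could not load conf for core ocr"
  },
  "status": {}
}
""")

for core, reason in core_init_failures(sample).items():
    print(f"core {core!r} failed to initialize: {reason}")
```

An empty result means every core Solr discovered was loaded successfully.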
Here is an example Docker-compose snippet:
solr:
  container_name: your-solr
  restart: always
  image: "solr:8.8.2"
  tty: true
  ports:
    - "8983:8983"
  networks:
    - host-net
    - internal-net
  volumes:
    - ${PWD}/persistent/solrcore:/var/solr/data:cached
    - ${PWD}/persistent/solrconfig:/ocrconfig:cached
    - ${PWD}/persistent/solrlib:/opt/solr/contrib/ocrhighlight/lib:cached
  entrypoint:
    - docker-entrypoint.sh
    - solr-precreate
    - ocr
    - /ocrconfig
# see https://hub.docker.com/_/mysql/
This snippet will:
- Read the core config from ${PWD}/persistent/solrconfig and mount it as /ocrconfig inside the container
- Precreate (if it does not exist) a core named ocr from the core config found in /ocrconfig
- Store the resulting core back on the host, in ${PWD}/persistent/solrcore
- Load the solr-ocrhighlighting plugin from ${PWD}/persistent/solrlib and mount it inside /opt/solr/contrib/ocrhighlight/lib for Solr to find it.
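Since the error in the logs is about a missing conf/solrconfig.xml inside the core directory, a quick check on the host side can tell whether the mounted directory actually contains it. This is a minimal sketch. `has_solrconfig` is a hypothetical helper, and the demo uses a temporary directory rather than a real path such as persistent/solrcore/ocr from the snippet above.

```python
import tempfile
from pathlib import Path


def has_solrconfig(core_dir: Path) -> bool:
    """Return True if the core directory contains conf/solrconfig.xml."""
    return (core_dir / "conf" / "solrconfig.xml").is_file()


# Demonstration on a throwaway directory standing in for a host-mounted core.
with tempfile.TemporaryDirectory() as tmp:
    core_dir = Path(tmp) / "ocr"
    core_dir.mkdir()
    before = has_solrconfig(core_dir)  # base folder exists, conf/ is missing

    (core_dir / "conf").mkdir()
    (core_dir / "conf" / "solrconfig.xml").write_text("<config/>")
    after = has_solrconfig(core_dir)

print(before, after)
```

The first case mirrors the situation described in the logs: the core folder exists, so Solr lists the core, but loading fails because the configuration file is absent.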
What @DiegoPino said is correct, the 500 is caused by a missing Solr configuration. Are you running the example as described in the README?
Maybe try these steps to tear down any existing containers and rebuild them from scratch:
$ cd ./example
$ docker-compose down -v
$ docker-compose up --build --force-recreate
Let me know if this helps!
Closing this for inactivity, will reopen when there have been updates.