esgf-dashboard-ip not starting
nathanlcarlson opened this issue · 9 comments
LD_LIBRARY_PATH
Running
/usr/local/esgf-dashboard-ip/bin/ip.service start
reports the following error:
/usr/local/esgf-dashboard-ip/bin/esgf-dashboard-ip: error while loading shared libraries: liblzma.so.5: cannot open shared object file: No such file or directory
liblzma.so.5 is intended to be found in /usr/local/conda/envs/esgf-pub/lib, but that directory is not included in the LD_LIBRARY_PATH set in /etc/esg.env. ip.service has the following line.
[ -e /etc/esg.env ] && $(grep LD_LIBRARY_PATH /etc/esg.env)
This overwrites any manual attempt to specify the LD_LIBRARY_PATH with the specification in /etc/esg.env.
This environment variable should include the conda env lib path above.
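As a rough illustration of the fix described above, here is a minimal Python sketch that adds the conda lib directory to an existing LD_LIBRARY_PATH value instead of replacing it. The helper name `with_library_dir` is hypothetical and is not part of ip.service or the installer; only the directory path comes from this issue.

```python
import os

# Directory quoted in this issue as the intended location of liblzma.so.5.
CONDA_LIB = "/usr/local/conda/envs/esgf-pub/lib"


def with_library_dir(current, extra_dir):
    """Return an LD_LIBRARY_PATH value that contains extra_dir exactly once.

    Existing entries keep their order; extra_dir is appended only if absent.
    (Hypothetical helper, for illustration only.)
    """
    entries = [p for p in (current or "").split(":") if p]
    if extra_dir not in entries:
        entries.append(extra_dir)
    return ":".join(entries)


if __name__ == "__main__":
    # Print the value that /etc/esg.env could carry for this host.
    print(with_library_dir(os.environ.get("LD_LIBRARY_PATH"), CONDA_LIB))
```

The point of the sketch is only that the value is extended rather than overwritten, so any paths already configured keep working.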
Missing Properties
Additionally, after manually resolving the above error, the following error is reported by ip.service start.
[START] ESGF_HOME attribute not found... setting /esg as default
[START] ESGF_HOME = [/esg/]
[START] //esg//config/esgf.properties
[START] Debug level 2 (1=ERROR, 2=WARNING, 3=DEBUG)
***************************************************
[ERROR][esgf-dashboard-ip.c][567] Please note that 9 DB properties are missing in the esgf.properties file. Please check! Exit
[ERROR][esgf-dashboard-ip.c][568] Mandatory properties are:
[ERROR][esgf-dashboard-ip.c][569] [db.host], [db.database], [db.port], [db.user]
[ERROR][esgf-dashboard-ip.c][570] [esgf.host]
[ERROR][esgf-dashboard-ip.c][572] [esgf.registration.xml.path]
[ERROR][esgf-dashboard-ip.c][573] [esgf.registration.xml.download.url]
[ERROR][esgf-dashboard-ip.c][574] [dashboard.ip.app.home]
[ERROR][esgf-dashboard-ip.c][575] Please check!!!
My initial thought is that the message "9 DB properties are missing" is misleading. Looking in /esg/config/esgf.properties, it looks like only the following are missing:
esgf.registration.xml.path
esgf.registration.xml.download.url
dashboard.ip.app.home
I do not know what these values should be.
PR #616 has been created to begin addressing these issues.
The message "9 DB properties are missing" is actually accurate. After looking through the source I found the /esg/config/esgf.properties parser starting on line 915 (the link should bring you right there). After reading through it, I deduced that it is not finding the properties because of the whitespace around the equals sign. Like so
sample.property = value
It will find the property and value if it is instead
sample.property=value
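To make the difference concrete, here is a small Python sketch that reports which keys a strict `key=value` match would find compared with a whitespace-tolerant one. It assumes the C parser only matches the exact form `key=value`, which is deduced in this thread rather than confirmed; the function name and the sample values (`localhost`, `5432`) are made up for illustration.

```python
def find_properties(text, keys, tolerate_spaces):
    """Return {key: value} for each key in `keys` found in properties text.

    With tolerate_spaces=False a name only matches when no whitespace
    surrounds '=', which is the ASSUMED behaviour of the C parser.
    """
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if tolerate_spaces:
            name, value = name.strip(), value.strip()
        if name in keys:
            found[name] = value
    return found


# Illustrative input: one property written with spaces, one without.
sample = "db.host = localhost\ndb.port=5432\n"
keys = ["db.host", "db.port"]
strict = find_properties(sample, keys, tolerate_spaces=False)
loose = find_properties(sample, keys, tolerate_spaces=True)
print("missing (strict):", sorted(set(keys) - set(strict)))
print("missing (tolerant):", sorted(set(keys) - set(loose)))
```

Run against a copy of esgf.properties with the mandatory keys from the error message, a check like this would show whether the "missing" properties are really absent or just written with spaces.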
A quick solution to this, as we are using Python's ConfigParser, would be to set the space_around_delimiters flag to False. Unfortunately, this flag was only introduced in Python 3. I am going to look at the ESGF homegrown config parser, which is based on Python's ConfigParser.
Fortunately, the new features of configparser in Python 3.6+ have been backported. See https://pypi.org/project/configparser/ . Also, it is already being installed, likely as a dependency of configobj.
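For reference, a minimal sketch of the space_around_delimiters flag using Python 3's standard configparser. The section name `example` is hypothetical, and whether the backport behaves identically under Python 2.7 is an assumption that has not been verified here.

```python
import configparser
import io

parser = configparser.ConfigParser()
parser.optionxform = str  # keep option names exactly as written
parser["example"] = {"sample.property": "value"}  # hypothetical section

# Write without spaces around '=' so a strict reader sees key=value.
buffer = io.StringIO()
parser.write(buffer, space_around_delimiters=False)
output = buffer.getvalue()
print(output)
```

With the default (space_around_delimiters=True) the same call would write `sample.property = value`, which is the form the dashboard parser appears not to accept.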
esgf-dashboard-ip has a number of issues after successfully running /usr/local/esgf-dashboard-ip/bin/ip.service start with httpd started.
[WARNING][esgf-dashboard-ip.c][589] Please note that 11 non-mandatory properties are missing in the esgf.properties file. Default values have been loaded
./.work/shards.xml:1: parser error : Space required after the Public Identifier
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
./.work/shards.xml:1: parser error : SystemLiteral " or ' expected
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
./.work/shards.xml:1: parser error : SYSTEM or PUBLIC, the URI is missing
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
[xmlreader.c:795] Error: could not parse file ./.work/shards.xml
START PLANA
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
NOTICE: table "cmip5_data_usage_continent_tmp" does not exist, skipping
[WARNING][dbAccess.c][291] Query submission failed [drop table if exists esgf_dashboard.cmip5_data_usage_continent_tmp; create table esgf_dashboard.cmip5_data_usage_continent_tmp as (SELECT EXTRACT (YEAR FROM (TIMESTAMP WITH TIME ZONE 'epoch' + fixed_log.date_fetched * INTERVAL '1 second')) AS year, EXTRACT (MONTH FROM (TIMESTAMP WITH TIME ZONE 'epoch' + fixed_log.date_fetched * INTERVAL '1 second')) AS month, COUNT(*) AS downloads, COUNT(distinct url) AS files, COUNT(distinct user_id_hash) AS users, SUM(fixed_log.size)/1024/1024/1024 AS gb, fixed_log.continent FROM (SELECT cl.continent, file.url, log.user_id_hash, max(log.timestamp) AS date_fetched, max(file.size) AS size FROM esgf_dashboard.dashboard_queue AS log JOIN esgf_dashboard.client_stats_dm AS cl ON (log.remote_addr=cl.ip) JOIN public.file_version AS file ON (UPPER(log.url_path) LIKE '%CMIP5%.NC' AND log.url_path=file.url) WHERE log.success AND log.duration > 1000 GROUP BY file.url, log.user_id_hash, cl.continent) AS fixed_log GROUP BY year, month, continent ORDER BY year, month);]
The error messages go on indefinitely and cannot be stopped with ctrl+c.
This error is the result of there being no user_id column in the specified table.
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
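One way to confirm a missing column before starting the service is to ask the database which columns the table actually has. The sketch below assumes PostgreSQL and a DB-API cursor that uses the `%s` parameter style (psycopg2, for example, which is not imported here); the function name `missing_columns` is hypothetical.

```python
def missing_columns(cursor, schema, table, required):
    """Return the names in `required` that the table does not have.

    `cursor` is any DB-API cursor using the %s parameter style
    (psycopg2 is assumed; this sketch does not open a connection).
    """
    cursor.execute(
        "SELECT column_name FROM information_schema.columns "
        "WHERE table_schema = %s AND table_name = %s",
        (schema, table),
    )
    present = {row[0] for row in cursor.fetchall()}
    return [name for name in required if name not in present]
```

Called as `missing_columns(cur, "esgf_dashboard", "dashboard_queue", ["url_path", "id", "user_id"])`, it would return `["user_id"]` for the situation described here.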
This is a bit overwhelming and it may be a good idea to take a step back and re-evaluate this component. It has already consumed a large amount of time to get it to this point. It will likely require a significant amount of time still, based on these new error messages.
It certainly does not fit in, being written in C even though it seemingly performs very common tasks that existing Python components already perform (db interactions, http requests, parsing xml, parsing config files, reporting information).
It seems there was a version mismatch between the schema defined in esgf-dashboard-ip and that created by the esgf_dashboard_initialize script. This was resolved by pulling the correct version from the mirror. It is important to note that the schema migration .egg files had the same name and were labeled the same version, but had different content.
https://aims1.llnl.gov/esgf/dist/2.6/8/esgf-dashboard/esgf_dashboard-0.0.2-py2.7.egg
!=
https://aims1.llnl.gov/esgf/dist/esgf-dashboard/esgf_dashboard-0.0.2-py2.7.egg
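Since the file name and version label cannot tell these two eggs apart, comparing checksums of the downloaded files is one way to catch the mismatch. This is a minimal sketch using only the standard library; the function names are hypothetical and the two files are assumed to have been downloaded to local paths first.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical bytes (compared by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If `same_content` returns False for two files that share a name and version, the installer is picking up a different build than expected.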
#634 notes some behavior of the esgf-dashboard-ip service that appears to be an issue, but is not. The details of this non-issue are included in that ticket. As #634 explains, the esgf-dashboard-ip service operates daily; that is, it spends most of its time sleeping (literally sleep(24 hours)). A test was performed over the weekend to see if the data would update, and it did. Unless something else comes up, this is enough evidence to say that the esgf-dashboard-ip service is in working order.