Data server for https://data.josqu.in
This repository contains the files for the https://josqu.in website that serves score data files for the Josquin Research Project.
The primary data is stored in Humdrum files. These are initially
linked (or less preferably copied) to the ./kern
directory). The
./Makefile
contains a list of the source location for all Humdrum files to be
added to the ./kern
directory. Adjust the KERNREPOS
variable
in the Makefile to point to directories that contain source Humdrum
files that will be managed by the server.
Then the command:
make kern
Creates symbolic links from the source locations into the ./kern
directory. Once links to Humdrum files are in the ./kern
directory,
the caching process can begin.
This data server is designed to manage digital scores for the Josquin Research Project.
Project | Repository |
---|---|
jrp-scores | https://github.com/josquin-research-project/jrp-scores |
The ./cache
directory stores a copy of the Humdrum data as well as
derivative formats and analyses created from each file. An MD5 checksum
is calculated for each Humdrum file to create an 8-digit ID for that file
to uniquely identify the contents. The Humdrum file is then copied to a
subdirectory named with that ID. All translations and analysis files
related to the Humdrum file are also placed in the same subdirectory.
First, create an index that maps file IDs, century enumerations and SQL enumerations that link to the MD5-derived cache ID:
cd cache
make index
This will read all of the files in ./kern
to create the index
file ./cache/index-new.hmd
.
Next, copy the Humdrum files into the cache with this command:
make copy-kern
This will insert any new Humdrum files into the cache that are not already in the cache.
Next, create derivative files (data translations and analyses) with the following command:
make derivatives
This will create data conversion and pre-compiled analyses for each new Humdrum file.
After derivatives have been created, the new version of the database can be activated by typing the command:
make publish
This will move ./cache/index.hmd
to ./cache/index-old.hmd
and then move
./cache/index-new.hmd
to ./cache/index.hmd
. The file ./cache/index.hmd
is used to map various file identifier systems to the cached version of the
Humdrum file.
Optionally, run the command:
make purge
to remove older and deleted versions of the Humdrum files from the cache
system. These files will be placed in the ./cache/purge
folder for review.
There is also a command to do all of the above steps at one time (except purging) in the base directory:
make update
Or to run the update process in the backgroud using nohup
:
make un
Data and analysis files stored in the cache directory can be accessed on the web via the following example URLs. The primary data server is at https://data.josqu.in .
- https://data.josqu.in/Jos2721 (primary server)
- Return Humdrum data for JRP ID `Jos2721`.
- https://data.josqu.in/Jos2721.krn
- Explicitly request Humdrum data (default behavior if no data format specified).
- https://data.josqu.in/Jos2721format=krn
- Verbose request for Humdrum data using URL parameter.
- https://data.josqu.in/Jos2721?format=kern
- Alternate verbose request for Humdrum data using URL parameter.
- https://data.josqu.in/Jos2721.mei
- Request MEI conversion of file.
- https://data.josqu.in/Jos2721.musicxml
- Request MusicXML conversion of file.
- https://data.josqu.in/Jos2721.xml
- Alternate request MusicXML conversion of file.
- https://data.josqu.in/Jos2721-La_Bernardina.krn
- Full filename can be given (but from and after first `_` in filename will be ignored internally).
- https://data.josqu.in/Jos2721-La_Bernardina.musicxml
- Full filename access to MusicXML conversion.
- https://data.josqu.in/random
- Get a random Humdrum file.
- https://data.josqu.in/random.musicxml
- Get a random MusicXML conversion.
See also https://bit.ly/jrp-maintenance
An example Apache web server configuration is given in cgi-bin/apache.config. The important part of the configuration is:
RewriteEngine On
RewriteRule ^/([^?]*\?(.*))$ /cgi-bin/jrp-data-server?id=$1&$2 [NC,PT,QSA]
RewriteRule ^/([^?]*)$ /cgi-bin/jrp-data-server?id=$1 [NC,PT,QSA]
Header add Access-Control-Allow-Origin "*"
The Header
line is important in order to allow cross-origin access to
the data files.
The rewrite rules are used to simplify the URLs for data access. Access to the data appears as if it were a static file, but the server converts this filename into an id parameter that is passed on to the jrp-data-server CGI script.
The interface between the URL and internal access to data is done with the CGI script cgi-bin/jrp-data-server.pl. Copy this file (via cgi-bin/Makefile) to the location for CGI scripts for the server.
Here is a description of support software needed to create derivatives files for the cache.
Install verovio on the server with these commands:
git clone https://github.com/rism-digital/verovio
cd verovio/tools
./.configure
make
make install
Note that cmake
is required (and must first be installed if not available).
Verify that verovio was installed by running the command:
which verovio
which should reply /usr/local/bin/verovio
.
The aton2json program can be installed with these commands:
npm install -g posix-argv-parser
npm install -g aton2json
wget https://raw.githubusercontent.com/craigsapp/ATON/master/example/cli/aton2json -O /usr/local/bin/aton2json
Verify that aton2json
was installed by running the command:
which aton2json
which should reply /usr/local/bin/aton2json
To allow CGI scripts to run programs in a shell when SELinux, turn on the general permission with the command:
setsebool -P httpd_execmem 1
Then give permissions to run a particular program:
chcon system_u:object_r:httpd_exec_t:s0 program-file
For data access logging, the log directory should be given SELinux permissions:
chcon -R -t httpd_sys_content_t logs
And the web server should be made owner of the log directory.