Transform MAB-XML to JSON for Elasticsearch indexing with Metafacture.
Prerequisites: Java 8 and Maven 3. Verify both with mvn -version, which prints the Maven version and the Java version it runs on.
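If you want to confirm that both tools are on your PATH before building, a minimal sketch along these lines may help. The helper name check_tool is made up for this example and is not part of the project; it only checks that the commands exist, not their versions.

```shell
# check_tool is a hypothetical helper for illustration, not part of this project.
# It reports whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check the two prerequisites named above.
for tool in java mvn; do
  check_tool "$tool" || echo "Install $tool before continuing."
done
```

Run mvn -version afterwards to check the actual versions.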
Create and change into a folder where you want to store the projects:
mkdir ~/git ; cd ~/git
Build the hbz metafacture-core fork:
git clone https://github.com/hbz/metafacture-core.git
cd metafacture-core
mvn clean install -DskipTests
cd ..
Build mabxml-elasticsearch:
git clone https://github.com/hbz/mabxml-elasticsearch.git
cd mabxml-elasticsearch
mvn clean install
See the .travis.yml file for details on the CI config used by Travis.
Download the sample input data:
cd src/main/resources/input/
wget http://test.lobid.org/assets/data/20140817_20140818.tar.bz2
Go to the project root and run the transformation:
cd ../../../../ ; mvn exec:java -Dexec.mainClass="flow.Transform"
This indexes the data into an hbz-internal Elasticsearch instance, so this step only works from within the hbz network.
You can then GET a specific record by hbz ID:
curl -XGET 'http://quaoar2.hbz-nrw.de:9200/hbz01/mabxml/HT012786619'; echo
To get only the record itself, without the Elasticsearch metadata, append /_source:
curl -XGET 'http://quaoar2.hbz-nrw.de:9200/hbz01/mabxml/HT012786619/_source'; echo
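If you look up several records, it can be handy to build these URLs with a small shell function. The following is only a sketch: the function name record_url and the ES_BASE override variable are assumptions made for this example, while the default host, index, and type are taken from the curl commands above.

```shell
# record_url and ES_BASE are hypothetical names used for illustration only.
# Usage: record_url <hbz-id> [suffix]
# Prints the URL for a record, optionally with a suffix such as _source.
record_url() {
  id="$1"
  suffix="${2:-}"
  # Default base URL copied from the examples above; set ES_BASE to override it.
  base="${ES_BASE:-http://quaoar2.hbz-nrw.de:9200/hbz01/mabxml}"
  echo "${base}/${id}${suffix:+/$suffix}"
}

# Example calls (the requests themselves only work from within the hbz network):
# curl -XGET "$(record_url HT012786619)"; echo
# curl -XGET "$(record_url HT012786619 _source)"; echo
```

Keeping the base URL in one place makes it easier to point the same commands at a different Elasticsearch instance.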
For details on the various options, see the Elasticsearch GET API documentation.
Eclipse Public License: http://www.eclipse.org/legal/epl-v10.html