/mabxml-elasticsearch

Raw catalog data exposed via a web API

Primary LanguageJava

About

Transform MAB-XML to JSON for Elasticsearch indexing with Metafacture.

Build

Prerequisites: Java 8, Maven 3; verify with mvn -version

Create and change into a folder where you want to store the projects:

  • mkdir ~/git ; cd ~/git

Build the hbz metafacture-core fork:

  • git clone https://github.com/hbz/metafacture-core.git
  • cd metafacture-core
  • mvn clean install -DskipTests
  • cd ..

Build mabxml-elasticsearch:

  • git clone https://github.com/hbz/mabxml-elasticsearch.git
  • cd mabxml-elasticsearch
  • mvn clean install

See the .travis.yml file for details on the CI config used by Travis.

Data

Download the sample input data:

  • cd src/main/resources/input/
  • wget http://test.lobid.org/assets/data/20140817_20140818.tar.bz2

Go to the project root and run the transformation:

  • cd ../../../../ ; mvn exec:java -Dexec.mainClass="flow.Transform"

This will index the data in an hbz-internal Elasticsearch instance (only works hbz internally).

You can then GET a specific record by hbz ID:

  • curl -XGET 'http://quaoar2.hbz-nrw.de:9200/hbz01/mabxml/HT012786619'; echo

You can also exclude the Elasticsearch metadata:

  • curl -XGET 'http://quaoar2.hbz-nrw.de:9200/hbz01/mabxml/HT012786619/_source'; echo

For details on the various options see the GET API documentation.

License

Eclipse Public License: http://www.eclipse.org/legal/epl-v10.html