Anduin is a lightweight tool for processing RDF/N-Quads and RDF/N-Triples data on Hadoop. It is written in Scala and built atop Scalding, a Scala library from Twitter for writing Hadoop MapReduce jobs.
- Support for the RDF/N-Quads and RDF/N-Triples formats
- Tolerant of ill-formed RDF data
- Gathering entity type statistics
- Building adjacency matrices
- Aggregating entity descriptions (e.g. for entity search)
Blank nodes are not supported at the moment.
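To make the features above more concrete, here is a minimal sketch in plain Scala of what tolerant N-Triples parsing followed by entity type counting could look like. It is not Anduin's actual code and does not use Scalding: the object name `NTriplesSketch`, the simplified regular expression, and the choice to silently skip ill-formed lines and blank nodes are all assumptions made for illustration.

```scala
// Illustrative sketch only; names and behavior are assumptions, not Anduin's API.
object NTriplesSketch {
  // Simplified pattern: <subject> <predicate> (<IRI> | "literal" plus optional suffix) .
  // It does not cover the full N-Triples grammar and rejects blank nodes (_:x).
  private val Triple =
    """^\s*<([^>]*)>\s+<([^>]*)>\s+(<[^>]*>|"(?:[^"\\]|\\.)*"\S*)\s*\.\s*$""".r

  val RdfType = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

  /** Returns Some((subject, predicate, object)), or None for a line that does not match. */
  def parse(line: String): Option[(String, String, String)] = line match {
    case Triple(s, p, o) => Some((s, p, o))
    case _               => None // tolerate bad input by skipping the line
  }

  /** Counts how many rdf:type statements point at each type IRI. */
  def typeCounts(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(l => parse(l))
      .collect {
        // Keep only rdf:type triples whose object is an IRI; strip the angle brackets.
        case (_, RdfType, o) if o.startsWith("<") => o.substring(1, o.length - 1)
      }
      .groupBy(identity)
      .map { case (t, occurrences) => (t, occurrences.size) }
}
```

Calling `NTriplesSketch.typeCounts` on a handful of lines yields a map from type IRI to count, with any line the pattern rejects left out of the result. In a real Hadoop job the same per-line parse and group-and-count steps would run as distributed operations rather than on an in-memory `Seq`.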
- Java 1.6+
- Scala 2.9.2+
- Tested on Apache Hadoop 1.1 as well as Amazon Web Services Elastic MapReduce
Have a question or a suggestion? Please join our mailing list.
Anduin has been developed by Nikita Zhiltsov. To add new functionality or fix existing bugs, feel free to contribute patches via pull requests against the develop branch.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0