An annotation tool for grounding of formulae

MioGatto: Math Identifier-oriented Grounding Annotation Tool

System requirements

  • Python 3 (3.5 or later)
  • A web browser with MathML support (for the GUI annotation system)

Installation

All dependencies can be installed in one step:

$ python -m pip install -r requirements.txt

If you prefer not to install the dependencies system-wide, consider using venv to create an isolated environment first.

Project structure

Files in this repository

All the components of MioGatto are included in this repository:

  • lib/ contains the project library.
  • server/ contains the implementation of the server.
  • client/ contains the implementation of the client.
  • tools/ contains our utility Python scripts.

Files not in this repository

The annotation data, however, is not included in this repository due to the NDA constraints on the arXMLiv dataset. The data is licensed to SIGMathLing members as Dataset for Grounding of Formulae. Please consider joining SIGMathLing to acquire the dataset.

  • arxmliv/ contains the original documents from the arXMLiv dataset
  • data/ contains the annotation data
  • sources/ contains the preprocessed documents

Annotator's guide

For a guide with GIF animations, please refer to our Wiki.

Using tools

The Python scripts under the tools directory are intended mostly for developers of this dataset. The --help (-h) option is available for all scripts. Detailed documentation has not yet been prepared.

Preprocess

To show the basic usage, run:

$ python -m tools.preprocess -h

Analysing the annotation results

For basic analyses of the annotation data, execute:

$ python -m tools.analyzer <paper id>

Some supplemental files, including graph images, will be saved in the results directory by default.

To calculate the agreement between the data from two annotators, execute:

$ python -m tools.agreement --target=<path to annotator's data dir> <paper id>
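
For readers unfamiliar with inter-annotator agreement, the following is a minimal sketch of one common measure, Cohen's kappa. It is not the implementation in tools.agreement, and this README does not state which measure that script reports. The function name cohen_kappa and the label values such as "c1" are hypothetical, chosen only for illustration; each list is assumed to hold one label per annotated item, in the same order for both annotators.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels.

    Illustrative only: this is not the code used by tools.agreement.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one item is required")

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # annotator chose independently according to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    # Both annotators used one and the same label for every item.
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four items from two annotators.
annotator_1 = ["c1", "c1", "c2", "c2"]
annotator_2 = ["c1", "c1", "c2", "c1"]
print(cohen_kappa(annotator_1, annotator_2))
```

A value of 1 indicates perfect agreement, while 0 indicates agreement no better than chance, so the raw fraction of matching labels (0.75 in the example) overstates how well the two annotators agree.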

Developing client

The client is written in TypeScript. Install all development tools with:

$ cd client
$ npm install

To compile the client source client/index.ts, execute the following in the client directory:

$ npx tsc

Publications

  • Takuto Asakura, Yusuke Miyao, Akiko Aizawa, Michael Kohlhase. MioGatto: A Math Identifier-oriented Grounding Annotation Tool. In 13th MathUI Workshop at 14th Conference on Intelligent Computer Mathematics (MathUI 2021).
    [preprint] [paper] [slides]
  • Takuto Asakura, André Greiner-Petter, Akiko Aizawa, Yusuke Miyao. Towards Grounding of Formulae. In Proceedings of First Workshop on Scholarly Document Processing (SDP 2020). pp. 138–147, 2020.
    [paper] [bib] [poster] [resource]
  • Takuto Asakura, André Greiner-Petter, Akiko Aizawa, Yusuke Miyao. Dataset Creation for Grounding of Formulae. In SCIDOCA 2020.
    [slides] [resource]

License

Copyright 2021 Takuto ASAKURA (wtsnjp)

This software is licensed under the MIT license.

Third-party software


Takuto ASAKURA (wtsnjp)