/annotation-platform

An annotation platform for NLP using open-source software

Primary language: Python. License: Apache-2.0.

Annotation platform for data-centric NLP


The Annotation Platform is an open-source project that aims to provide a scalable and flexible annotation platform, using Argilla as the annotation layer. The core idea is to offer a simple, intuitive user interface for annotating any dataset efficiently.

The platform has two APIs, one for the ingestion layer and one for the serving layer, to simplify integration with other applications. The ingestion API is used to upload data and annotations to the platform, while the serving API is used to retrieve annotations for processed data.

The goal is for the Annotation Platform to be highly scalable and easy to maintain. Please note that it is still in development, and there may be limitations or issues as we continue to refine it in upcoming versions.

The rough sketch for v1 is in the following figure:

[Figure: rough sketch of the v1 platform]

Ingestion API

This API simplifies posting text to the Argilla server and is built with the FastAPI framework. It also includes an HTML interface, making it easy to post text and get started with the annotation platform.

Serving API

This API provides a fast and efficient way to access the results of the annotation platform and is also built with FastAPI.
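To make the split between the two APIs concrete, here is a minimal client-side sketch using only the Python standard library. It only builds the HTTP requests and does not send them. The base URLs, the `/records` and `/annotations` routes, and the JSON and query field names are all assumptions made for illustration; check the actual FastAPI routes in the repository before relying on any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URLs; the real hosts and ports depend on how you run the platform.
INGESTION_URL = "http://localhost:8000"
SERVING_URL = "http://localhost:8001"


def build_ingestion_request(text: str, dataset: str) -> Request:
    """Build a POST request that uploads one text record.

    The JSON schema (text, dataset) and the /records route are assumptions.
    """
    body = json.dumps({"text": text, "dataset": dataset}).encode("utf-8")
    return Request(
        f"{INGESTION_URL}/records",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_serving_request(dataset: str, limit: int = 10) -> Request:
    """Build a GET request that asks for annotations of a dataset.

    The /annotations route and its query parameters are assumptions.
    """
    query = urlencode({"dataset": dataset, "limit": limit})
    return Request(f"{SERVING_URL}/annotations?{query}", method="GET")
```

Once the platform is running, either request can be passed to `urllib.request.urlopen` to actually send it.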

Argilla

I chose Argilla because I really believe in the project and want to see whether I run into any limitations. Other projects, such as Doccano, could also be interesting tools to use. One limitation I found is that, before version 1.6.0, users had to be defined in a YAML file, which was pretty cumbersome. This was fixed in v1.6.0, which stores users in a database and provides an API to manage them.

Running the project

You can run the project using Make and Docker, or take a look at the Makefiles and run the commands yourself. To start the whole platform, use:

make start-annotation-platform

Ingestion API

To start the ingestion API server, run:

make build-ingestion-api-docker
make start-ingestion-api

Serving API

To start the serving API server, run:

make build-serving-api-docker
make start-serving-api

Argilla

To start the Argilla server, run:

make start-argilla
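
If you prefer to drive these commands from a script, the sketch below wraps the Make targets listed in this section. The target names are taken from this README; the component keys (`platform`, `ingestion-api`, `serving-api`, `argilla`) and the helper functions are hypothetical names chosen for illustration, not part of the project.

```python
import subprocess
from typing import List

# Make targets copied from this README, grouped under hypothetical component keys.
MAKE_TARGETS = {
    "platform": ["start-annotation-platform"],
    "ingestion-api": ["build-ingestion-api-docker", "start-ingestion-api"],
    "serving-api": ["build-serving-api-docker", "start-serving-api"],
    "argilla": ["start-argilla"],
}


def make_commands(component: str) -> List[List[str]]:
    """Return the make invocations for a component, in the order they should run."""
    if component not in MAKE_TARGETS:
        raise ValueError(f"unknown component: {component!r}")
    return [["make", target] for target in MAKE_TARGETS[component]]


def start(component: str, dry_run: bool = False) -> None:
    """Run the make targets for a component, or only print them if dry_run is set."""
    for cmd in make_commands(component):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing target.
            subprocess.run(cmd, check=True)
```

Calling `start("ingestion-api", dry_run=True)` prints the two commands without executing them, which is a quick way to confirm the order before running them from the repository root.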