Toloka-Kit

Toloka website | Documentation | Issue tracker

Toloka-Kit is a Python library for working with Toloka API.

The API allows you to build scalable and fully automated human-in-the-loop ML pipelines, and integrate them into your processes. The toolkit makes integration easier. You can use it with Jupyter notebooks.

  • Support for all common Toloka use cases: creating projects, adding pools, uploading tasks, and so on.
  • Toloka entities are represented as Python classes. You can use them instead of accessing the API using JSON representations.
  • There’s no need to validate JSON files or work with them directly.
  • Support for both synchronous and asynchronous (via async/await) execution.
  • Streaming support: build complex pipelines that send and receive data in real time. For example, you can pass data between two related projects: one for data labeling, and another for its validation.
  • An AutoQuality feature that automatically finds the best-fitting quality control rules for your project.

Prerequisites

Before you begin, make sure that:

Get Started

  1. Install the Toloka-Kit package. Run the following command in the command shell:
$ pip install toloka-kit

For production environments, specify the exact package version. For the latest stable version, check the project page at pypi.org.

Note: Starting with the v1.0.0 release, only the core version of the package is installed by default. See the Optional dependencies section for details.

If you are just starting to use Toloka-Kit, the core package is enough. Our docs explicitly state which features require other packages, so you can install them later if you need them.

  2. Check access to the API with the following Python script. The script imports the package, prompts you for your OAuth token, and requests general information about your account.
import toloka.client as toloka
from getpass import getpass


# Choose where to send requests: the sandbox or the production version of Toloka.
# Keep exactly one of the following two lines uncommented.
target = 'SANDBOX'
# target = 'PRODUCTION'

toloka_client = toloka.TolokaClient(getpass("Enter your token:"), target)
print(toloka_client.get_requester())

If the script runs without raising any errors or exceptions, your token and connection are working correctly.

  3. Follow our Learn the basics tutorial to learn how to work with the Toloka API using Toloka-Kit.
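If you run the access check above from your own scripts, it can help to validate the target name before creating the client. The following is a minimal sketch, not part of Toloka-Kit: `resolve_target` and `check_access` are hypothetical helper names, and the only library calls used are the ones from the script above (`toloka.TolokaClient` and `get_requester`).

```python
VALID_TARGETS = ("SANDBOX", "PRODUCTION")


def resolve_target(raw: str = "") -> str:
    """Normalize a target name; an empty value falls back to the sandbox."""
    target = raw.strip().upper() or "SANDBOX"
    if target not in VALID_TARGETS:
        raise ValueError(f"Unknown target {raw!r}; expected one of {VALID_TARGETS}")
    return target


def check_access(token: str, target: str):
    """Create a client for the given target and request account information."""
    # Imported lazily so that resolve_target() works even without toloka-kit installed.
    import toloka.client as toloka

    toloka_client = toloka.TolokaClient(token, resolve_target(target))
    return toloka_client.get_requester()
```

For example, `print(check_access(getpass("Enter your token:"), "sandbox"))` performs the same check as the script above, while a mistyped target such as `"staging"` fails early with a `ValueError` instead of reaching the API.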

Optional dependencies

Run this command to install toloka-kit with all additional dependencies:

$ pip install "toloka-kit[all]"

To install specific dependencies, run:

$ pip install "toloka-kit[pandas,autoquality,s3,zookeeper,jupyter-metrics]" # remove the extras you don't need from the list
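For production environments, the advice to pin an exact version can be combined with the extras syntax. The sketch below uses only the Python standard library. `installed_version` and `pinned_requirement` are hypothetical helper names, not part of Toloka-Kit, and the extras passed in the example are just a subset of the list above. It assumes you want to pin whatever version is currently installed in your environment.

```python
from importlib import metadata
from typing import Optional, Sequence


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def pinned_requirement(
    dist_name: str,
    extras: Sequence[str] = (),
    version: Optional[str] = None,
) -> str:
    """Build a requirement string such as name[extra1,extra2]==version."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    version_part = f"=={version}" if version else ""
    return f"{dist_name}{extras_part}{version_part}"


if __name__ == "__main__":
    # Pin the version found in the current environment; no pin is added if none is installed.
    current = installed_version("toloka-kit")
    print(pinned_requirement("toloka-kit", ["pandas", "s3"], current))
```

The printed line can be pasted into a requirements file or passed to pip in quotes, so that the shell does not interpret the square brackets.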

Usage examples

The Toloka-Kit usage examples are tutorials for specific data labeling tasks. They demonstrate how to work with the Toloka API using Toloka-Kit.

Documentation

Support

Contributing

Feel free to contribute to toloka-kit. Right now, we need more usage examples.

License

© Copyright 2023 Toloka team authors. Licensed under the terms of the Apache License, Version 2.0. See LICENSE for more details.