Baikal_Dataset

Dataset of Baikal plankton.

License: Creative Commons Zero v1.0 Universal (CC0-1.0)

Baikal plankton image dataset

The dataset contains microscope images of various plankton species from Lake Baikal.

A Medium article describes our work and how the dataset was obtained.

All images were captured with a microscope camera using the same background and backlight.

We provide polygonal labels for object detection, segmentation, and classification tasks. Currently, the dataset covers 72 different species, plus a category other for image artefacts, background, etc.

Data format description

The file data.json contains a list of the objects found in the images. Each object has the following fields:

  • image is a direct download link to the source image
  • category is the object class
  • points is a list of polygon vertices; each vertex is a dict with x and y coordinates, and each coordinate lies in the range [0, 1], measured from the left (x) and top (y) edges of the image
{
  "image": "https://storage.yandexcloud.net/baikal-samples/images/d30543d7-4881-4160-8c25-f014dddeffa4-New2/d07053cd-77ef-4410-8b4f-27607b24977f-21-09-14-12-57-27.jpg", 
  "category": "cyclotella minuta", 
  "points": [
    {"x": 0.3097988874625588, "y": 0.9425188988731992},
    {"x": 0.3662815575524176, "y": 0.9425188988731992}, 
    {"x": 0.3662815575524176, "y": 0.9904435886464127}, 
    {"x": 0.3097988874625588, "y": 0.9904435886464127}
  ]
}
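
To work with the labels in pixel space, the normalized polygon has to be scaled by the image size. The following is a minimal sketch, not part of the dataset tooling. It assumes that data.json is a single top-level JSON array, that x is a fraction of the image width and y a fraction of the image height, and that the image size is known after download. The helper names (load_records, polygon_to_pixels, bounding_box), the sample record, and the 2000x1000 image size are all hypothetical.

```python
import json


def load_records(path="data.json"):
    """Read the annotation list; assumes the file is one top-level JSON array."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def polygon_to_pixels(points, width, height):
    """Scale normalized vertices to pixel coordinates.

    Assumes x is a fraction of the image width (from the left edge)
    and y is a fraction of the image height (from the top edge).
    """
    return [(p["x"] * width, p["y"] * height) for p in points]


def bounding_box(pixel_points):
    """Return (x_min, y_min, x_max, y_max) enclosing the polygon."""
    xs = [x for x, _ in pixel_points]
    ys = [y for _, y in pixel_points]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up record in the same shape as the example above (not real data).
sample = {
    "image": "https://example.com/sample.jpg",  # hypothetical URL
    "category": "cyclotella minuta",
    "points": [
        {"x": 0.25, "y": 0.5},
        {"x": 0.5, "y": 0.5},
        {"x": 0.5, "y": 0.75},
        {"x": 0.25, "y": 0.75},
    ],
}

# Hypothetical image size; in practice, read it from the downloaded image.
pixels = polygon_to_pixels(sample["points"], width=2000, height=1000)
box = bounding_box(pixels)
print(sample["category"], box)
```

The bounding box may be useful for detection tasks, while the scaled polygon can be rasterized into a mask for segmentation. Calling load_records() on the real file would return the full list of such records, under the array assumption stated above.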

Changelog

ver [0.0.3] - 2024-02-21

Updated the main dataset. The test dataset is now part of the main dataset.

ver [0.0.2] - 2023-02-01

Test dataset added

ver [0.0.1] - 2022-09-23

First version of dataset