HumanSignal/label-studio-sdk

project - get_paginated_tasks - default page size does not return all labeled tasks.

jwoutersmagics opened this issue · 6 comments

When querying for all labeled tasks, at most 100 tasks are returned. The default value of page_size=-1 in the get_paginated_tasks method does not seem to have its intended effect of returning all tasks. When I manually set page_size above the known number of labeled tasks, all labeled tasks are returned. The SDK and label-studio versions I am developing against are listed below.

https://github.com/heartexlabs/label-studio-sdk/blob/e887d186be4bda9afb329c4c3d6b309ff128f38e/label_studio_sdk/project.py#L564

sdk version
label-studio-sdk==0.0.9

label-studio version
{
"release": "1.4",
"label-studio-os-package": {
"version": "1.4",
"short_version": "1.4",
"latest_version_from_pypi": "1.4",
"latest_version_upload_time": "2021-11-19T15:56:02",
"current_version_is_outdated": false
},

"label-studio-os-backend": {
"message": "Change version to rc6",
"commit": "b30e88bb9b2aab94bf2ce53bb677230c6d8263b0",
"date": "2021-11-19 13:02:19 +0300",
"branch": "release/1.4",
"version": "1.3.0+194.gb30e88bb.dirty"
},

"label-studio-frontend": {
"message": "[fix] Add observing of image size for special cases (#340) * [fix] Ad ...",
"commit": "cb2fd37cda67dd456700f95e64947b00319dc8b8",
"branch": "master",
"date": "2021-11-18T15:41:46Z"
},

"dm2": {
"message": "DEV-609: Load task data from server (#22) * Load task data from serve ...",
"commit": "feb9f1db923039b098fd0122f3d6a87bdc224a79",
"branch": "master",
"date": "2021-11-18T20:01:55Z"
},

"label-studio-converter": {
"version": "0.0.36"
}
}

Bump -- please fix -- this seriously undermines the utility of this SDK

This can be worked around by calling get_paginated_tasks with an explicit, very large page_size:

    task_ids = project.get_paginated_tasks(filters=filters, page=1, page_size=10000000)
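If a single huge page is undesirable, another option is to request pages one at a time until a short page comes back. The following is only a minimal sketch of that idea, not the SDK's own implementation: `fetch_page` is a hypothetical callable that takes `(page, page_size)` and returns a list of tasks, and `make_fake_backend` is a stand-in used purely for illustration. With the SDK, `fetch_page` would wrap `project.get_paginated_tasks`; the return shape of that method and its behaviour past the last page depend on the SDK and LS versions and are assumptions here.

```python
from typing import Callable, Iterator, List


def iter_all_tasks(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every task by requesting consecutive pages until one comes back short.

    Assumes fetch_page returns an empty list (rather than raising) when the
    requested page is past the end; adapt this if the server errors instead.
    """
    page = 1
    while True:
        tasks = fetch_page(page, page_size)
        yield from tasks
        # A page shorter than page_size means there is nothing left to fetch.
        if len(tasks) < page_size:
            break
        page += 1


def make_fake_backend(total: int) -> Callable[[int, int], List[dict]]:
    """Hypothetical in-memory stand-in for a paginated task endpoint."""
    data = [{"id": i} for i in range(1, total + 1)]

    def fetch_page(page: int, page_size: int) -> List[dict]:
        start = (page - 1) * page_size
        return data[start:start + page_size]

    return fetch_page


if __name__ == "__main__":
    all_tasks = list(iter_all_tasks(make_fake_backend(250), page_size=100))
    print(len(all_tasks))
```

Note that when the total is an exact multiple of the page size, this loop makes one extra request that returns an empty page before stopping.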

We have a fix for this in #31, but it will only work with the latest master commits of LS.

Thanks a lot Max! We are currently also using the workaround as described by Timothy. Looking forward to seeing this fix in a release version.

The fix is in the master branch of the LS SDK. You can also use LS SDK 0.0.10 with the latest master of LS.