MG-RAST/MG-RAST-Tools

list_all_mg.py is returning an incomplete list

Closed this issue · 1 comments

Expected behavior

The script should return the complete list, with as many entries as the total number of records reported for the query.

Actual behavior

The script returns the whole list minus the first 1000 records (1000 is the limit of the initial call).
To fix that, the first iteration should load base_url from jsonstructure["url"],
and every later iteration should load it from jsonstructure["next"].

I did the following:

# Start from the URL of the first page so its records are not skipped.
next_url = jsonstructure["url"]

# Integer division, so range() also gets an int under Python 3.
for i in range(total_count // limit + 1):
    sys.stderr.write("Page {:d}\t".format(i))
    jsonstructure = obj_from_url(next_url)
    printlist(jsonstructure)
    # Follow the "next" link; stop when the last page has none.
    try:
        next_url = jsonstructure["next"]
    except KeyError:
        break
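
For comparison, here is a minimal self-contained sketch of the same idea that simply follows the "next" links until there are none left, so it does not rely on total_count and limit to decide how many pages to request. The names iter_pages, fetch and FAKE_PAGES are hypothetical and used only for illustration. fetch stands in for obj_from_url, and the "data" key holding the records is an assumption about the response layout.

```python
def iter_pages(first_url, fetch):
    """Yield every page, starting at first_url and following "next" links."""
    url = first_url
    while url:
        page = fetch(url)
        yield page
        # A missing or empty "next" means this was the last page.
        url = page.get("next")


# Hypothetical in-memory stand-in for the API: three pages of records.
FAKE_PAGES = {
    "page0": {"data": ["mgm1", "mgm2"], "next": "page1"},
    "page1": {"data": ["mgm3", "mgm4"], "next": "page2"},
    "page2": {"data": ["mgm5"]},  # last page: no "next" key
}

# Collect the records of all pages, including the very first one.
records = [rec
           for page in iter_pages("page0", FAKE_PAGES.__getitem__)
           for rec in page["data"]]
print(len(records))
```

In the real script, fetch would be replaced by obj_from_url and first_url by jsonstructure["url"].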

Steps to reproduce the behavior

./list_all_mg.py

Yep, that's a bug. Thanks.

After addressing this bug, the total count is still off: I get 52351 records while the API claims
Total number of records: 53608

but now this is because the backend only delivers 52351, not because I'm skipping the first 1000.
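
One way to make this kind of mismatch visible is to count the records actually received and compare that with the total the API reports. The sketch below is only an illustration. report_shortfall is a hypothetical helper, not part of list_all_mg.py, and the two numbers are the ones quoted above.

```python
import sys


def report_shortfall(delivered, reported_total, out=sys.stderr):
    """Warn when fewer records arrived than the API reported; return the gap."""
    missing = reported_total - delivered
    if missing > 0:
        out.write("Warning: received {:d} of {:d} records ({:d} missing)\n"
                  .format(delivered, reported_total, missing))
    return missing


# Numbers from this thread: 52351 records delivered, 53608 reported.
gap = report_shortfall(52351, 53608)
```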