upbit/pixivpy

"'NoneType' object is not iterable" for illust_ranking(**next_qs, req_auth=True)

tudubucket opened this issue · 15 comments

Hi, I'm currently using this package to automatically download images from pixiv. Below is part of my code:

downloaded = 0
logger.info("Starting ...")
stop_check = True
for entry in entries:
    next_qs = {"mode": f"{str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}"}
    i = 0
    while next_qs:
        try:
            if i > 3: break
            i = i + 1
            try:
                json_result = api.illust_ranking(**next_qs, req_auth=True)
            except Exception as e:
                logger.error(f"An error occurred while loading result for {entry}: {str(e)}")
                traceback.print_exc()
                time.sleep(5)
                refresh(PIXIV_REFRESH_TOKEN)
                continue
            for illust in json_result.illusts:
                if 'manga' in str(illust.tags): continue
                if illust.type == 'illust' and illust.page_count <= 15:
                    if str(illust.page_count) == '1':
                        try:
                            status = None
                            if 'r18' in entry:   status = api.download(path=f"art/r18/{str(entry).replace('_r18', '')}/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'day':   status = api.download(path=f"art/regular/daily/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'week':  status = api.download(path=f"art/regular/weekly/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'month': status = api.download(path=f"art/regular/monthly/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            # api.download(path="art/author_avatar", url=illust.user.profile_image_urls.medium, name=f'{illust.user.id}.jpg')
                            if status == True: 
                                logger.info(f"Successful downloaded image: {entry} {illust.id}")
                                downloaded += 1
                        except Exception as e:
                            traceback.print_exc()
                            logger.error(f"An error occurred in {entry} ({str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}, 1 page): " + str(e))
                            continue
                    else:
                        try:
                            download_path = ''
                            if 'r18' in entry:   
                                if illust.page_count > 3: continue
                                download_path = f"art/r18/{str(entry).replace('_r18', '')}/"
                            if entry == 'day':   download_path = "art/regular/daily/"
                            if entry == 'week':  download_path = "art/regular/weekly/"
                            if entry == 'month': download_path = "art/regular/monthly/"
                            counter = 1
                            for image in illust.meta_pages:
                                status = None
                                status = api.download(path=download_path, url=image.image_urls.original, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}---{counter}.jpg'))
                                if status == True: 
                                    logger.info(f"Successful downloaded image: {entry} {illust.id} | Page {counter}")
                                    downloaded += 1
                                counter += 1
                        except Exception as e:
                            logger.error(f"An error occurred in {entry} ({str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}, {illust.page_count} pages): " + str(e))
                            traceback.print_exc()
                            continue
            if json_result.next_url is not None:
                next_qs = api.parse_qs(json_result.next_url)
            else: break
        except Exception as e:
            logger.error(f"An error occurred while loading page for {entry}: {str(e)}")
            traceback.print_exc()
            time.sleep(5)
            refresh(PIXIV_REFRESH_TOKEN)
            continue
logger.info(f'Task completed download {downloaded} illustration.')

But there is a problem: sometimes illust_ranking(**next_qs, req_auth=True) returns None, or something else that my code cannot work with:

image

I am wondering: have I hit a rate limit from pixiv, or have I done something wrong?

Xdynix commented

It's hard to say what the exact reason is without seeing the whole response content. You can try adding more logging statements, especially printing out json_result. I don't see any obvious problem in your code; I would fetch the illustrations the same way.
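One way to add that logging, as a minimal sketch: extract_illusts is a hypothetical helper (not part of pixivpy) that assumes the response behaves like a dict. It logs the whole payload whenever the illusts list is missing, instead of letting the for loop fail on None.

```python
import logging
from collections.abc import Mapping
from typing import Any

logger = logging.getLogger(__name__)


def extract_illusts(json_result: Any) -> list:
    """Return the illustrations in a ranking response, or [] after logging it.

    Hypothetical helper: assumes the response is dict-like and that a
    successful response carries its items under the "illusts" key.
    """
    if not isinstance(json_result, Mapping):
        logger.error("Unexpected response type: %r", json_result)
        return []
    illusts = json_result.get("illusts")
    if illusts is None:
        # Log the full payload so the real cause (for example an "error"
        # entry) is visible instead of a bare TypeError further down.
        logger.error("Response has no 'illusts': %r", dict(json_result))
        return []
    return list(illusts)
```

In the loop from the first post, `for illust in extract_illusts(json_result):` would then skip a bad page and leave the full response in the log.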

Thanks for your really fast response! I searched this repo for the same error but found nothing. So your response suggests the code should normally work without errors. That is actually good news, because I was worried about a rate limit, and having no way around that would be just so sad.

I will look into my code and get more information when it returns an error.

Actually, my code runs at a specific time every day, usually 10 PM UTC. When I modified the code to run right after my server starts, it worked again. If you can, please keep this issue open for 1-2 days; I will gather more information if possible. If the problem is in my code, I will post it here with a solution and close this issue.

Anyway, I did all of this on a virtual private server. Will this affect any part of the code?

Xdynix commented

Using a VPS should make no difference, unless the IP of the data center is blocked by Pixiv. But in that case you would always get an error response, and that doesn't seem to be what's happening.

I'm back after a day of hunting for the error. This is what it tells me:

image

I call a token refresh function at the start of the main function and in the exception handlers:

def refresh(refresh_token):
    response = requests.post(
        AUTH_TOKEN_URL,
        data={
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
            "grant_type": "refresh_token",
            "include_policy": "true",
            "refresh_token": refresh_token,
        },
        headers={"User-Agent": USER_AGENT},
    )
    data = response.json()
    try:
        # a = data.get("expires_in", 0)
        logger.info(f'Successfully renew pixiv token with {data.get("expires_in", 0)} seconds remaining.')
    except KeyError:
        logger.warning(f'Unable to renew pixiv token...')

Any idea about this?

Xdynix commented

In your refresh() I don't see you passing the refreshed access token back to the API instance, so it is probably still using the expired one. You can use api.auth() (without any arguments) to trigger the refresh instead of writing it yourself.

TL;DR: Replace refresh(PIXIV_REFRESH_TOKEN) with api.auth().
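A minimal sketch of that retry pattern, kept independent of pixivpy so it can be run on its own. call_with_reauth and its parameters are hypothetical names; the request, re-authentication, and success check are passed in as callables.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_reauth(
    request: Callable[[], T],
    reauth: Callable[[], object],
    is_ok: Callable[[T], bool],
    attempts: int = 3,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`; while the result is not OK, re-authenticate and retry.

    Makes at most `attempts` requests and returns the last result, so the
    caller should still check the result before using it.
    """
    result = request()
    for _ in range(attempts - 1):
        if is_ok(result):
            break
        sleep(delay)
        reauth()
        result = request()
    return result


# Assumed usage with the objects from this thread (not executed here):
#   json_result = call_with_reauth(
#       lambda: api.illust_ranking(**next_qs),
#       api.auth,
#       lambda r: r.illusts is not None,
#   )
```

Passing api.auth as the reauth callable matters here: unlike the hand-written refresh(), it updates the token stored on the API instance.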


Personally, I would record the expiration time of the current access token after each authentication (api.auth()). Then, before each request is sent, check whether the access token is close to expiring (e.g., within 2 minutes), and if so, refresh it first. This way I don't need to wait until I encounter an error to refresh the access token.
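The approach described above could look roughly like this. TokenRefresher is a hypothetical class, not part of pixivpy; it assumes the authenticate callable returns the token lifetime in seconds (the expires_in value of the auth response), and how to read that value from api.auth() depends on the pixivpy version.

```python
import time
from typing import Callable, Optional


class TokenRefresher:
    """Re-authenticates shortly before the access token expires."""

    def __init__(
        self,
        authenticate: Callable[[], float],
        margin: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._authenticate = authenticate  # returns the token lifetime in seconds
        self._margin = margin              # refresh this many seconds early
        self._clock = clock
        self._expires_at: Optional[float] = None  # None: not authenticated yet

    def ensure_fresh(self) -> bool:
        """Refresh if needed; return True when a refresh actually happened."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at - self._margin:
            lifetime = self._authenticate()
            self._expires_at = self._clock() + lifetime
            return True
        return False


# Assumed usage (not executed here): call ensure_fresh() before each request.
#   refresher = TokenRefresher(lambda: float(api.auth()["expires_in"]))
#   refresher.ensure_fresh()
#   json_result = api.illust_ranking(**next_qs)
```

A monotonic clock is used so that system clock adjustments on the server do not make the token look fresher or staler than it is.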

Thanks for your response! I will list everything I've changed in my code, to make sure I didn't get anything wrong:

  • Still authenticate the API instance with the token at startup:
api = AppPixivAPI()
api.auth(refresh_token=PIXIV_REFRESH_TOKEN)
  • Remove the refresh() function and call api.auth() (no arguments) instead:
def main():
    global stop_check
    while True:
        try:
            if datetime.now().minute == 0 and datetime.now().hour == 3:
                stop_check = False
            if datetime.now().minute == 1 and datetime.now().hour == 3 and stop_check is False:
                stop_check = True
                downloaded = 0
                logger.info("Starting ...")
                api.auth()   # re-authenticate at the start of each run to make sure the token works
                json_result = api.illust_ranking(**next_qs, req_auth=True)   # req_auth can be omitted because it defaults to True
                # more stuff here, like what I pasted in the first comment
        except Exception as e:
            logger.error(f"An error occurred in main function: {str(e)}")
            time.sleep(5)
            api.auth()    # api.auth() instead of refresh(token) in every exception handler; I'm too lazy to track expiry like you do
        time.sleep(1)

I will try this out. If everything works fine, I will close this issue myself. Otherwise, I will ping you again :D

proxies?

upbit commented

> I'm back after a day of hunting for the error. This is what it tells me:
>
> ![image](https://user-images.githubusercontent.com/106295287/263436737-5e190541-2b69-4cf8-9649-ee2889289be2.png)

Judging from the returned JSON, the access token has expired. Generating a new bearer token with auth(refresh_token=LAST_TOKEN) should solve the issue.

Thanks for your response, but how do I get the last token?

upbit commented

See #158 (comment) to get your refresh_token, then save it to a file or a constant, as demo.py does.

refresh_token can be used for a long period of time, and it rarely needs to be updated.
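Saving the token to a file could be sketched as follows. load_refresh_token, save_refresh_token, and the file name are hypothetical and not part of pixivpy; the permission change is best effort and has limited effect on Windows.

```python
from pathlib import Path
from typing import Optional, Union


def load_refresh_token(path: Union[str, Path]) -> Optional[str]:
    """Return the stored refresh token, or None if the file is missing or empty."""
    path = Path(path)
    if not path.exists():
        return None
    token = path.read_text(encoding="utf-8").strip()
    return token or None


def save_refresh_token(path: Union[str, Path], token: str) -> None:
    """Write the refresh token to `path` and restrict it to the owner."""
    path = Path(path)
    path.write_text(token.strip() + "\n", encoding="utf-8")
    try:
        path.chmod(0o600)  # best effort: the token grants account access
    except OSError:
        pass


# Assumed usage with the API instance from this thread (not executed here):
#   api.auth(refresh_token=load_refresh_token("refresh_token.txt"))
#   save_refresh_token("refresh_token.txt", api.refresh_token)
```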

Xdynix commented

> Thanks for your response, but how do I get the last token?

You can also use last_refresh_token = api.refresh_token. But calling api.auth() automatically does the same thing (it uses the previous refresh token to get a new access token).

Kinda confused here, but I'm going to try api.auth() first xD

It seems that calling api.auth() at the start of every download run solved the problem. I will keep watching it for about 24 more hours to make sure the token is renewed automatically. If the problem no longer occurs after that, I will close this issue.

Below is a download run that automatically renewed the token with the above fix (api.auth()), with no more errors:
image

Setting proxies to bypass Cloudflare may solve the auth problem.

Calling api.auth() at the start of every download run solved this problem; tested over 2 days.