Terrance/SkPy

Error when trying to get the content of an old file that is no longer available in Skype

Closed this issue · 14 comments

Before we start...

  • I've searched existing issues, but my problem hasn't been reported yet.
  • I've read the documentation (including notes on error messages and rate limiting), but my problem is something else.
  • I've tested the behaviour on Skype for Web, and it works there but not with SkPy.

Summary

Is this error expected when I try to get the content of an old file that is probably no longer downloadable, although its icon is still shown in Skype? Or maybe it has another cause. It is hard even to tell which message it relates to, because I would need to nest one SkypeApiException handler inside another. It appears partway through saving a large number of messages. Could it be related to the v1 API being used here?

Code sample

```python
def request_save_message(self, msg, chat_id=None, recipients=None, sender=None, url='skype-message-add',
                         **more_msg_raw):
    sender = sender or msg.userId
    if not recipients:
        chat_users = msg.chat.userIds
        recipients = [userId for userId in chat_users if (userId != sender)]
        recipients = ', '.join(recipients) if isinstance(chat_users, list) and recipients else self.userId

    file_name = msg.file.name if getattr(msg, 'file', None) else None

    def func():
        return {FILE_DICT_KEY: getattr(msg, 'fileContent', None),
                FILE_NAME_DICT_KEY: file_name}

    try:
        add_data = func()
    except SkypeApiException:
        add_data = {FILE_DICT_KEY: '',
                    FILE_NAME_DICT_KEY: f"'{msg.file.name}' ({EXPIRED_FILE})"}
    except ConnectionError:
        logging.info("Too much requests of file downloading. Waiting for an hour to continue script")
        time.sleep(3600)
        add_data = func()

    return generate_post_req(url, data=dict(chat_id=chat_id, recipients_ids_in_messanger=recipients,
                                            sender_id_in_messanger=sender,
                                            clientId=msg.clientId) | msg.raw | more_msg_raw | add_data)
```


### Code output

```python
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 537, in _make_request
    response = conn.getresponse()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 466, in getresponse
    httplib_response = super().getresponse()
  File "/usr/local/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/local/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 589, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 470, in increment
    raise reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 537, in _make_request
    response = conn.getresponse()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 466, in getresponse
    httplib_response = super().getresponse()
  File "/usr/local/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/local/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/communications/skype.py", line 75, in request_save_message
    add_data = func()
  File "/code/communications/skype.py", line 71, in func
    return {FILE_DICT_KEY: getattr(msg, 'fileContent', None),
  File "/usr/local/lib/python3.10/site-packages/skpy/util.py", line 227, in wrapper
    cache[key] = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/skpy/msg.py", line 602, in fileContent
    return self.skype.conn("GET", self.urlContent,
  File "/usr/local/lib/python3.10/site-packages/skpy/conn.py", line 238, in call
    resp = self.sess.request(method, url, headers=headers, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 604, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/manage.py", line 24, in <module>
    main()
  File "/code/manage.py", line 20, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.10/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.10/site-packages/django/core/management/base.py", line 413, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.10/site-packages/django/core/management/base.py", line 459, in execute
    output = self.handle(*args, **options)
  File "/code/communications/management/commands/update_messages.py", line 22, in handle
    sk.update_messages_in_db(chat)
  File "/code/communications/skype.py", line 167, in update_messages_in_db
    msgs = self.get_all_messages_of_all_chats(save_to_database=True)
  File "/code/communications/skype.py", line 161, in get_all_messages_of_all_chats
    return [self.get_all_messages_of_chat(chat_obj, save_to_database) for chat_obj in chats_list]
  File "/code/communications/skype.py", line 161, in <listcomp>
    return [self.get_all_messages_of_chat(chat_obj, save_to_database) for chat_obj in chats_list]
  File "/code/communications/skype.py", line 155, in get_all_messages_of_chat
    self.create_or_update_msg_to_db(chat_obj, msg)
  File "/code/communications/skype.py", line 145, in create_or_update_msg_to_db
    return self.request_save_message(msg, chat_obj.id, url=url)
  File "/code/communications/skype.py", line 82, in request_save_message
    add_data = func()
  File "/code/communications/skype.py", line 71, in func
    return {FILE_DICT_KEY: getattr(msg, 'fileContent', None),
  File "/usr/local/lib/python3.10/site-packages/skpy/util.py", line 227, in wrapper
    cache[key] = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/skpy/msg.py", line 602, in fileContent
    return self.skype.conn("GET", self.urlContent,
  File "/usr/local/lib/python3.10/site-packages/skpy/conn.py", line 249, in call
    raise SkypeApiException("{0} response from {1} {2}".format(resp.status_code, method, url), resp)
skpy.core.SkypeApiException: ('404 response from GET https://weu1-api.asm.skype.com/v1/objects/0-weu-d5-902.../views/original', <Response [404]>)
```
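
The last traceback above shows the retry inside the `except ConnectionError` handler raising `SkypeApiException`, which nothing catches at that point. One possible way to avoid nesting handlers is to move the retry into a loop, so that every attempt passes through the same `try`. This is only a sketch: `fetch_file_content` and its parameters are hypothetical names, and the exception classes are passed in rather than imported so the example stays self-contained.

```python
import time


def fetch_file_content(fetch, expired_errors, transient_errors,
                       attempts=3, wait_seconds=3600, sleep=time.sleep):
    """Call fetch() until it succeeds, the file is reported gone, or attempts run out.

    expired_errors: exception classes meaning "the file no longer exists" -> return None.
    transient_errors: exception classes worth waiting for and retrying.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except expired_errors:
            # Same handling on the first call and on every retry.
            return None
        except transient_errors:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(wait_seconds)
```

With SkPy this might be called as `fetch_file_content(lambda: msg.fileContent, (SkypeApiException,), (requests.exceptions.ConnectionError,))`. Note that `requests.exceptions.ConnectionError` is not a subclass of Python's built-in `ConnectionError`, so the class used in the `except` clause has to be the one imported from `requests`.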


### Explain your code

It does an initial parse of the message, covering the fields specific to SkPy Msg objects, and saves it to my database; the main body from msg.raw is to be parsed later.

### SkPy version

0.10.7

### Python version

3.10

### Anything else?

_No response_

> ✔ I've tested the behaviour on Skype for Web, and it works there but not with SkPy.

How does Skype for Web successfully download the file?

> > ✔ I've tested the behaviour on Skype for Web, and it works there but not with SkPy.
>
> How does Skype for Web successfully download the file?

I've finally integrated Sentry in production, so I can now see the context of the error. I can't say anything more about the error above, because the messages change under my software's logic. But now I get an error like this for another message:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='cus1-api.asm.skype.com', port=443): Max retries exceeded with url: /v1/objects/0-cus-d...ae0b/views/imgpsh_fullsize (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7fa82c6ae200>: Failed to resolve 'cus1-api.asm.skype.com' ([Errno -2] Name or service not known)"))

So this time it is an image, and it looks like the hostname cus1-api.asm.skype.com is incorrect. After logging in to my account, I can successfully open the link https://api.asm.skype.com/v1/objects/0-cus-d...ae0b/views/imgpsh_fullsize in a browser. Files probably have the same problem.
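
To illustrate the hostname difference, here is a minimal sketch that rewrites a region-prefixed host such as cus1-api.asm.skype.com to api.asm.skype.com. `strip_region_prefix` is a hypothetical helper, not part of SkPy, and the prefix pattern is an assumption based only on the two hosts quoted in this thread (cus1-api and weu1-api). Whether every such URL can safely be rewritten this way is not confirmed.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of a region-prefixed host: "<region>-api.asm.skype.com".
_REGION_HOST = re.compile(r"^[a-z0-9]+-(api\.asm\.skype\.com)$")


def strip_region_prefix(url):
    """Return url with a leading region prefix removed from the ASM host, if present."""
    parts = urlsplit(url)
    match = _REGION_HOST.match(parts.hostname or "")
    if not match:
        return url  # not a region-prefixed ASM host: leave untouched
    netloc = match.group(1)
    if parts.port:
        netloc = "{0}:{1}".format(netloc, parts.port)
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, with a placeholder object ID `abc`, `strip_region_prefix("https://cus1-api.asm.skype.com/v1/objects/abc/views/imgpsh_fullsize")` gives the same path on `api.asm.skype.com`.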

Looks like this might be down to the workaround added in 178b3ea a while ago, which may no longer be needed?

If you revert it (or manually make the call that fileContent makes, but to urlFull instead of urlAsm), does that successfully retrieve the file? Does that also successfully retrieve files that do retrieve correctly with the workaround in place?

> Looks like this might be down to the workaround added in 178b3ea a while ago, which may no longer be needed?
>
> If you revert it (or manually make the call that fileContent makes, but to urlFull instead of urlAsm), does that successfully retrieve the file? Does that also successfully retrieve files that do retrieve correctly with the workaround in place?

Oh, I see it is an old commit. As I understand it, simply installing an older SkPy version is not the way to go. So are you suggesting I write my own method in a subclass?

Anyway, it looks like my host IP is completely blocked by Skype: I can't even get chats, failing with requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')), after I downloaded 80k+ messages yesterday. And I don't get the captcha that your docs mention. I may only be able to test tomorrow.

By the way, do you have any good guides to Skype's limits to share? It is hard to find any information about them on the internet. I don't even know what timeout to set between requests or how many attempts to make.
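
No concrete limits are given anywhere in this thread, so any timeout is a guess. A common generic approach when limits are unknown is exponential backoff: wait a little after the first failure and longer after each further one, up to a cap. The sketch below only computes the delays; `backoff_delays` and all of its default values are arbitrary assumptions, not figures from Skype or SkPy.

```python
import random


def backoff_delays(attempts, base=1.0, factor=2.0, cap=3600.0,
                   jitter=0.0, rng=random.random):
    """Yield one delay in seconds per retry: base, base*factor, ..., capped at cap.

    jitter adds up to that fraction of random extra delay, so that several
    workers do not all retry at the same moment.
    """
    delay = base
    for _ in range(attempts):
        yield min(cap, delay) * (1.0 + jitter * rng())
        delay *= factor
```

A caller could loop over `backoff_delays(6, base=5.0, jitter=0.1)` and `time.sleep()` for each value between attempts, giving up when the generator is exhausted.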

> Looks like this might be down to the workaround added in 178b3ea a while ago, which may no longer be needed?
>
> If you revert it (or manually make the call that fileContent makes, but to urlFull instead of urlAsm), does that successfully retrieve the file? Does that also successfully retrieve files that do retrieve correctly with the workaround in place?

Yes, I've tested it: if I change to urlFull in just this place, it works fine for images.

SkPy/skpy/msg.py, lines 594 to 596 in 5b843a7:

```python
@property
def urlContent(self):
    return "{0}/views/{1}".format(self.file.urlAsm, self.contentPath) if self.file else None
```

New files are downloaded correctly too. So it looks like you just need to change this line.

By the way, new files are downloaded correctly with the unchanged urlAsm too. As I understand it, urlFull and urlThumb always start with api.asm.skype.com. But old files still have urlAsm and can be fetched through it too.

I've just pushed branch asm-fallback which includes the above fix, relegating the hostname patching added in #46 to a fallback (that is, it will now try the URL Skype gives us first, then try patching the URL if it fails). Does that work for you?

(You can install it directly with pip install git+https://github.com/Terrance/SkPy@asm-fallback.)
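
The "try one URL first, then fall back to another" behaviour described above can be sketched generically as follows. This is not the code on the asm-fallback branch; `fetch_first_available` is a hypothetical helper that takes the download function and the candidate URLs as arguments.

```python
def fetch_first_available(fetch, urls, recoverable=(Exception,)):
    """Try each URL in order and return (url, content) for the first that succeeds.

    If every URL fails with a recoverable error, the last error is re-raised.
    """
    last_error = None
    for url in urls:
        try:
            return url, fetch(url)
        except recoverable as error:
            last_error = error  # remember it and move on to the next candidate
    if last_error is None:
        raise ValueError("no URLs given")
    raise last_error
```

Passing a narrower `recoverable` tuple (for instance only the HTTP and connection errors the caller expects) keeps programming errors from being silently swallowed by the fallback.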

> pip install git+https://github.com/Terrance/SkPy@asm-fallback

I've tried it in my development environment on the problematic chats, and everything looks fine now. I will try deploying it to production and let you know if anything goes wrong there.

> I've just pushed branch asm-fallback which includes the above fix, relegating the hostname patching added in #46 to a fallback (that is, it will now try the URL Skype gives us first, then try patching the URL if it fails). Does that work for you?
>
> (You can install it directly with pip install git+https://github.com/Terrance/SkPy@asm-fallback.)

It was much better! I downloaded more than 170k messages. It looks like I was blocked in the past mostly because I made a lot of requests to URLs that don't exist. So now the only errors I get are for old messages whose file URLs are no longer available in Skype because of their age, like this (screenshot image_2024-05-27_09-44-47 attached).

If Skype is no longer serving the files then there's probably nothing SkPy can do about it.

> If Skype is no longer serving the files then there's probably nothing SkPy can do about it.

The problem is just the incorrect link. It shouldn't start with cus1-api., even for old files that no longer exist. Because of this, the logic in my app is a bit broken. I shouldn't get a ConnectionError, and it appears only with links that contain cus1-api.

To clarify, are you saying Skype for Web is serving you old files with a region-prefixed subdomain (i.e. cus1-api.asm.skype.com rather than just api.asm.skype.com) and claiming they're no longer available, but if you change the URL to the non-prefixed subdomain then you can still download them?

> To clarify, are you saying Skype for Web is serving you old files with a region-prefixed subdomain (i.e. cus1-api.asm.skype.com rather than just api.asm.skype.com) and claiming they're no longer available, but if you change the URL to the non-prefixed subdomain then you can still download them?

I can't download the files either way, and that is OK. I installed your fix git+https://github.com/Terrance/SkPy@asm-fallback and I still get the cus1-api. subdomain on the files that no longer exist. I just want api.asm.skype.com, so that I get only a SkypeApiException 404 error and not a ConnectionError. I get the unexpected ConnectionError only when I try to fetch URLs that don't exist.

@Terrance do you plan to fix it in the near future?