ecederstrand/exchangelib

Getting HTTP status code 503 while using FindItem to query mail id from inbox of Exchange Server 2013

huier23 opened this issue · 2 comments

Describe the bug
My mailbox holds over 5 million mail items, and I need to export all of them as .eml files. Even after adding a retry policy to handle a busy server, I still get HTTP status code 503 when using FindItem to query mail IDs from the inbox on Exchange Server 2013. How should I adjust my code, and what is the likely root cause?

Code

    journal_account = connect_to_mailbox(mail_server=MAIL_SERVER, mailbox_name=MAILBOX, mailbox_password=PASSWORD, server_build_number=VERSION, max_wait_time=TIMEOUT)

    if journal_account:
        folder_object = journal_account.inbox
        mailbox_folder_list.append(folder_object)
        
        q = Q(categories__exists=False)
        # Process item in folder
        for mailbox_folder in mailbox_folder_list:
            logger.debug("Fetch mailbox items in each folder")
            print(f"Process Folder: {mailbox_folder}")
            st_mailbox_folder = mailbox_folder.name

            # Cap the page size at 1000 items per FindItem page
            pageSize = min(batch_count, 1000)

            with Manager() as manager:
                summary_counter = manager.dict()
                summary_counter["success"] = 0
                summary_counter["duplicate"] = 0
                summary_counter["error"] = 0
            
                process_tasks = []
                item_id_list = []
                item_list_qs = mailbox_folder.filter(q).only("id")
                item_list_qs.page_size = pageSize

                item_list_iterator = item_list_qs.order_by('datetime_received')[FROM:FROM + COUNT]
                for index, itemsSlice in enumerate(chunkify(item_list_iterator, batch_count)):
                    index = index * batch_count + FROM
                    p = Process(target=batch_mail_processor_by_id, args=(MAIL_SERVER, MAILBOX, PASSWORD, VERSION, TIMEOUT, index, itemsSlice, st_mailbox_folder, summary_counter))
                    p.start()
                    process_tasks.append(p)
                for p in process_tasks:
                    p.join()
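
`chunkify` is not defined in the snippet above. The following is a minimal sketch, assuming it groups an iterable into lists of `batch_count` items and pulls from it lazily. `fake_item_ids` is a hypothetical stand-in for the QuerySet, not part of exchangelib. If the real helper behaves like this, it would explain why the FindItem requests are only issued once the `for` loop starts consuming chunks.

```python
from itertools import islice


def chunkify(iterable, chunksize):
    """Yield lists of up to `chunksize` items, pulling from `iterable` lazily."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def fake_item_ids(total, fetched):
    """Hypothetical stand-in for the lazy QuerySet: records each item as it is produced."""
    for n in range(total):
        fetched.append(n)  # in the real script, server requests would happen around here
        yield n


fetched = []
chunks = chunkify(fake_item_ids(5, fetched), 2)
first = next(chunks)
# Only the items needed for the first chunk have been pulled at this point.
print(first, fetched)
```

Nothing is requested when `chunkify(...)` is called; work starts only when the loop asks for the first chunk.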

Log output

The script stalls at for index, itemsSlice in enumerate(chunkify(item_list_iterator, batch_count)) and eventually returns an error. Below is the debug log I get:

  • A ProxyError is raised:
Response headers: {'TimeoutException': ProxyError(MaxRetryError("HTTPSConnectionPool(host='mail.tsmcemd.local', port=443): Max retries exceeded with url: /EWS/Exchange.asmx (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response')))"))}
2023-05-16 17:08:28 - exchangelib.util.xml - Request XML: b'<?xml version=\'1.0\' encoding=\'utf-8\'?>\n<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types"><s:Header><t:RequestServerVersion Version="Exchange2013_SP1"/><t:TimeZoneContext><t:TimeZoneDefinition Id="Taipei Standard Time"/></t:TimeZoneContext></s:Header><s:Body><m:FindItem Traversal="Shallow"><m:ItemShape><t:BaseShape>IdOnly</t:BaseShape></m:ItemShape><m:IndexedPageItemView MaxEntriesReturned="100" Offset="0" BasePoint="Beginning"/><m:Restriction><t:Not><t:Exists><t:FieldURI FieldURI="item:Categories"/></t:Exists></t:Not></m:Restriction><m:SortOrder><t:FieldOrder Order="Ascending"><t:FieldURI FieldURI="item:DateTimeReceived"/></t:FieldOrder></m:SortOrder><m:ParentFolderIds><t:DistinguishedFolderId Id="inbox"><t:Mailbox><t:EmailAddress>xxxxxxx@xxxxxx.local</t:EmailAddress><t:RoutingType>SMTP</t:RoutingType><t:MailboxType>Mailbox</t:MailboxType></t:Mailbox></t:DistinguishedFolderId></m:ParentFolderIds></m:FindItem></s:Body></s:Envelope>'
Response XML: b''
  • Response headers are received, but the status code is 503 and the response XML is empty:
2023-05-16 17:01:54 - exchangelib.util - Retry: 4
Waited: 160
Timeout: 1200
Session: 50716
Thread: 1064
Auth type: <requests_ntlm.requests_ntlm.HttpNtlmAuth object at 0x0000017884859A10>
URL: https://mail.tsmcemd.local/EWS/Exchange.asmx
HTTP adapter: <requests.adapters.HTTPAdapter object at 0x000001788485BF10>
Streaming: False
Response time: 126.98399999993853
Status code: 503
Request headers: {'User-Agent': 'exchangelib/4.9.0 (python-requests/2.28.1)', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'Keep-Alive', 'Content-Type': 'text/xml; charset=utf-8', 'X-AnchorMailbox': 'xxxxx@xxxxx.local', 'Content-Length': '1107', 'Authorization': 'NTLM xxxxxxxxxx'}
Response headers: {'Cache-Control': 'private', 'Server': 'Microsoft-IIS/10.0', 'request-id': '0cd5e6c0-b183-4a6a-a645-ffa55c0ff5b3', 'X-CalculatedBETarget': 'xxxxx.xxxx.local', 'X-AspNet-Version': '4.0.30319', 'Persistent-Auth': 'true', 'X-Powered-By': 'ASP.NET', 'X-FEServer': 'xxxxxxx', 'Date': 'Tue, 16 May 2023 09:01:53 GMT', 'Content-Length': '0'}
2023-05-16 17:01:54 - exchangelib.util.xml - Request XML: b'<?xml version=\'1.0\' encoding=\'utf-8\'?>\n<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types"><s:Header><t:RequestServerVersion Version="Exchange2013_SP1"/><t:TimeZoneContext><t:TimeZoneDefinition Id="Taipei Standard Time"/></t:TimeZoneContext></s:Header><s:Body><m:FindItem Traversal="Shallow"><m:ItemShape><t:BaseShape>IdOnly</t:BaseShape></m:ItemShape><m:IndexedPageItemView MaxEntriesReturned="100" Offset="0" BasePoint="Beginning"/><m:Restriction><t:Not><t:Exists><t:FieldURI FieldURI="item:Categories"/></t:Exists></t:Not></m:Restriction><m:SortOrder><t:FieldOrder Order="Ascending"><t:FieldURI FieldURI="item:DateTimeReceived"/></t:FieldOrder></m:SortOrder><m:ParentFolderIds><t:DistinguishedFolderId Id="inbox"><t:Mailbox><t:EmailAddress>emdstore20160201@tsmcemd.local</t:EmailAddress><t:RoutingType>SMTP</t:RoutingType><t:MailboxType>Mailbox</t:MailboxType></t:Mailbox></t:DistinguishedFolderId></m:ParentFolderIds></m:FindItem></s:Body></s:Envelope>'
Response XML: b''

The second one is not an error: HTTP status code 503 is retried, as the log output also indicates (see https://github.com/ecederstrand/exchangelib/blob/master/exchangelib/protocol.py#L717). This is just your server erroring out on some HTTP requests for whatever reason.

Regarding the first one (TimeoutException), it looks like you're using an old version of exchangelib. Try upgrading to the latest version. ProxyError is a subclass of ConnectionError, which should be handled here:

requests.exceptions.ConnectionError,

If your ProxyError is still unhandled after upgrading, you can create a custom FaultTolerance subclass that overrides raise_response_errors and re-raises the error as an ErrorServerBusy exception. See

def raise_response_errors(self, response):
for an example of that.
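
A rough sketch of such a subclass is shown here, not a tested fix. It assumes the connection error arrives in the response headers under the `TimeoutException` key, as in the debug log above. The class name `ProxyTolerantPolicy` and the 60-second back-off are made up for illustration. The `except ImportError` branch only provides stand-ins so the sketch runs without exchangelib installed.

```python
from types import SimpleNamespace

try:
    from exchangelib.errors import ErrorServerBusy
    from exchangelib.protocol import FaultTolerance
except ImportError:
    # Stand-ins so the sketch runs without exchangelib; real code uses the imports above.
    class ErrorServerBusy(Exception):
        def __init__(self, *args, back_off=None):
            super().__init__(*args)
            self.back_off = back_off

    class FaultTolerance:
        def __init__(self, max_wait=3600):
            self.max_wait = max_wait

        def raise_response_errors(self, response):
            raise RuntimeError(f"Unhandled response: {response.status_code}")


class ProxyTolerantPolicy(FaultTolerance):
    """Hypothetical policy that reports a stored connection error as 'server busy'."""

    BACK_OFF_SECONDS = 60  # assumed pause before the next attempt

    def raise_response_errors(self, response):
        # Assumption: the swallowed exception sits under the 'TimeoutException'
        # header key, as the debug log in this issue suggests.
        exc = response.headers.get("TimeoutException")
        if isinstance(exc, Exception):
            raise ErrorServerBusy(f"Connection problem: {exc!r}", back_off=self.BACK_OFF_SECONDS)
        return super().raise_response_errors(response)


# Simulate the response seen in the log with a plain object.
fake_response = SimpleNamespace(status_code=503, headers={"TimeoutException": OSError("proxy down")})
policy = ProxyTolerantPolicy(max_wait=1200)
try:
    policy.raise_response_errors(fake_response)
except ErrorServerBusy as e:
    print("Re-raised as ErrorServerBusy, back off:", e.back_off)
```

To try it, pass an instance as the `retry_policy` argument of `Configuration` in place of the stock `FaultTolerance`.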

Closing due to timeout. Feel free to reopen if you still have issues with this.