duosecurity/duo_client_python

get_authentication_log pagination confusion

guitarhero23 opened this issue · 7 comments

Hi,
I'm having trouble using the SDK to paginate responses as I'd expect and I've dug through the documentation and source and unfortunately it isn't clear to me.

The api documentation https://duo.com/docs/adminapi#logs appears to refer to next_offset as a single number which will offset the results by that number, sort of like a skip that many results and then get the rest. However in my REST request the response gives me {'next_offset': ['1553043904936', 'e32e02c3-a0d6-40bf-b6db-a109edf6d0ea'], 'total_objects': 1231} and that 1553043904936 is an epoch time of the current day.

The api documentation also says

"The offset at which to start record retrieval. This value is provided in the metadata in the form of a date string in milliseconds and the event txid. Both of these values must be provided when used (e.g. next_offset=1547486297000&next_offset=5bea1c1e-612c-4f1d-b310-75fd31385b15)"

but I'm confused about how to pass that in through the SDK, because the method's description in the library source doesn't appear to reference these fields; it only lists the offset. And because I'm using the SDK, the actual URL endpoint is abstracted away from me, so it's not clear how to add multiple "next_offset=" values to the URL.

    "API Version v2:

    mintime (required) - Unix timestamp in ms; fetch records >= mintime
    maxtime (required) - Unix timestamp in ms; fetch records <= mintime
    limit - Number of results to limit to
    next_offset - Used to grab the next set of results from a previous response
    sort - Sort order to be applied
    users - List of user ids to filter on
    groups - List of group ids to filter on
    applications - List of application ids to filter on
    results - List of results to filter to filter on
    reasons - List of reasons to filter to filter on
    factors - List of factors to filter on
    event_types - List of event_types to filter on"
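For reference, the repeated next_offset=...&next_offset=... form quoted from the API documentation is what a list-valued query parameter looks like once serialized. A minimal sketch using only the Python standard library, with the example values from the documentation quote above (this illustrates the URL encoding only; it is not how the SDK is confirmed to build its requests):

```python
from urllib.parse import urlencode

# Example values copied from the API documentation quote above.
next_offset = ['1547486297000', '5bea1c1e-612c-4f1d-b310-75fd31385b15']

# doseq=True expands a list value into one key=value pair per element.
query = urlencode({'next_offset': next_offset}, doseq=True)
print(query)
# next_offset=1547486297000&next_offset=5bea1c1e-612c-4f1d-b310-75fd31385b15
```

So a two-element list and "two next_offset values in the URL" are the same thing seen from different layers.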

What would be the correct way to get all authentication logs when the total # is 1231?

Hello!

The output of get_authentication_log when using api_version=2 contains two top-level keys, authlogs and metadata. The next_offset key in metadata contains a list. If there are no subsequent pages, next_offset should contain something falsy. If there are subsequent pages, the first element is an epoch timestamp in milliseconds and the second is the event txid.

When retrieving a page after the initial request, you should be able to just take next_offset contained in the first response and plug that into the next_offset argument of the next get_authentication_log call.

This is untested, but I think it would look something like:

...
authlogs = []
res = client.get_authentication_log(mintime=mintime, maxtime=maxtime, api_version=2)
authlogs.extend(res['authlogs'])
next_offset = res['metadata'].get('next_offset')
while next_offset:
    res = client.get_authentication_log(
        mintime=mintime,
        maxtime=maxtime,
        api_version=2,
        next_offset=next_offset
    )
    authlogs.extend(res['authlogs'])
    next_offset = res['metadata'].get('next_offset')
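If it helps, the same loop can be wrapped in a generator so callers just iterate over records. This is also only a sketch: iter_authlogs and FakeClient are hypothetical names, and FakeClient is a stand-in that mimics the response shape described above so the loop can be exercised without credentials.

```python
def iter_authlogs(client, mintime, maxtime, **filters):
    """Yield every auth log record, following next_offset until it is falsy."""
    next_offset = None
    while True:
        kwargs = dict(filters, mintime=mintime, maxtime=maxtime, api_version=2)
        if next_offset:
            # Pass the list from the previous response back unchanged.
            kwargs['next_offset'] = next_offset
        res = client.get_authentication_log(**kwargs)
        yield from res['authlogs']
        next_offset = res['metadata'].get('next_offset')
        if not next_offset:
            break


class FakeClient:
    """Hypothetical stand-in for the real client: serves two pages."""

    def __init__(self):
        self.calls = []

    def get_authentication_log(self, api_version=1, **kwargs):
        self.calls.append(kwargs)
        if 'next_offset' not in kwargs:
            return {'authlogs': ['a', 'b'],
                    'metadata': {'next_offset': ['1553043904936', 'txid-1']}}
        return {'authlogs': ['c'], 'metadata': {'next_offset': None}}


fake = FakeClient()
all_logs = list(iter_authlogs(fake, mintime=0, maxtime=1))
print(all_logs)
```

With the real client you would pass your Admin API client object instead of FakeClient.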

I hope this helps. Please let me know if this doesn't answer your question.

I was able to use your example as a reference to get it working, thanks. I didn't quite need it all for my use case, but your next_offset line made it click for me.

logs = admin_api.get_authentication_log(api_version=api_version, limit=limit, mintime=mintime, sort=sort)
next_offset = logs['metadata'].get('next_offset')
# {Stuff gets done here to the logs}
logs = admin_api.get_authentication_log(api_version=api_version, limit=limit, mintime=mintime, sort=sort, next_offset=next_offset)
# {More Stuff}

@guitarhero23 @cavemanpi
I am also trying to get data from the authentication log API, but I am stuck on the next_offset part.

Here is my code:

import base64, email, hmac, hashlib, urllib
import requests
import pprint

class Raheja:

    def sign(self, method, host, path, params, skey, ikey):
        """
        Return HTTP Basic Authentication ("Authorization" and "Date") headers.
        method, host, path: strings from request
        params: dict of request parameters
        skey: secret key
        ikey: integration key
        """
        # create canonical string
        now = email.Utils.formatdate()
        canon = [now, method.upper(), host.lower(), path]
        args = []
        for key in sorted(params.keys()):
            val = params[key]
            if isinstance(val, unicode):
                val = val.encode("utf-8")
            args.append(
                '%s=%s' % (urllib.quote(key, '~'), urllib.quote(val, '~')))
        canon.append('&'.join(args))
        canon = '\n'.join(canon)

        # sign canonical string
        sig = hmac.new(skey, canon, hashlib.sha1)
        auth = '%s:%s' % (ikey, sig.hexdigest())
        print auth

        # return headers
        headers = {'Date': now, 'Authorization': 'Basic %s' % base64.b64encode(auth)}
        return headers

    def get_data(self, headers):
        next_offset = []
        while next_offset is not None:
            url = "https://xxxxx.duosecurity.com/admin/v2/logs/authentication"

            querystring = {'mintime': '1614067200000', 'maxtime': '1614070800000', 'limit': '1000',
                           'next_offset': next_offset}

            headers = {
                'Authorization': headers.get('Authorization'),
                'Content-Type': "application/x-www-form-urlencoded",
                'Date': headers.get('Date'),
            }

            response = requests.request("GET", url, headers=headers, params=querystring)
            response_json = response.json()
            print response_json
            response = response_json.get('response')
            metadata = response.get('metadata')
            next_offset = metadata.get('next_offset')
            print next_offset

r = Raheja()
headers = r.sign('GET', 'xxxxx.duosecurity.com', '/admin/v2/logs/authentication',
                 {'mintime': '1614067200000', 'maxtime': '1614070800000', 'limit': '1000'},
                 'xxx', 'xxxx')
data = r.get_data(headers)

I get a valid response from the first API call. But on the second call, when I pass the offset as in the code above, I get:

{u'message': u'Invalid signature in request credentials', u'code': 40103, u'stat': u'FAIL'}

I'd appreciate any help here. Thanks

Hey @arahej,

There's a method in the client you should probably use instead of your hand-rolled retrieval. The method name is get_authentication_log. Is there a reason you aren't using the client module?

I'm pretty sure you need to sign every request, especially when you change the parameters. The signature is derived from secrets as well as the parameters in the request. When you change the next_offset parameter, the signature will be different.
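To illustrate why a reused signature fails, here is a rough Python 3 port of the sign method posted above. Assumptions on my part: list values are expanded into one key=value pair per element (check the Admin API documentation for the exact canonicalization rules), HMAC-SHA1 is kept only because the posted code uses it, and the host, keys, date, and txid are placeholders.

```python
import base64
import email.utils
import hashlib
import hmac
import urllib.parse


def sign(method, host, path, params, skey, ikey, date=None):
    """Return Date and Authorization headers for one specific request."""
    date = date or email.utils.formatdate()
    pairs = []
    for key in sorted(params):
        vals = params[key]
        # Assumption: a list value becomes one key=value pair per element.
        if not isinstance(vals, (list, tuple)):
            vals = [vals]
        for val in vals:
            pairs.append('%s=%s' % (urllib.parse.quote(key, '~'),
                                    urllib.parse.quote(str(val), '~')))
    canon = '\n'.join([date, method.upper(), host.lower(), path, '&'.join(pairs)])
    sig = hmac.new(skey.encode('utf-8'), canon.encode('utf-8'), hashlib.sha1)
    auth = '%s:%s' % (ikey, sig.hexdigest())
    token = base64.b64encode(auth.encode('utf-8')).decode('ascii')
    return {'Date': date, 'Authorization': 'Basic ' + token}


# Placeholder request details, for illustration only.
host = 'xxxxx.duosecurity.com'
path = '/admin/v2/logs/authentication'
date = 'Tue, 23 Feb 2021 08:00:00 -0000'
base = {'mintime': '1614067200000', 'maxtime': '1614070800000', 'limit': '1000'}
paged = dict(base, next_offset=['1614067300000', 'hypothetical-txid'])

first = sign('GET', host, path, base, 'skey', 'ikey', date=date)
second = sign('GET', host, path, paged, 'skey', 'ikey', date=date)
print(first['Authorization'] != second['Authorization'])  # True
```

The parameters are part of the signed string, so headers computed for the first page cannot be reused once next_offset is added; sign inside the loop, with exactly the parameters you send.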

@cavemanpi
Thanks for the inputs.
Can you share with me the link to the get_authentication_log method?

Also, I need to call the sign method again if I change my next_offset, right?
And is the data type of next_offset a list or a string?

Thanks

@arahej The method you are looking for is in this repo. If your intent is to implement all the logic needed to run your requests, you will have implemented the client in this repo. I strongly suggest using this client in your project.

Here's the method in the client:

def get_authentication_log(self, api_version=1, **kwargs):

Yes, you will need to sign your request each time you change parameters. Ideally, every call to the API endpoint should be signed. If you use the client library, you don't have to worry about signing the request correctly; it's built into the client.

On logging endpoints next_offset is a list.