dmcquay/node-apac

Automatic Throttling

Hucho opened this issue · 11 comments

Hucho commented

Hallo,

first thank you for the great module...it is working fine.

function runQueries (querryArray){

    for(var i = 0; i < querryArray.length; i++){

    newQuery = {
         'SearchIndex': querryArray[i].SearchIndex,
        'Title': querryArray[i].Title,
        'KeyWords': querryArray[i].KeyWords,
        'MinimumPrice': querryArray[i].MinimumPrice,
         'MaximumPrice': querryArray[i].MaximumPrice,
         'ResponseGroup': querryArray[i].ResponseGroup,
         'sort': querryArray[i].sort
        }

    opHelper.execute ('ItemSearch', newQuery

    , function(err, results) {

        if(err) {console.log(err);}

        else { //save each item in a MongoDB and delete it after server shutdown...

My problem is that the "for loop" runs and executes queries too fast/async while the callback is much slower. Would it be possible to patch the module in a way that i can set a interval parameter in ms.
Do you have another idea to fix the problem?

Thanks

Hucho

First of all, I don't think you need all of those properties. Your code should behave identical if you write it like this:

function runQueries(queryArray) {
    for (var i = 0; i < queryArray.length; i++) {
        opHelper.execute('ItemSearch', queryArray[i], function (err, results) {
            if (err) {
                // handle err
            } else {
                //save each item in a MongoDB and delete it after server shutdown...
            }
        })
    }
}

The best way to tackle this problem is using promises. I would first wrap the opHelper execute method in a promise like this:

function executeQuery(query) {
    return new Promise(function (resolve, reject) {
        opHelper.execute('ItemSearch', newQuery, function (err, results) {
            if (err) return reject(err)
            resolve(results)
        })
    })
}

Then you can implement your function as simply as this:

function runQueries2(queryArray) {
    let promises = queryArray.map(query => executeQuery(query))
    return Promise.all(promises)
}

Keep in mind that since this is now using promises, you'll need to call your function like this:

runQueries2(queryArray).then((results) => {
    // save to mongo or whatever...
}).catch((err) => {
    // console.log or whatever
})

I am in the process of updating this module to provide support for Promises out of box so that you won't have to create your own wrapper like I have done here.

Hucho commented

Hi Dustin,

first, thank you for your response...I was already experimenting with

promises...

Your proposal works and also works not...

The problem of your solution is that runQueries fires all the queries at once.

The Amazon server starts responding the first queries and then sends

errors.

So it would be necessary to embed a timeout with app. 1001 ms in the body of the runqueries

function...

In reality, people are more likely to run querryarray than single hard coded queries.

Thank you

Matthias

Von: Dustin McQuay [mailto:notifications@github.com]
Gesendet: Samstag, 2. April 2016 02:22
An: dmcquay/node-apac
Cc: Hucho
Betreff: Re: [dmcquay/node-apac] Problem with callback if sending request with a loop (#59)

First of all, I don't think you need all of those properties. Your code should behave identical if you write it like this:

function runQueries(queryArray) {
for (var i = 0; i < queryArray.length; i++) {
opHelper.execute('ItemSearch', queryArray[i], function (err, results) {
if (err) {
// handle err
} else {
//save each item in a MongoDB and delete it after server shutdown...
}
})
}
}

The best way to tackle this problem is using promises. I would first wrap the opHelper execute method in a promise like this:

function executeQuery(query) {
return new Promise(function (resolve, reject) {
opHelper.execute('ItemSearch', newQuery, function (err, results) {
if (err) return reject(err)
resolve(results)
})
})
}

Then you can implement your function as simply as this:

function runQueries2(queryArray) {
let promises = queryArray.map(query => executeQuery(query))
return Promise.all(promises)
}

Keep in mind that since this is now using promises, you'll need to call your function like this:

runQueries2(queryArray).then((results) => {
// save to mongo or whatever...
}).catch((err) => {
// console.log or whatever
})

I am in the process of updating this module to provide support for Promises out of box so that you won't have to create your own wrapper like I have done here.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub #59 (comment) https://github.com/notifications/beacon/AM5NrSQz3GGedW7uj076SgEb192hEIR2ks5pzbacgaJpZM4H7M6m.gif

Hucho commented

Hi Dustin,

this works:

function runQueries(){

            var query = querryArray.pop();

            if(query){

                           makeRequest(query).then(function(){

                                           setTimeout(function(){

                                                           runQueries();

                                           }, 3000)

                                           })

                           }

            }

function makeRequest(query){

            return new Promise(function(resolve, reject){

                           opHelper.execute('ItemSearch', query, function(err, result){

                                           if(err) return reject(err)

                                           else {resolve(result)

                                                           console.log(result);}

                                           })

            })

}

runQueries();

Best

Matthias

Von: Dustin McQuay [mailto:notifications@github.com]
Gesendet: Samstag, 2. April 2016 02:22
An: dmcquay/node-apac
Cc: Hucho
Betreff: Re: [dmcquay/node-apac] Problem with callback if sending request with a loop (#59)

Closed #59 #59 .


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub #59 (comment) https://github.com/notifications/beacon/AM5NrSQz3GGedW7uj076SgEb192hEIR2ks5pzbacgaJpZM4H7M6m.gif

Ahhh yes, as I was writing out this code I totally forgot you needed them to run serially. Looks good.

Hucho commented

Hi Dustin,

my app is running now and the code tested...

//batch query all requests from the array and then the result to mongoDB; after do a timeout...

function runQueries(){

            var query = querryArray.pop();

            if(query){

                           makeRequest(query).then(function(results){

                                                           console.log(query)

                                                           /*here is the entry point for the saveData function, which writes to MongoDB

                                                           in my case*/

                                                           saveData(results);

                                           setTimeout(function(){

                                                           runQueries();

                                           }, 1001);

                                           ;

                                           })

                           }

            }

//make a request and resolve the result

function makeRequest(query){

            return new Promise(function(resolve, reject){

                           opHelper.execute('ItemSearch', query, function(err, results){

                                           if(err) return reject(err)

                                           else {resolve(results)}

                                           })

            })

            }

You can use my code for updating your module if you like...

the Amazon server is not complaining if you exactly use 1000 ms + 1;

One other point: It would be great if you exposed a property for setting a locale. In my case

I would like to search on German webservices. As of now, I see only the possibility

to patch your Operationshelper module?

Best

Matthias

Von: Dustin McQuay [mailto:notifications@github.com]
Gesendet: Samstag, 2. April 2016 23:03
An: dmcquay/node-apac
Cc: Hucho
Betreff: Re: [dmcquay/node-apac] Problem with callback if sending request with a loop (#59)

Ahhh yes, as I was writing out this code I totally forgot you needed them to run serially. Looks good.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub #59 (comment) https://github.com/notifications/beacon/AM5NrTzaE4ulh3jb8czW5-JrbqTGsAIaks5pztmdgaJpZM4H7M6m.gif

Yes an automatic throttle would be nice. I will reopen this and consider adding it. Regarding the locale, if you can tell me where that Parma needs to be added, I may be willing to expose that.

And if you do, please do so in a separate issue.

Here's what I'm thinking. This is not tested code. Just working out how I'd like the API to look.

57a031f

I'm not providing a batching-specific function. Rather, I want the throttling to take effect even if you are just running one request at a time. Also, by default, the throttling will be disabled. Per the docs here (http://docs.aws.amazon.com/AWSECommerceService/latest/DG/TroubleshootingApplications.html) the initial rate limit for new clients is 1 per IP per second, but that increases automatically based on revenue attributed to your account. I don't want existing clients to suddenly be unnecessarily throttled by me defaulting to 1 request per second. Also, some clients may run more than one node process per IP address, so they would need to either divide their rate by the number of processes or handle the throttling on their own in some other way.

So...the reason for all that explanation is that I think it needs to be disabled by default and explicitly enabled (and the maxRequestsPerSecond set) by the client.

So, all you need to do to use the throttling would be to set params.maxRequestsPerSecond to some valid integer or float value.

If you wanted to loop through a list of queries, you can do so without worrying about the rate limiting at all. e.g.

for (let query of queryArray) {
    opHelper.execute(query).then((response) => {
        // save to data store
    }).catch((err) => {
        // log error
    })
}

Or if you need to keep track of when they are all done, then you might consider Promise.all to coordinate the batch.

Let me know what you think of this approach.

Another thought. Per https://affiliate-program.amazon.com/gp/advertising/api/detail/faq.html the rate limit allows short bursting. My solution above will prevent you from benefitting from the permitted bursting. Perhaps it would be better to intercept failed requests due to rate limiting and use a backoff policy based on that. That would get more complicated though. Not sure it is worth the complexity.

Hucho commented

Hi Dusting,

I went through your text & code. I think you are considering all important aspects. It might be better not to

offer a batch execution from the scratch. amz might not be happy if it works out of the box. If people

code it themselves...it is another story.

With your new approach, it is easier to switch in batch mode which is great.

Regarding the throttling, I think you cover all aspects; however regarding the short bursting; I think

this does not play an important role. I tested short bursting. The server allows that only for

some milliseconds...the it fires back errors advising you to send your requests at a lower rate.

did you see my post regarding locale?

Best

Matthias

Von: Dustin McQuay [mailto:notifications@github.com]
Gesendet: Sonntag, 3. April 2016 06:39
An: dmcquay/node-apac
Cc: Hucho
Betreff: Re: [dmcquay/node-apac] Problem with callback if sending request with a loop (#59)

Here's what I'm thinking. This is not tested code. Just working out how I'd like the API to look.

57a031f 57a031f

I'm not providing a batching-specific function. Rather, I want the throttling to take effect even if you are just running one request at a time. Also, by default, the throttling will be disabled. Per the docs here (http://docs.aws.amazon.com/AWSECommerceService/latest/DG/TroubleshootingApplications.html) the initial rate limit for new clients is 1 per IP per second, but that increases automatically based on revenue attributed to your account. I don't want existing clients to suddenly be unnecessarily throttled by me defaulting to 1 request per second. Also, some clients may run more than one node process per IP address, so they would need to either divide their rate by the number of processes or handle the throttling on their own in some other way.

So...the reason for all that explanation is that I think it needs to be disabled by default and explicitly enabled (and the maxRequestsPerSecond set) by the client.

So, all you need to do to use the throttling would be to set params.maxRequestsPerSecond to some valid integer or float value.

If you wanted to loop through a list of queries, you can do so without worrying about the rate limiting at all. e.g.

for (let query of queryArray) {
opHelper.execute(query).then((response) => {
// save to data store
}).catch((err) => {
// log error
})
}

Or if you need to keep track of when they are all done, then you might consider Promise.all to coordinate the batch.

Let me know what you think of this approach.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub #59 (comment) https://github.com/notifications/beacon/AM5NrYkEoVGQede83qP7-kmt9kPhtgqzks5pz0RbgaJpZM4H7M6m.gif

Thanks, that's very helpful. And yes I did see the issue for locale. It makes sense. I appreciate all your input!