mapbox/dynamodb-replicator

Trouble setting up the Lambda Function

AdrianMW opened this issue · 13 comments

Hey,

Thanks for open sourcing this great resource.

I have been trying to set it up but I have the error below when setting the replicator as a Lambda function.

Would you be able to take a look and see if it's something obvious? :-)

I have manually set up the following environment variables like so:
process.env.ReplicaTable = "testTableReplica";
process.env.ReplicaRegion = "us-west-2";
process.env.ReplicaEndpoint = "https://dynamodb.us-west-2.amazonaws.com";

Are there any others I have missed?

I call index.replicate as the function name

I have tried diff-tables us-west-2/testTable us-west-2/testTableReplica --backfill and that worked without a hitch, so I am certain it's not a difference between the tables.

I am looking into using Streambot for real deployment, which looks sweet as it removes the configuration from the code entirely. I figured the best way was to get a basic example up and running first, then translate that into the streambot js.

2015-12-04T09:16:14.357Z    8ff15f9e-ca79-478b-9794-1e528a76d52a    [failed-request] request-id: undefined | id-2: undefined | params:
{
    "RequestItems": {
        "testTableReplica": [
            {
                "PutRequest": {
                    "Item": {
                        "Id": {
                            "S": "ddddddd"
                        }
                    }
                }
            },
            {
                "PutRequest": {
                    "Item": {
                        "Id": {
                            "S": "hjhgfhgfhg"
                        }
                    }
                }
            }
        ]
    }
}

Cheers
Adrian

Hi @AdrianMW -- the error you're seeing represents a failed BatchWriteItem request. The lack of request ids might mean that the aws sdk never successfully contacted the server, which could indicate a configuration problem.

On the other hand there might be a bug that's preventing those ids from displaying. I'm going to be looking at this today, and also making sure that when requests fail, there's an error message printed (not just the parameters of the request that failed).

Thanks, that would be great. I feel I am really close to setting this up but am just missing a little part of the puzzle

See #60 for some logging changes. Also, process.env.ReplicaEndpoint is optional: the sdk will choose the right endpoint for you if you don't specify. Having it in there is primarily used to mock dynamodb for testing.
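To illustrate the point about ReplicaEndpoint being optional, a minimal config can get by with just the table and region; the SDK derives the DynamoDB endpoint from the region on its own. This is a sketch with placeholder values, not the replicator's exact code:

```javascript
// Placeholder values for illustration; in a real deployment these
// would come from your own environment configuration.
process.env.ReplicaTable = "testTableReplica";
process.env.ReplicaRegion = "us-west-2";

var replicaConfig = {
    table: process.env.ReplicaTable,
    region: process.env.ReplicaRegion
};

// Only set ReplicaEndpoint when you need to point at a mock DynamoDB,
// e.g. a local instance during tests.
if (process.env.ReplicaEndpoint) {
    replicaConfig.endpoint = process.env.ReplicaEndpoint;
}

console.log(replicaConfig.region); // prints "us-west-2"
```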

Thanks, the logging changes really helped. I found the permissions bug I had.

I saw some strange behaviour: it worked once, then stopped with the following error:
TypeError: object is not a function
at /var/task/index.js:92:13
at /var/task/node_modules/dyno/lib/requests.js:73:7
at notify (/var/task/node_modules/queue-async/queue.js:46:21)
at /var/task/node_modules/queue-async/queue.js:39:16
at Request. (/var/task/node_modules/dyno/lib/requests.js:33:11)
at Request.callListeners (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/var/task/node_modules/aws-sdk/lib/request.js:595:14)
at Request.transition (/var/task/node_modules/aws-sdk/lib/request.js:21:10)
at AcceptorStateMachine.runTo (/var/task/node_modules/aws-sdk/lib/state_machine.js:14:12)

I plan to set up a clean Dynamo DB table and see if it still occurs.

Cheers
Adrian

Interesting. This suggests that there might be an issue with the replicator's handling of unprocessed items that are occasionally returned from BatchWriteItem requests to DynamoDB.

@AdrianMW did you see any log messages like

[retry] attempt 10 contained unprocessed items
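For context: BatchWriteItem can succeed as a request while still returning UnprocessedItems that have to be resubmitted. A retry loop in that style looks roughly like the sketch below — illustrative only, not the replicator's actual implementation; `client` here is anything with a DynamoDB-style batchWriteItem(params, callback) method:

```javascript
// Hypothetical sketch of an UnprocessedItems retry loop.
// Resubmits leftover items until the response comes back empty.
function writeBatch(client, params, attempt, callback) {
    client.batchWriteItem(params, function(err, data) {
        if (err) return callback(err);
        var leftover = (data && data.UnprocessedItems) || {};
        if (Object.keys(leftover).length === 0) return callback(null, attempt);
        console.log('[retry] attempt ' + attempt + ' contained unprocessed items');
        writeBatch(client, { RequestItems: leftover }, attempt + 1, callback);
    });
}
```

A real implementation would also cap the number of attempts and back off between retries.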

Happy new year, I am back on this after the break

I have reset it up with the latest code and new tables but I have encountered the same issue. The first item came through fine.

If I deleted the item from the replica table, it would keep coming back with each update, but no other items came through.

I couldn't see any [retry] attempt 10 contained unprocessed items messages, just the error message below repeated many times:

START RequestId: c9b4c8da-fa01-4840-a0d7-7be1dfe46bc5 Version: $LATEST 
2016-01-06T14:35:02.329Z    c9b4c8da-fa01-4840-a0d7-7be1dfe46bc5    TypeError: object is not a function
at /var/task/index.js:82:13
at /var/task/node_modules/dyno/lib/requests.js:73:7
at notify (/var/task/node_modules/queue-async/queue.js:46:21)
at /var/task/node_modules/queue-async/queue.js:39:16
at Request.<anonymous> (/var/task/node_modules/dyno/lib/requests.js:33:11)
at Request.callListeners (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/var/task/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/var/task/node_modules/aws-sdk/lib/request.js:595:14)
at Request.transition (/var/task/node_modules/aws-sdk/lib/request.js:21:10)
at AcceptorStateMachine.runTo (/var/task/node_modules/aws-sdk/lib/state_machine.js:14:12)
END RequestId: c9b4c8da-fa01-4840-a0d7-7be1dfe46bc5 
REPORT RequestId: c9b4c8da-fa01-4840-a0d7-7be1dfe46bc5  Duration: 1219.91 ms    Billed Duration: 1300 ms Memory Size: 128 MB    Max Memory Used: 17 MB   
Process exited before completing request 

The only change I made to the code is manually setting the config in index.js below (and running npm install) before archiving it. The tables are the default setup with the index called id - string.

function replicate(event, callback) {

    process.env.ReplicaTable = "testTableReplica";
    process.env.ReplicaRegion = "us-west-2";

    var replicaConfig = {
        table: process.env.ReplicaTable,
        region: process.env.ReplicaRegion,
        maxRetries: 1000,
        httpOptions: {
            timeout: 750,
            agent: streambot.agent
        }
    };

Cheers
Adrian

I'm having a hard time with the stack trace, trying to figure out what's gone wrong:

at /var/task/node_modules/dyno/lib/requests.js:73:7

This is the source of the error, where the callback variable is an object, not a function. So the question is, how is .sendAll being called in such a way that the callback arg is wrong?

Stepping up the stack...

at /var/task/index.js:92:13

This ought to point somewhere in dynamodb-replicator's index.js, but line 92 definitely is not a .sendAll call. Are there any other adjustments you made to index.js that could be throwing the stack trace off?

It is confusing me too, the final Lambda function I uploaded is here:
https://mwlambdatest.s3.amazonaws.com/Archive.zip

I call index.replicate as the function name

Ok, I got it -- I deployed to my own function and got a different stack trace (at /var/task/index.js:75:30), and this led me to the problem.

Here's the thing that's confusing: This lib was written with the intention that these functions would be wrapped by https://github.com/mapbox/streambot when implemented as Lambda functions -- i.e. a Lambda function's handler would be index.streambotReplicate, not index.replicate.

One of the things that streambot's wrapper does is adjust the arguments that are handed to an actual Lambda invocation. In real-life Lambda you get event, context, and when your execution is finished you're supposed to call context.done(). This isn't very node.js-like, so streambot instead gives you arguments of event, callback, where callback is actually context.done.bind(context). See https://github.com/mapbox/streambot/blob/49fd6dadd38d0ddbd6bc379d0fee702d4cfbbf11/index.js#L27-L57

So what's happening here is that Lambda is calling index.replicate(event, context), and then the function is trying to execute context as though it were a plain function.

Since you're already doctoring the file, you should be able to doctor away this problem. In the meantime I'll try to figure out how I want to address this in a PR.

Thanks, I will give Streambot a try.

adri0 commented

I'm having exactly the same problem. Has anyone found a workaround for this issue yet?
Streambot is nice, but it would be good to make this work without it too.

Closing here in favor of #72