airbnb/streamalert

Error copying local state to S3 state (release-3-0-0 branch)

jfrantz1-r7 opened this issue · 7 comments

Description

When trying to deploy StreamAlert for the first time (macOS, Python 3.7, Terraform 0.12.21, all requirements installed), Terraform gives me an error when copying the local state to the remote state.

Error: Error copying state from the previous "local" backend to the newly configured
"s3" backend:
    failed to upload state: AccessDenied: Access Denied
	status code: 403, request id: 360B24A5ED66D093, host id: 8Ng1pEWCVEnO1kEOUACmBs5RlnQLxCJ/Hr3QtZrXyKhj1sCJ+4i9Q/MRYEadSoOKDG7yU3/fTEE=

The state in the previous backend remains intact and unmodified. Please resolve
the error above and try again.


[ERROR 2020-03-01 21:24:12,893 (streamalert_cli.helpers:68)]: An error occurred while running: terraform init
['terraform', 'init', '-force-copy']
[INFO 2020-03-01 21:24:12,893 (streamalert_cli.runner:71)]: Complete

Steps to Reproduce

  1. Follow the getting started guide: https://www.streamalert.io/en/release-3-0-0/getting-started.html
  2. Run `git clone --branch release-3-0-0 https://github.com/airbnb/streamalert.git`
  3. Get to the section where you run `python3.7 manage.py init`
  4. It begins; I type in `yes` to apply the Terraform configuration
(The same `terraform init` error shown in the description above is printed.)

Attempted troubleshooting

I tried cycling through `python3.7 manage.py clean/generate/init`, while also removing the old `.terraform` folder in `terraform/`.

I did notice that, by default, the globals.json file has S3 access logging set and the bucket name is the placeholder `Specify Bucket Name Here`. I saw this error in the logs:

DEBUG: Validate Response s3/PutBucketLogging failed, attempt 4/25, error InternalError: We encountered an internal error. Please try again.

I used the example from the unit tests and changed my config file to this:

{
  "account": {
    "aws_account_id": "1234567891011",
    "prefix": "prefixtestsalol",
    "region": "us-east-1"
  },
  "general": {
    "matcher_locations": [
      "matchers"
    ],
    "rule_locations": [
      "rules"
    ]
  },
  "infrastructure": {
    "alerts_table": {
      "read_capacity": 5,
      "write_capacity": 5
    },
    "firehose": {
      "use_prefix": true,
      "buffer_interval": 900,
      "buffer_size": 128,
      "compression_format": "GZIP",
      "enabled": false,
      "enabled_logs": {}
    },
    "monitoring": {},
    "rule_staging": {
      "cache_refresh_minutes": 10,
      "enabled": false,
      "table": {
        "read_capacity": 20,
        "write_capacity": 5
      }
    },
    "s3_access_logging": {
      "create_bucket": true,
      "logging_bucket": "prefixtestsalol.streamalert.s3-logging"
    },
    "terraform": {
      "create_bucket": true,
      "tfstate_bucket": "prefixtestsalol.streamalert.terraform.state",
      "tfstate_s3_key": "stream_alert_state/terraform.tfstate"
    },
    "classifier_sqs": {
      "use_prefix": true
    }
  }
}
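One quick sanity check after hand-editing the config is to confirm the file still parses as JSON before re-running manage.py (the path `conf/global.json` is an assumption about the repo layout; the thread refers to the file as globals.json):

```shell
# Validate the edited config parses as JSON; json.tool exits non-zero
# (and prints the parse error) on malformed input.
python3 -m json.tool conf/global.json > /dev/null && echo "config: valid JSON"
```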

I bumped TF_LOG up to DEBUG and here is what I see:

2020/03/01 21:24:12 [DEBUG] [aws-sdk-go] DEBUG: Response s3/PutObject Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 403 Forbidden
Connection: close
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Mon, 02 Mar 2020 03:24:11 GMT
Server: AmazonS3
X-Amz-Id-2: randomidhere
X-Amz-Request-Id: randomidhere

I ensured that I had access to put objects into the bucket by actually uploading files to the bucket via the AWS CLI, and by using `aws sts get-caller-identity` to make sure it was using the correct credentials. I don't have multiple profiles in my credentials file, either.
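A minimal way to reproduce that check outside of Terraform is below; the bucket and key names are taken from the config in this report, so substitute your own:

```shell
# Confirm which identity the CLI is actually using.
aws sts get-caller-identity

# Attempt the same kind of PutObject Terraform performs, under the
# configured tfstate key prefix.
printf 'permission probe' > /tmp/tfstate-probe
aws s3 cp /tmp/tfstate-probe \
  s3://prefixtestsalol.streamalert.terraform.state/stream_alert_state/tfstate-probe
```

If this `aws s3 cp` succeeds while `terraform init` still returns 403, the difference is likely in which credentials or region the Terraform AWS provider resolves.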

Desired Change

I want to be able to use StreamAlert, but I'm getting access denied when putting the state file into the S3 bucket, even though I'm able to upload with the AWS CLI using the same credentials.

@jfrantz1-r7 So what is the name of the bucket for the backend? Also have you configured the prefix etc?

I'd revert back to the config you originally had and fix the name there. The one used during unit tests varies slightly.

@jack1902 Isn't the name of the state bucket defined in the generated Terraform? I don't see it by default in the globals.json file. This is the one that is pulled down when you run `git clone --branch release-3-0-0`:

{
  "account": {
    "aws_account_id": "ACCOUNT_ID_GOES_HERE",
    "prefix": "PREFIX_GOES_HERE",
    "region": "us-east-1"
  },
  "general": {
    "matcher_locations": [
      "matchers"
    ],
    "rule_locations": [
      "rules"
    ]
  },
  "infrastructure": {
    "alerts_table": {
      "read_capacity": 5,
      "write_capacity": 5
    },
    "firehose": {
      "use_prefix": true,
      "buffer_interval": 900,
      "buffer_size": 128,
      "compression_format": "GZIP",
      "enabled": false,
      "enabled_logs": {}
    },
    "monitoring": {},
    "rule_staging": {
      "cache_refresh_minutes": 10,
      "enabled": false,
      "table": {
        "read_capacity": 20,
        "write_capacity": 5
      }
    },
    "s3_access_logging": {
      "bucket_name": "specify bucket name here"
    },
    "classifier_sqs": {
      "use_prefix": true
    }
  }
}

So I completely recloned the repo and ran:

python3.7 manage.py configure aws_account_id 111111111111
python3.7 manage.py configure prefix satestprefix
python3.7 manage.py init

and I get a timeout when creating the S3 buckets. The buckets do get created, but the PUT request for the actual state file is failing. I'm a little confused as to why, as the credentials clearly have the ability. The name of the bucket is `satestprefix-streamalert-terraform-state`.

Error: Error putting S3 logging: timeout while waiting for state to become 'success' (timeout: 2m0s)

  on main.tf.json line 78, in resource.aws_s3_bucket.streamalerts:
  78:       },



Error: Error putting S3 logging: timeout while waiting for state to become 'success' (timeout: 2m0s)

  on main.tf.json line 99, in resource.aws_s3_bucket.terraform_remote_state:
  99:       }
DEBUG: Validate Response s3/PutBucketLogging failed, attempt 10/25, error InternalError: We encountered an internal error. Please try again.
<BucketLoggingStatus xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><LoggingEnabled><TargetBucket>specify bucket name here</TargetBucket><TargetPrefix>satestprefix-streamalerts/</TargetPrefix></LoggingEnabled></BucketLoggingStatus>
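The `<TargetBucket>` element in that XML is the bucket S3 was asked to deliver access logs to; pulling it out of a captured `TF_LOG=DEBUG` run (the log file name here is just an example) makes an unreplaced placeholder obvious:

```shell
# Extract the logging target S3 was configured with. Seeing the literal
# placeholder "specify bucket name here" means globals.json was never edited.
grep -o '<TargetBucket>[^<]*</TargetBucket>' terraform-debug.log
```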

So one issue is, by default, we aren't doing what we say we're doing in the docs:

[screenshot of the getting-started docs]

So I just retried after changing the globals.json file, and I'm back to the 403 on the PUT request, as in the original issue.

The bucket DOES exist though when you check within the account and region stated within the globals.json file?

Yes, it does exist

This seems like it's just a permissions issue that is not something we can/should control, no?

Also, FWIW: I think you might want to use a different region (not the default us-east-1) if you're getting timeouts when creating buckets, etc.