/dms-cdk


Introduction

This CDK application contains CDK constructs that provision AWS DMS (Database Migration Service) resources. The solution is primarily based on an RDS MySQL database as both the source and the target of the migration. The application provisions the following resources:

  1. A DMS subnet group : The subnet group allows the DMS service to choose a subnet and an IP address within that subnet. The DMS service uses IP addresses from this subnet group when provisioning resources such as the DMS replication instance.
  2. A DMS replication instance : A managed instance on which DMS replication tasks run
  3. DMS database endpoints : Database connections to the source and target databases
  4. DMS task(s) : Task(s) for performing data replication with configurable settings

Prerequisites

  1. Connectivity: The DMS service acts as a "hub" between the source and target databases, so connectivity is needed from the DMS service to both the source and the target database.

    Note: If the source database is located on-premises, ensure that the database port (default 3306 for MySQL) is opened in your corporate firewall. For the target endpoint, ensure that your target RDS security group allows this connectivity. The DMS service automatically validates the endpoint for connectivity during installation.

  2. Secrets Manager to store database configurations : The solution assumes that all database configurations are stored in AWS Secrets Manager. Configure a secret for both the source and the target database with the parameters needed to establish connectivity, such as hostname, port, database and password. Use the command below to list the secrets in your account.

      aws secretsmanager list-secrets
    
    
  3. VPC endpoints for DMS and Secrets Manager : These ensure that traffic is routed via the AWS backbone, which provides additional security. To check the VPC endpoints, run the commands below and look in the JSON output for service names like 'com.amazonaws.<region>.dms' and 'com.amazonaws.<region>.secretsmanager'

      aws ec2 describe-vpc-endpoints | grep dms
      aws ec2 describe-vpc-endpoints | grep secretsmanager
    
    
  4. AWS CDK v2.30.x or later

  5. Node.js v17.x or later

  6. Source and target databases (MySQL v8, Oracle and Aurora PostgreSQL) : Check the DMS endpoints in the AWS Console > Services > DMS and make sure the connection test succeeds.
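
The secrets described in item 2 can be sanity-checked before deployment. The sketch below is a minimal, hypothetical validator and not part of this repository. The key names (`host`, `port`, `username`, `password`) are an assumption based on the JSON shape DMS expects for Secrets Manager-backed endpoints; adjust them to match how your secrets are actually stored. `DbSecret` and `parseDbSecret` are illustrative names.

```typescript
// Hypothetical shape of a database secret; the key names are an assumption.
interface DbSecret {
  host: string;
  port: number;
  username: string;
  password: string;
}

// Parse the SecretString value of a secret and verify that the required keys exist.
function parseDbSecret(secretString: string): DbSecret {
  const raw: unknown = JSON.parse(secretString);
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('Secret must be a JSON object');
  }
  const obj = raw as Record<string, unknown>;

  const missing = ['host', 'port', 'username', 'password'].filter(
    (key) => obj[key] === undefined || obj[key] === '',
  );
  if (missing.length > 0) {
    throw new Error(`Secret is missing required keys: ${missing.join(', ')}`);
  }

  // Accept the port as a number or a numeric string, then range-check it.
  const port = Number(obj['port']);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${String(obj['port'])}`);
  }

  return {
    host: String(obj['host']),
    port,
    username: String(obj['username']),
    password: String(obj['password']),
  };
}
```

Running a check like this against both the source and the target secret catches a missing key or a malformed port before the DMS endpoint connection test does.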

Solution Architecture

The high-level architecture for DMS is shown below. The main components involved are:

  1. A dedicated network (VPN or Direct Connect) with connectivity to the customer's VPC (Virtual Private Cloud)
  2. A source database, preferably with a dedicated read replica. The replica offloads the read load generated during migration from the primary database.
  3. A target RDS database instance to which the data is migrated.
  4. Source and target DMS endpoints with database credentials stored in AWS Secrets Manager
  5. A DMS replication instance and a number of replication tasks for replicating the data
  6. A DMS subnet group, through which the DMS service provisions resources into your subnets for connecting to the databases. Remember that the DMS service itself is hosted inside AWS.

[Figure: DMS Architecture]

Installation

Ensure that all prerequisites are in place. The source code repository is located at https://github.com/aws-samples/dms-cdk/

  1. Install Node.js and npm. See https://nodejs.org/en/download/package-manager/#macos. To check the version, use the command below

      node --version
    
  2. Install the CDK for TypeScript (https://docs.aws.amazon.com/cdk/latest/guide/getting_started.html) or use the commands below to install the CDK and check its version

      npm install -g aws-cdk
    
      npm audit fix --force
    
      cdk --version
    
    
  3. Create a service role "dms-vpc-role" : A service role named "dms-vpc-role" is needed if you are deploying DMS for the first time via the CLI or CloudFormation in your account. The following steps create the "dms-vpc-role" IAM role. For more details see the link dms-vpc-role.

    • Create a JSON policy file named dmsAssumeRolePolicyDocument.json with the following content:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Service": "dms.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
          }
        ]
      }
      
    • Create the required IAM role (dms-vpc-role)

        aws iam create-role --role-name dms-vpc-role --assume-role-policy-document file://dmsAssumeRolePolicyDocument.json
      
    • Attach the existing policy to the new role

        aws iam attach-role-policy --role-name dms-vpc-role --policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole
      
      
  4. Create an AWS CLI profile named 'dms' (if you do not have one)

        aws configure --profile dms
    
  5. Configure the cdk.json file in the project root folder. The cdk.json file is where you configure the target account for the deployment, along with settings such as the VPC, subnet IDs and the database schemas to migrate. Example of cdk.json

    "context": {
        "environment": "dev",
        "account": "111111111111",
    
        "dev": {
          "region": "eu-central-1",
          "vpcId": "vpc-xxxxxxxxxxxxx",
          "subnetIds": [
            "subnet-xxxxxxxxxxxxxxxxa",
            "subnet-xxxxxxxxxxxxxxxxb"
          ],
          "vpcSecurityGroupIds": [
            "sg-xxxxxxxxxxxxxxxx"
          ],
          "schemas": [
            {
              "name": "demo-src-db",
              "sourceSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/aaa-xxxxpp",
              "targetSecretsManagerSecretId": "arn:aws:secretsmanager:eu-central-1:111111111111:secret:dev/mysql/bbb-xxxxpp"
            }
          ],
          "replicationInstanceClass": "dms.r5.4xlarge",
          "replicationInstanceIdentifier": "dms-dev-eu",
          "replicationSubnetGroupIdentifier": "dms-dev-subnet-eu",
          "replicationTaskSettings": {
          },
          "migrationType": "full-load"
        }
      }
    
    
  6. Go to your project root directory. Compile and test the solution.

     cd dms-cdk
    
     npm install
    
     npm run build && npm test
    
  7. Deploy the solution to your account based on cdk.json configuration. Use the 'dms' profile created previously in the cdk deploy command

     npx cdk deploy --profile dms
    
  8. Post validation : If the deployment is successful, you will see the CloudFormation stack under AWS Services > CloudFormation in the AWS console. The following resources are created by this deployment.

  • AWS::DMS::ReplicationSubnetGroup
  • AWS::DMS::ReplicationInstance
  • AWS::IAM::Role
  • AWS::IAM::Policy
  • AWS::DMS::Endpoint
  • AWS::DMS::ReplicationTask
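
The cdk.json example in step 5 pairs a top-level "environment" key with a settings block of the same name ("dev"). The sketch below shows one plausible way such a lookup could work; it is an illustration under that assumption, not the repository's actual code. `EnvSettings` and `resolveEnvSettings` are hypothetical names. In a real CDK app the values would typically be read with `app.node.tryGetContext(...)`; a plain object stands in here so the sketch runs on its own.

```typescript
// Hypothetical subset of the per-environment settings from the cdk.json example.
interface EnvSettings {
  region: string;
  vpcId: string;
  subnetIds: string[];
  [key: string]: unknown; // remaining settings are passed through untouched
}

// Pick the settings block named by the "environment" key of the context.
function resolveEnvSettings(context: Record<string, unknown>): EnvSettings {
  const environment = context['environment'];
  if (typeof environment !== 'string') {
    throw new Error('context.environment must be a string such as "dev"');
  }

  const settings = context[environment];
  if (typeof settings !== 'object' || settings === null) {
    throw new Error(`No settings block found for environment "${environment}"`);
  }

  const block = settings as Record<string, unknown>;
  const region = block['region'];
  const vpcId = block['vpcId'];
  const subnetIds = block['subnetIds'];
  if (typeof region !== 'string' || typeof vpcId !== 'string') {
    throw new Error(`"${environment}" must define region and vpcId as strings`);
  }
  if (!Array.isArray(subnetIds) || subnetIds.length === 0) {
    throw new Error(`"${environment}" must define at least one subnet id`);
  }

  return { ...block, region, vpcId, subnetIds: subnetIds.map(String) };
}
```

Failing fast on a missing block or an empty subnet list gives a clearer error at synth time than a failed CloudFormation deployment would.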

CDK construct Overview

The solution described in this post relies mainly on two classes, DMSReplication and DMSStack, and uses the out-of-the-box CDK construct library '@aws-cdk/aws-dms'. The solution is primarily designed for RDS MySQL databases. However, it can easily be adapted or extended for use with other databases like PostgreSQL or Oracle.

  • DMSReplication class is a construct responsible for creating resources such as the replication instance, task settings, subnet group and the IAM role for accessing AWS Secrets Manager. Note that all database credentials are stored in AWS Secrets Manager.

  • DMSStack class is a stack that uses the DMSReplication construct to provision the DMS resource(s) based on the parameters defined in cdk.json

  • ContextProps is an interface that maps input parameters from the cdk.json file in a type-safe manner. The cdk.json file includes settings related to DMS resources, such as the replication instance, task settings and logging. ContextProps also defines default values, which you can override by defining them in the cdk.json file

    [Figure: dms-cdk-classes]

      import * as cdk from 'aws-cdk-lib';
      import DmsStack from '../lib/dms-stack';
    
      const app = new cdk.App();
      const dmsStack = new DmsStack(app, 'DmsOraclePostGresStack', {
        vpcId: 'vpc-id',
        subnetIds: ['subnet-1a', 'subnet-1b'],
        replicationInstanceClass: 'dms.r5.4xlarge',
        replicationInstanceIdentifier: 'test-repl-01',
        replicationSubnetGroupIdentifier: 'subnet-group',
        vpcSecurityGroupIds: ['vpc-sg'],
        engineVersion: '3.4.6',
        tasks: [
          {
            name: 'demo_stack',
            sourceSecretsManagerSecretId: 'sourceSecretsManagerSecretId',
            targetSecretsManagerSecretId: 'targetSecretsManagerSecretId',
            migrationType: 'cdc',
            engineName: 'oracle',
            targetEngineName: 'aurora-postgresql',
            tableMappings: {
              rules: [
                {
                  'rule-type': 'selection',
                  'rule-id': '1',
                  'rule-name': '1',
                  'object-locator': {
                    'schema-name': 'demo_test',
                    'table-name': '%',
                  },
                  'rule-action': 'include',
                },
              ],
            },
          },
        ],
        publiclyAccessible: true,
        allocatedStorage: 50,
        env: {
          account: '111111111111',
          region: 'eu-central-1',
        },
      });
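
The `tableMappings` value in the example above is a list of DMS selection rules, each of which needs a unique rule id. When several schemas are migrated, writing the rules by hand gets repetitive. The helper below is a small sketch that generates them; `SelectionRule` and `includeSchemas` are hypothetical names that do not exist in this repository, and only the rule keys already shown in the example are used.

```typescript
// One DMS table-mapping selection rule, using the same keys as the example above.
interface SelectionRule {
  'rule-type': 'selection';
  'rule-id': string;
  'rule-name': string;
  'object-locator': { 'schema-name': string; 'table-name': string };
  'rule-action': 'include' | 'exclude';
}

// Build one "include" rule per schema, numbering the rules from 1 so ids stay unique.
// The default table pattern '%' matches every table in the schema.
function includeSchemas(
  schemaNames: string[],
  tablePattern = '%',
): { rules: SelectionRule[] } {
  const rules = schemaNames.map(
    (schema, index): SelectionRule => ({
      'rule-type': 'selection',
      'rule-id': String(index + 1),
      'rule-name': String(index + 1),
      'object-locator': { 'schema-name': schema, 'table-name': tablePattern },
      'rule-action': 'include',
    }),
  );
  return { rules };
}
```

Calling `includeSchemas(['demo_test'])` produces the same single-rule mapping as the example above, so the result could be passed as a task's `tableMappings` value.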
    

Configuration Options

In the cdk.json file you define the DMS related settings.

| Params | Description | Default | Required |
|---|---|---|---|
| environment | Name of the environment identifier used to deploy the CDK application | dev | y |
| subnetIds | Subnet IDs of your VPC, used to create the subnet group for the DMS service | | y |
| replicationInstanceClass | The type of DMS instance class | dms.t3.medium | n |
| replicationSubnetGroupIdentifier | The identifier of the subnet group where the replication instance is located | | n |
| replicationInstanceIdentifier | The unique identifier of the replication instance | | y |
| replicationTaskSettings | DMS task settings that override the default values in task-settings.ts | | n |
| migrationType | Full load or change data capture (full-load or cdc) | full-load | y |
| sourceSecretsManagerSecretId | ARN of the Secrets Manager secret with the source database credentials | | y |
| targetSecretsManagerSecretId | ARN of the Secrets Manager secret with the target database credentials | | y |
| allocatedStorage | Storage space for the replication instance (should be based on log size) | 50g | n |
| engineName | Supported source databases: oracle, postgres, sqlserver, mysql | mysql | y |
| targetEngineName | Supported target databases: oracle, postgres, sqlserver, aurora-postgresql | mysql | y |
| engineVersion | DMS engine version | 3.4.6, 3.4.7 | n |
| targetEngineVersion | The name of target database engine (mysql, oracle and aurora-postgresql) | mysql | n |
| databaseName | Name of the database to be migrated. Relevant for oracle, postgres... | false | n |
| publiclyAccessible | Whether the DMS endpoint is publicly accessible | false | n |
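
To illustrate how required options and defaults from the list above interact, here is a minimal sketch that fills in defaults for a handful of the options. It is not the repository's ContextProps implementation; `DmsOptions`, `DEFAULTS` and `withDefaults` are hypothetical names, only a subset of the options is covered, and the storage default is expressed as the number 50 (in GB) on the assumption that this matches the "50g" entry.

```typescript
// Hypothetical subset of the options listed above; names match the cdk.json keys.
interface DmsOptions {
  replicationInstanceIdentifier: string;
  subnetIds: string[];
  replicationInstanceClass: string;
  migrationType: 'full-load' | 'cdc';
  allocatedStorage: number; // in GB (assumption)
  publiclyAccessible: boolean;
}

// Default values as listed for the optional settings.
const DEFAULTS = {
  replicationInstanceClass: 'dms.t3.medium',
  migrationType: 'full-load',
  allocatedStorage: 50,
  publiclyAccessible: false,
} as const;

type RequiredKeys = 'replicationInstanceIdentifier' | 'subnetIds';

// Validate the required options, then fall back to defaults for anything omitted.
function withDefaults(
  input: Pick<DmsOptions, RequiredKeys> & Partial<DmsOptions>,
): DmsOptions {
  if (input.replicationInstanceIdentifier.trim() === '') {
    throw new Error('replicationInstanceIdentifier is required');
  }
  if (input.subnetIds.length === 0) {
    throw new Error('subnetIds must contain at least one subnet id');
  }
  return {
    replicationInstanceIdentifier: input.replicationInstanceIdentifier,
    subnetIds: input.subnetIds,
    replicationInstanceClass:
      input.replicationInstanceClass ?? DEFAULTS.replicationInstanceClass,
    migrationType: input.migrationType ?? DEFAULTS.migrationType,
    allocatedStorage: input.allocatedStorage ?? DEFAULTS.allocatedStorage,
    publiclyAccessible: input.publiclyAccessible ?? DEFAULTS.publiclyAccessible,
  };
}
```

Using `??` rather than object spread means an option explicitly set to `undefined` still receives its default, which is usually what is wanted when values come from a JSON file.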

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.