For a database migration, schema validation plays a vital role. This Python based validation tool helps in finding the gaps between source and target database objects like Tables, Views, Indexes, Keys, Stored procedures, Functions, Triggers and Sequences with encrypted connection. It generates an output summary and a detailed report in formats such as Microsoft Excel (xlsx) and html. The report explains the object level match percentage (%) per database, per schema and list of missing objects at target referring to the source.
The following diagram explains the high level flow of the tool considering on-premise as source and AWS Cloud as target
Note: Tool can be deployed either on source or target system following the pre-requisites.
- Python 3.10.x
- ODBC Driver for SQL Server
- Connectivity to the source and target databases from the host machine.
- "Select" permissions on metadata tables or views for the user executing queries on source and target databases.
- AWS CLI configured in the host machine.
Note: Make sure the user has admin privileges to install the pre-requisites.
- Currently supported on Windows and macOs.
Note: To fix few installation errors on macOs, please go through the following steps.
- To fix the error - "ERROR: Failed building wheel for pymssql", uninstall pymssql using the following command
pip3 uninstall pymssql
- Re-install pymssql
pip3 install pymssql
- To fix the error - "Error occurred while executing the MsSQL query: (pyodbc.Error) ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 18 for SQL Server' : file not found (0) (SQLDriverConnect)")" , install the ODBC driver for mac Operating system
- SQL Server
- Oracle
- PostgreSQL
- Mysql
Note: For Oracle to Postgres validation, please refer the post Schema and code validator for Oracle to Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL migration
Download the code from git repository to the host machine. Open the command prompt, and run one of the following commands to check the Python version installed
python --version
python3 --version
Based on the Python version installed on the host machine, navigate to the code downloaded directory, run one of the following commands with appropriate version requirements file.
For example if you have Python 3.10.x installed on your host machine, run the following command
python -m pip install -r requirements3.10.txt
python3 -m pip install -r requirements3.10.txt
The details for the databases such as host names, username, password, etc. need to be specified in separate AWS Secret Manager secrets which provide more security for your credentials.
The following steps explain how to create a secret:
-
Open the Secrets Manager Console
-
Ensure that you are in the correct region
-
Click on Store a new secret
-
Select Other type of secret
-
Under Plaintext specify your database details in the following manner:
{ "username": "<username>", "password": "<password>", "host": "<database_hostname>", "port": "<port>", "database_name": "<database_name>" }
-
Select the Encryption key to be used for secret encryption. Ensure that the role/user used on the machine running the tool has appropriate permissions to access the key and the secret. You may refer to this document
-
Specify name and other details for the secret and other details and save the secret and note the name for it as it will be used in the next step.
Provide the details for the source, target, file-format, Secrets manager region and logging level in the configuration file "configurations.ini" in conf folder of tool source directory. Specify the details of the Secrets Manager secret for source and target, file format and logging details in the following format.
[source]
SOURCE_DATABASE_TYPE = mssql
SOURCE_SECRET_ID = <source_database_secrets_id>
[target]
TARGET_DATABASE_TYPE = postgres
TARGET_SECRET_ID = <target_database_secrets_id>
[region]
SECRET_REGION = <secrets_manager_secret_region>
[file-format]
FILE_FORMAT = html
[logging]
DEBUG_LEVEL = INFO
The following are the allowed input values for SOURCE_DATABASE_TYPE, TARGET_DATABASE_TYPE and FILE_FORMAT respectively:
SQL Server = mssql
MySQL = mysql
Oracle = oracle
PostgreSQL = postgres
FILE_FORMAT = html
FILE_FORMAT = xlsx
Also ensure that the AWS credentials have been setup on the machine where the tool is ran. You can follow this document for setting up access to AWS.
To generate a output validation report, execute one of the following commands
python src/main.py
python3 src/main.py
Note: In case of any error related to Microsoft ODBC Driver Manager, please install related driver from link and re-execute the report generation command.
Tool will generate a report with name in format of "migration_summary_SOURCE_to_TARGET_TIMESTAMP.FILEFORMAT" to the output folder in the tool source directory.
Example: File - /Users/username/Downloads/multi-database-schema-validator-main/output/migration_summary_mssql_to_postgres_20230525_084217.html
Detail level logging is written to the logs folder in the tool source directory.
Example: /Users/username/Downloads/multi-database-schema-validator-main/logs/run_log_20230525_084208.log
Note: These details can also be set using environment variables with the same names and those would override the details in the configuration
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.