db_converter
is an open-source database migration tool for PostgreSQL 9.6+ designed for high-loaded installations.
With db_converter
complex tasks become easier.
- Introduction
- Dependencies and installation
- Fast start
- How to install and run
- Terminology
- Usage modes
- Documentation
- Roadmap
- Support and contributions
The primary goal of db_converter
is to simplify the database conversion (migration) process as much as possible while maintaining flexibility and functionality.
Tasks that can be solved using db_converter
:
- Transactional modification of data of any volume
- Database structure changing with locks control
- System and application notifications via
mattermost
(or any other messenger) - Database maintenance (deleting old data, creating new schemas, etc.)
- Export data in
CSV
format into an encrypted archive
The key features are:
- Only plain SQL scripts with placeholders
- Parallel processing of several databases
- Handling of the locks to avoid impact on the regular workload
Python 3.x with modules: sqlparse
, requests
, pyzipper
yum install -y python38 # if Python 3.x is not installed
# if pip is not installed
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.8 get-pip.py
pip3.8 install sqlparse
pip3.8 install requests
pip3.8 install pyzipper
Built-in module py-postgresql.
Download and run container:
docker pull masterlee998/db_converter:dbc_pg13
docker run --name dbc -d masterlee998/db_converter:dbc_pg13
docker exec -it dbc bash
First, needs to install python and modules (see the section above).
Next, clone db_converter
from GitHub:
git clone https://github.com/masterlee998/db_converter.git
cd db_converter
python3 db_converter.py --version
>> Version 1.3
Prepare db_converter.conf
:
mv conf/db_converter.conf.example conf/db_converter.conf
# set connection credentials to the database
test_conn='pq:\/\/some_user:password@127.0.0.1:5400\/test_db_1'
sudo sed -ie "s/^test_db_1.*/dbc = $test_conn/" conf/db_converter.conf
Run dba_get_conf
packet (just displays basic DB configuration options):
# run read-only packet
python3 db_converter.py \
--packet-name=dba_get_conf \
--db-name=dbc
# Info: =====> DBC 1.0 started
# Info: =====> Hold lock for packet dba_get_conf in DB dbc
# Info: Thread 'lock_observer_dbc' runned! Observed pids: []
# Info: --------> Packet 'dba_get_conf' started for 'dbc' database!
# Info: lock_observer_dbc: iteration done. Sleep on 5 seconds...
# Info: Thread 'lock_observer_dbc': Observed pids: []
# Info: Thread 'ro_manager_db_dbc', DB 'dbc', PID 24160, Packet 'dba_get_conf', ...
# Info:
# ---------------------------------------------------------------------------
# | name | value | pretty_value | boot_val | unit |
# ---------------------------------------------------------------------------
# | autovacuum_max_workers | 3 | | 3 | None |
# | autovacuum_naptime | 60 | | 60 | s |
# ....
# Info: <-------- Packet 'dba_get_conf' finished for 'dbc' database!
# Info: Thread lock_observer_dbc finished!
# Info: <===== DBC 1.0 finished
# run in background
nohup python3 db_converter.py \
--packet-name=my_packet \
--db-name=db01
> /dev/null 2>&1 &
tail -f log/dbc_db01_my_packet.log
# run all tests
python3 tests/test_packets.py -v
# run specific test
python3 tests/test_packets.py -v TestDBCLock
Packet - is a package of changes (a directory with SQL files) that apply to the specified database. Packet contains meta_data.json
(an optional file with meta-information describing the package) and several SQL files in XX_step.sql
format.
Step - is a SQL file, the contents of which are executed in one transaction, and containing the following types of commands:
- DDL (Data Definition Language) - CREATE, DROP, ALTER, TRUNCATE, COMMENT, RENAME
- DML (Data Manipulation Language) - SELECT, INSERT, UPDATE, DELETE
- DCL (Data Control Language) - GRAND, REVOKE
Action - is a transaction formed on the basis of step
. If the step
does not have a generator
, then it creates one action
. If the step
has a generator
, then several transactions will be generated.
Generator - is a SQL file associated with some step
by index number. If there is a generator
, the step
contains placeholders for substituting the values returned by the generator
(for more details see the "Generators and Placeholders" section).
Conversion (migration, deployment) - is a transformation of the database structure according to the specified package of changes.
When executing Packet
, SQL files are applied to the specified database sequentially by the index.
db_convertrer
works in the following modes:
-
List all target databases according
--db-name
mask if the--list
key is specified -
Perform deployment - deploy the specified
packet
to the target database--db-name
-
Perform force deployment - forced deployment if the
--force
key is specified - ignore the difference between hashes of apacket
at the time of repeated execution and at the time of the first launch -
Perform sequential deployment if the
--seq
key is specified, then parallel execution is disabled (if several databases are selected), and all databases are processed sequentially according to the selected list. db_converter can process several databases in parallel. The possibility of parallelizing the conversion of one database does not make sense. -
Check packet status - display
packet
status if the--status
key is specified -
Wipe packet deployment history if the
--wipe
key is specified. Wipe means delete fromdbc_ *
tables. Removing information about an installed package can be used for debugging purposes. -
Unlock unexpectedly aborted deployment if the
--unlock
key is specified -
Stop all active transactions of unexpectedly aborted deployment if the
--stop
key is specified. It this mode, all active connections will be terminated matching withapplication_name
(specified in thedb_converter.conf
configuration file) +"_"
+--packet-name
-
Use template packet - copy
*.sql
files frompackets/templates/template
topackets/packet-name
if the--template
key is specified
Auxiliary deployment modes also provided:
-
Skip the whole step on the first error like
Deadlock
,QueryCanceledError
if the--skip-step-cancel
key is specified -
Skip action errors like
Deadlock
,QueryCanceledError
if the--skip-action-cancel
key is specified
In all deployment modes, two parameters are mandatory:
-
--db-name
- a name of directory located inpackets
-
--packet-name
- a name of one database or a comma-separated list of databases, orALL
to automatically substitute all databases listed indb_converter.conf
- Files layout and configuration
- Command-line parameters
- Action tracker
- meta_data.json description
- Transactional processing of objects and data
- Threads
- Internals
- Use cases
- Synchronization of packets between a source and target databases
- Query bot in mattermost: running packets from the browser instead of command line
- Web interface (command line replacement, collaborative work)
-
- scheduling periodic tasks to run
-
- packets development
-
- testing
-
- approving
-
- delivery to production
-
- viewing logs and deployment statuses
Bug reports and new features are appreciated and may be filed through the issue tracker.
Feel free to get technical support in Gitter.