bulk-insert-update-insert-conflict: A Python repository from arif2009

Python command-line script for PostgreSQL bulk insert, update, update-on-conflict

Nowadays data is one of the important things in the digital world. So most of the time we need to insert a list of data into the database. To insert a list of data into the database normally we do these things:

Traverse all list and insert one item at a time
Traverse all list and generate a query and insert all data at a time
Without traversing we can insert data at a time from the list (bulk insert)

#1 and #2 both process seems costly and complicated for me, so here we going to learn how can we implement #3 in Postgre database using python script.

Besides that, we learn some more things like:

Bulk update
Bulk update on conflict
How to write Python command-line script
Optional and mandatory command-line argument

Tools & Technology :

Configuration (put the appropiate info into .env file) :

DB_NAME=xyz
DB_USER=xyz
DB_PASS=xyz
DB_PORT=5432
DB_HOST=localhost

Installation :

Create database named BulkDB and table named bulktable. You can create it manually or execute script sql.sql found in the project root directory.

Create virtual environment(venv)

$ pip install virtualenv
$ virtualenv -p python3 venv
$ source ./venv/Scripts/activate (windows)
$ source venv/bin/activate (linux)

Install dependences

$ pip install -r requirements.txt

Usages:

Bulk insert:

python main.py ./data/data.json -o insert

Bulk update:

python main.py ./data/data.json -o update

Bulk update on conflict:

python main.py ./data/data.json -o update-on-conflict

Explaination:

list structure we are going to insert into database:

data = [{
    "uuid": 4294967812,
    "data": "2017-01-01",
    "min": 11.0,
    "max": 224.0,
    "avg": 0.6326606157518295
  },
  {
    "uuid": 4294967813,
    "data": "2017-01-01",
    "min": 0.0,
    "max": 224.0,
    "avg": 12.007262137125508
  }]