Oracle Data Pump utilities

Use case: we want to do export and import one big Oracle database via Data Pump tool. This big database has hundreds tables. Some are really big and some are not. We need to divide tables of this database into groups so that we can do parallel export and import for each individual group.

1 data pump export

1.1 Get all table names using queries below

SELECT OWNER || '.' || OBJECT_NAME FROM ALL_OBJECTS WHERE OWNER = 'SCHEMA_NAME' AND  OBJECT_TYPE = 'TABLE';

OR

SELECT 'SCHEMA_NAME' || '.' || OBJECT_NAME FROM USER_OBJECTS WHERE  OBJECT_TYPE = 'TABLE';

replace SCHEMA_NAME here with real schema-name.

1.2 Divide tables into groups

We divide tables into groups and each group will have some portion of tables. You are trying to create balanced groups. Group with big table will have small portion of tables. Name conventions for each group are defined as below. For example, group 1 related are defined as below.

# export par file 
expdp_{schema_name}_grp1.par 
# export dump file 
expdp_{schema_name}_grp1%u.dump
# export log file 
expdp_{schema_name}_grp1.log
# export job name 
expdp_{schema_name}_grp1

The export par file of group1- expdp_{schema_name}_grp1.par is defined as below.

userid='/ as sysdba'
CONTENT=ALL
DIRECTORY=dpump_dir
parallel=3
TABLES=
SCHEMA_NAME.TABLE1,
SCHEMA_NAME.TABLE2,
SCHEMA_NAME.TABLE3,
SCHEMA_NAME.TABLE4,
SCHEMA_NAME.TABLE5,
SCHEMA_NAME.TABLE6,
SCHEMA_NAME.TABLE7,
SCHEMA_NAME.TABLE8,
SCHEMA_NAME.TABLE9,
SCHEMA_NAME.TABLE10,
SCHEMA_NAME.TABLE11,
SCHEMA_NAME.TABLE12,
SCHEMA_NAME.TABLE13,
SCHEMA_NAME.TABLE14,
SCHEMA_NAME.TABLE15,
SCHEMA_NAME.TABLE16,
SCHEMA_NAME.TABLE17,
SCHEMA_NAME.TABLE18

dumpfile=expdp_SCHEMA_NAME_grp1%u.dmp
JOB_NAME=expdp_SCHEMA_NAME_grp1
logfile=expdp_SCHEMA_NAME_grp1.log

flashback_time="TO_TIMESTAMP (TO_CHAR (SYSDATE, 'YYYY-MM-DD HH24:MI:SS'), 'YYYY-MM-DD HH24:MI:SS')"
filesize=4G

Replace with real SCHEMA_NAME and TABLE_NAME you got from last step.

1.3 Run real data pump export

expdp parfile=expdp_{schema_name}_grp1.par  &
expdp parfile=expdp_{schema_name}_grp2.par  &
expdp parfile=expdp_{schema_name}_grp3.par  &
expdp parfile=expdp_{schema_name}_grp4.par  &
expdp parfile=expdp_{schema_name}_grp5.par  &

2 data pump import

Name conventions of import files:

# impdp par file 
impdp_{schema_name}_grp1.par 
# exported dump file 
expdp_{schema_name}_grp1%u.dump
# import log file 
impdp_{schema_name}_grp1.log
# import job name 
impdp_{schema_name}_grp1

The import par file of group1- impdp_{schema_name}_grp1.par is defined as below.

userid='/ as sysdba'
CONTENT=DATA_ONLY
DIRECTORY=dpump_dir
parallel=3
TABLES=
SCHEMA_NAME.TABLE1,
SCHEMA_NAME.TABLE2,
SCHEMA_NAME.TABLE3,
SCHEMA_NAME.TABLE4,
SCHEMA_NAME.TABLE5,
SCHEMA_NAME.TABLE6,
SCHEMA_NAME.TABLE7,
SCHEMA_NAME.TABLE8,
SCHEMA_NAME.TABLE9,
SCHEMA_NAME.TABLE10,
SCHEMA_NAME.TABLE11,
SCHEMA_NAME.TABLE12,
SCHEMA_NAME.TABLE13,
SCHEMA_NAME.TABLE14,
SCHEMA_NAME.TABLE15,
SCHEMA_NAME.TABLE16,
SCHEMA_NAME.TABLE17,
SCHEMA_NAME.TABLE18

dumpfile=expdp_{schema_name}_grp1%u.dump
JOB_NAME=imp_SCHEMA_NAME_grp1
logfile=impdp_SCHEMA_NAME_grp1.log

logtime=all
TABLE_EXISTS_ACTION=TRUNCATE

2.1 Run import

impdp parfile=impdp_{schema_name}_grp1.par  &
impdp parfile=impdp_{schema_name}_grp2.par  &
impdp parfile=impdp_{schema_name}_grp3.par  &
impdp parfile=impdp_{schema_name}_grp4.par  &
impdp parfile=impdp_{schema_name}_grp5.par  &

3 import individual tables

In this cases, we just want to import individual tables across all groups, not all of them.

create the table_dmp_file.mapping file,

expdp_SCHEMA_NAME_grp1.par:TABLE1
expdp_SCHEMA_NAME_grp1.par:TABLE2
expdp_SCHEMA_NAME_grp1.par:TABLE3
expdp_SCHEMA_NAME_grp1.par:TABLE4
expdp_SCHEMA_NAME_grp1.par:TABLE5
expdp_SCHEMA_NAME_grp1.par:TABLE6
expdp_SCHEMA_NAME_grp1.par:TABLE7
expdp_SCHEMA_NAME_grp1.par:TABLE8
expdp_SCHEMA_NAME_grp1.par:TABLE9
expdp_SCHEMA_NAME_grp1.par:TABLE10
expdp_SCHEMA_NAME_grp2.par:TABLE11
expdp_SCHEMA_NAME_grp2.par:TABLE12
expdp_SCHEMA_NAME_grp2.par:TABLE13
expdp_SCHEMA_NAME_grp2.par:TABLE14
expdp_SCHEMA_NAME_grp2.par:TABLE15
expdp_SCHEMA_NAME_grp2.par:TABLE16
expdp_SCHEMA_NAME_grp3.par:TABLE17
expdp_SCHEMA_NAME_grp3.par:TABLE18
expdp_SCHEMA_NAME_grp3.par:TABLE19
expdp_SCHEMA_NAME_grp3.par:TABLE20
expdp_SCHEMA_NAME_grp3.par:TABLE21
expdp_SCHEMA_NAME_grp3.par:TABLE22
expdp_SCHEMA_NAME_grp4.par:TABLE23
expdp_SCHEMA_NAME_grp4.par:TABLE24
expdp_SCHEMA_NAME_grp4.par:TABLE25
expdp_SCHEMA_NAME_grp4.par:TABLE26
expdp_SCHEMA_NAME_grp4.par:TABLE27
expdp_SCHEMA_NAME_grp4.par:TABLE28
expdp_SCHEMA_NAME_grp4.par:TABLE29
expdp_SCHEMA_NAME_grp4.par:TABLE30
expdp_SCHEMA_NAME_grp4.par:TABLE31
expdp_SCHEMA_NAME_grp4.par:TABLE32
expdp_SCHEMA_NAME_grp4.par:TABLE33
expdp_SCHEMA_NAME_grp5.par:TABLE34
expdp_SCHEMA_NAME_grp5.par:TABLE35
expdp_SCHEMA_NAME_grp5.par:TABLE36
expdp_SCHEMA_NAME_grp5.par:TABLE37

Create a template impdp par file - impdp_SCHEMA_NAME_TABLE_TEMPLATE.par

userid='/ as sysdba'
CONTENT=DATA_ONLY
DIRECTORY=dpump_dir
parallel=3
TABLES=
SCHEMA_NAME.TABLE_NAME

dumpfile=DMPFILELOCATION
JOB_NAME=impdp_SCHEMA_NAME_TABLE_NAME
logfile=impdp_SCHEMA_NAME_TABLE_NAME.log

logtime=all
TABLE_EXISTS_ACTION=TRUNCATE

Generate import par file for each table via this script run_datapump_import_generate.sh

#!/bin/bash
SCHEMA_NAME=SCHEMA_NAME
DMP_FOLDER=dpump_dir
IMPORT_PAR_TEMPLATE=impdp_SCHEMA_NAME_TABLE_TEMPLATE.par

while read line
do
     TABLE_MAPPING=$line
     IFS=':' read -r -a arr <<< "$TABLE_MAPPING"
     DMP_FILENAME=${DMP_FOLDER}/${arr[0]}%u.dmp
     TABLE_NAME=${arr[1]}
     IMPORT_DMP_PAR_FILE=$DMP_FOLDER/impdp_${SCHEMA_NAME}_$TABLE_NAME\.par
     sed -e "s/TABLE_NAME/$TABLE_NAME/g" -e "s/DMPFILELOCATION/$DMP_FILENAME/g" $IMPORT_PAR_TEMPLATE > $IMPORT_DMP_PAR_FILE
done < table_dmp_file.mapping

import the individual table one by one via run_datapump_import_individual_table.sh

#!/bin/bash
SCHEMA_NAME=SCHEMA_NAME
DMP_FOLDER=dpump_dir

while read line
do
     TABLE_MAPPING=$line
     IFS=':' read -r -a arr <<< "$TABLE_MAPPING"
     TABLE_NAME=${arr[1]}
     IMPORT_DMP_PAR_FILE=$DMP_FOLDER/impdp_${SCHEMA_NAME}_${TABLE_NAME}\.par
     # do impdp
     impdp parfile=$IMPORT_DMP_PAR_FILE
done < table_dmp_file.mapping