A ruby script (for *nix) to search a database for related records based given a value or a column name and value. Recursively follows and outputs related rows. Attempts to avoid excessive recursion via related record row limit default of 1000 (otherwise can be slow and may run out of memory), a default max related table depth level of 5, etc. Can also output/dump all data in specific tables or tables under a specified row count, which can be used along with diff to provide a selective table schema diff. A more complete and accurate tool for diffs and dumps is SchemaCrawler.
Install Git and Ruby.
Install RBI and Trollop:
gem install dbi gem install trollop
Clone this project.
cd ~ git clone http://github.com/garysweaver/schema-sleuth.git
Then add the following to your .bash_profile, or whatever you want:
#schema-sleuth export PATH=$PATH:~/schema-sleuth
Restart Terminal.app or whatever, and then test by doing:
ssleuth -h
Finds related data in a database. Examples: MySQL: ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234 Oracle: ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 Follow NEXT_USER_ID and PREV_USER_ID as if were USER_ID without displaying progress indicator: ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -q -x 'next_,prev_' Dump all data in tables (under maximum row count) to a list of inserts ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -e 1234 -q -i -a -m Ignore tables ending in _LOG or _STAT ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -b ^\(?\!\(\(.*_LOG\)\|\(.*_STAT\)\)\) Custom: ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from USER_TABLES" -n "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'" Usage: ssleuth [options] where [options] are: --driver-url, -r <s>: DBI driver URL. e.g. DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD --user, -u <s>: database username. e.g. jdoe --password, -p <s>: database password. e.g. secret --column-name, -c <s>: column name (case-insensitive). e.g. user_id --value, -e <s>: column value. e.g. 1234 --table-names-query, -t <s>: SQL query to use to get full listing of tables. e.g. "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'" --column-names-query, -n <s>: SQL query to use to get column names for a table. Substitute $TABLE_NAME for the table name. e.g. "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'" --generate-deletes, -l: output as delete statements --generate-inserts, -i: output as insert statements --generate-updates, -w: output as update statements --maximum-depth, -s <i>: maximum depth (default: 5) --maximum-row-count, -m <i>: maximum row count. Does not attempt to follow path if more than this many rows are found from a query. However, you can still get more than one row in result summaries, via multiple queries (default: 1000) --likes, -k: value comparisons done as likes rather than equals --column-name-partials-to-remove, -x <s>: comma-delimited list of (case-insensitive) text to remove from column names when following path. e.g. to make next_user_id and user_id match, you'd specify -x 'next_' --table-pattern, -b <s>: table names must match this regexp pattern be included, otherwise are excluded --all-data, -a: instead of following relationships, just output all table data --quiet, -q: quiet. does not output progress indicator --debug, -d: outputs debugging information --version, -v: Print version and exit --help, -h: Show this message
This would find any record with value “1234” in any column in any table in the database and use that as a starting point:
ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234
This would find any record with value “1234” in any “USER_ID” column in any table in the database and use that as a starting point:
ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234
This would dump all tables under 10 million rows from two database schemas to db1.txt and db2.txt as insert statements. This may take a really long time. Then it would diff these two files and put the result into changes.txt:
ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db2.txt && diff db1.txt db2.txt > changes.txt
This would dump all tables under 10 million rows where the table names start with AUTO from two database schemas to db1.txt and db2.txt as insert statements. Then it would diff these two files and put the result into autochanges.txt:
ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db2.txt && diff db1.txt db2.txt > autochanges.txt
MySQL:
ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234
Oracle:
ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234
Custom:
ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from ALL_ALL_TABLES" -n "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"
Searching for records related to column USER_ID and value ‘123’:
$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123 ....... SAMPLE_TABLE_A USER_ID, USER_NAME ------------ '123', 'Joe' SAMPLE_TABLE_B USER_ID, VEHICLE_ID ------------ '123', '234' '123', '235' SAMPLE_TABLE_C VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID ------------ '234', 'Kia', 'Sorento', '345' '235', 'Kia', 'Sportage', '346' SAMPLE_TABLE_D COLOR_ID, COLOR_NAME ------------ '345', 'Red' '346', 'Green'
Output the same data as insert statements:
$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123 -i ....... INSERT INTO SAMPLE_TABLE_A (USER_ID, USER_NAME) VALUES ('123', 'Joe'); INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '234'); INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '235'); INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('234', 'Kia', 'Sorento', '345'); INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('235', 'Kia', 'Sportage', '346'); INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('345', 'Red'); INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('346', 'Green');
Find all records containing ‘1234’:
$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234 SAMPLE_TABLE_E ITEM_ID, AMOUNT ------------ '12345', '1234' SAMPLE_TABLE_F BUILDING_ID, IDENT_ID, DESCRIPTION ------------ '15', '2424', '1234' '1234', '12', 'Green' SAMPLE_TABLE_G VEHICLE_ID, CUSTOMER_ID ------------ '1234', '2424' '1234', '3555'
Specify -d to debug. If debug is not specified, you can see still see progress via the dots, etc. printed. A ‘.’ means that data was retrieved. A ‘M’ indicates that the maximum-row-count was exceeded. A ‘O’ indicates that the maximum-depth was exceeded. If it encounters an error, it will display it in full glory.
Copyright © 2010-2011 Gary S. Weaver, released under the MIT license.