Distributed Transactions with Subset Schema Design Pattern

1. Setup a MongoDB 4.2 Sharded Cluster on MongoDB Atlas (2 shards is sufficient)

2. Setup the IMDB and IMDB_Cast databases on Shard0 and Shard1 respectively as follows:

use IMDB
db.dropDatabase()
sh.enableSharding("IMDB");
sh.shardCollection("IMDB.movies_metadata", { _id: 1 } );
sh.shardCollection("IMDB.movies_credits", { _id: 1 } );
db.adminCommand({ movePrimary: "IMDB", to: "sh_0" });

use IMDB_Cast
db.dropDatabase()
sh.enableSharding("IMDB_Cast");
sh.shardCollection("IMDB_Cast.cast_crew", { _id: 1 } );
db.adminCommand({ movePrimary: "IMDB_Cast", to: "sh_1" });

3. Extract the IMDB_data.zip file. This contains the 'IMDB__movies_credits.json', 'IMDB__movies_metadata.json' and 'IMDB_Cast__cast_crew.json' files.__

4. Import the data to each database as follows:

mongoimport --host YOURHOST --ssl --username YOURUSER --password YOURPWD --authenticationDatabase admin --db IMDB --collection movies_credits --type JSON --file IMDB__movies_credits.json

mongoimport --host YOURHOST --ssl --username YOURUSER --password YOURPWD --authenticationDatabase admin --db IMDB --collection movies_metadata --type JSON --file IMDB__movies_metadata.json

mongoimport --host YOURHOST --ssl --username YOURUSER --password YOURPWD --authenticationDatabase admin --db IMDB_Cast --collection cast_crew --type JSON --file IMDB_Cast__cast_crew.json

5. Execute the script SubsetPatternWithIMDB_DistTx.js. This will traverse the first 100 documents in the IMDB.movies_credits collection and populate the IMDB_Cast.cast_crew.roles and IMDB.movies_metadata.main_actors arrays using a distributed transaction.

6. Proof: The IMDB_Cast.cast_crew collection will have a roles array with a subset of the information of every movie the actress/actor as starred in. The IMDB.movies_metadata collection will have a main_actors array with a subset of the information of every actress/actor that starred in the movie.

snarvaez/mdb-distributedtx-subset

Distributed Transactions with Subset Schema Design Pattern