cci-dataset-from-prod

Demo of using SFDMU to build a CCI dataset from a prod org

Work in progress

Do not bother trying to figure out what this repo is for. It's not done yet.

Ideas for "validation tables" (see the export.json sketch after this list):

  • EDA Account where RecordType.DeveloperName is Educational_Institution (unique, non-required external ID = Account.hed__School_Code__c; referenced by hed__Education_History__c.Account__c). Not sure which record type, but another record type is referenced by hed__Application__c.hed__Applying_To__c.
  • hed__Term__c (referenced by hed__Application__c.hed__Term__c)
  • hed__Language__c (referenced by Contact_Language__c.Language__c)
  • summit__Summit_Events__c
  • summit__Summit_Events_Instance__c
  • EASY Application_Control__c (referenced by EASY Requirement__c.Application_Control__c and EASY Application__c.Application_Control__c)
  • EASY Requirement__c (referenced by EASY Requirement_Item__c.Requirement__c)
  • EASY Requirement_Item__c (referenced by EASY Question__c.Requirement_Item__c)
  • EASY Question__c (referenced by EASY Question_Response__c.Question__c)
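
For instance, a validation table like hed__Term__c could get its own read-only entry in export.json so that hed__Application__c.hed__Term__c lookups have something to resolve against. A sketch, assuming Name is unique enough to serve as the external ID:

{
    "operation": "Readonly",
    "externalId": "Name",
    "query": "SELECT updateable_true, lookup_false FROM hed__Term__c"
}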

Commands

# Export data to CSV files (targetusername csvfile), driven by sfdmu-play/export.json:
sfdx sfdmu:run --sourceusername cci-dataset-from-prod-demo__feature --targetusername csvfile --path sfdmu-play

# Generate fake data from a Snowfakery recipe and load it into the feature org:
cci task run snowfakery --recipe fakes/snowfake.yml --org feature

# Load the SQL dataset into the org, per the mapping file:
cci task run load_dataset -o mapping ccidataplay/mp.yml -o sql_path ccidataplay/dt.sql --org feature

# Run this repo's custom Python task:
cci task run try_a_python --org feature

# Deploy force-app metadata to the org:
sf deploy metadata --source-dir force-app --target-org cci-dataset-from-prod-demo__feature

# Extract org data into a SQLite database, per the mapping file:
cci task run extract_dataset -o mapping ccidataplay/mp.yml -o database_url sqlite:///ccidataplay/db.db --org feature

SFDMU config

Here is an example export.json that mostly ignores lookup fields (SFDMU's lookup_false multiselect keyword filters them out of the expanded field list) unless they've been deemed important. Don't forget to define an externalId, elsewhere in export.json, for each object that a surviving lookup points to:

{
    "objects": [
        {
            "operation": "Readonly",
            "externalId": "hed__School_Code__c, Name",
            "query": "SELECT updateable_true, lookup_false, hed__Current_Address__c, RecordType.DeveloperName FROM Account",
            "excludedFields": "IsPartner, IsCustomerPortal, CleanStatus, hed__Billing_Address_Inactive__c"
        }
    ]
}
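
Since hed__Current_Address__c survives the lookup_false filter above, the object it points at (hed__Address__c, presumably) would need its own entry so SFDMU knows which externalId to match on. A sketch, assuming hed__Address__c's Name field works as the key:

{
    "operation": "Readonly",
    "externalId": "Name",
    "query": "SELECT updateable_true, lookup_false FROM hed__Address__c"
}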

Thoughts

  1. I get that my goal is to use the handwritten Python to parse SFDMU files into mapme.yml and parts of loadme.sql (a rough sketch of that parsing step follows this list).
  2. But with what, exactly, am I going to generate the field datatypes in loadme.sql?
    • Am I going to use some sort of variation on the code found in https://github.com/kkgthb/download-salesforce-objects-and-fields-as-json ?
    • Am I going to admit that this command is pretty darned good at field type inference and do some sort of back-and-forth: run SFDMU, run custom Python to generate mapme.yml off the SFDMU output, run this command to parse mp.yml into a loadme_draft.sql, and then run custom Python to tear loadme_draft.sql to shreds, keeping just the greatest hits from the CREATE TABLE statements for loadme.sql? That sounds inefficient data-download-wise if I can't limit extract_dataset to 1 row per object.
      cci task run extract_dataset -o mapping ccidataplay/mp.yml -o sql_path ccidataplay/extracted.sql --org feature
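
Here is a rough sketch of the parsing idea from point 1, plus a toy type map for point 2's datatype question. Everything in it is an assumption rather than settled design: the file paths, the SF_TO_SQL map, the sobject_name and create_table_sql helpers, and the notion that the mapping skeleton only needs object API names to start.

# Sketch only: parse export.json into a skeleton mapme.yml, plus a helper
# that turns Salesforce field types into SQL column types for loadme.sql.
import json
import re

import yaml  # pip install pyyaml

SF_TO_SQL = {  # crude Salesforce-type -> SQLite-type map (assumption)
    "string": "VARCHAR(255)",
    "boolean": "BOOLEAN",
    "double": "REAL",
    "date": "DATE",
    "datetime": "TIMESTAMP",
    "reference": "VARCHAR(18)",  # lookup Ids
}

def sobject_name(query):
    """Pull the object API name out of an SFDMU SOQL query string."""
    match = re.search(r"\bFROM\s+(\w+)", query, re.IGNORECASE)
    if not match:
        raise ValueError(f"No FROM clause in: {query!r}")
    return match.group(1)

def create_table_sql(name, field_types):
    """field_types: {field_name: salesforce_type}, e.g. from a describe dump."""
    cols = ", ".join(
        f"{field} {SF_TO_SQL.get(sf_type, 'VARCHAR(255)')}"
        for field, sf_type in field_types.items()
    )
    return f"CREATE TABLE {name.lower()} (id VARCHAR(18) PRIMARY KEY, {cols});"

with open("sfdmu-play/export.json") as f:
    export = json.load(f)

mapping = {}
for obj in export["objects"]:
    name = sobject_name(obj["query"])
    mapping[f"Insert {name}"] = {
        "sf_object": name,
        "table": name.lower(),
        "fields": [],  # would come from parsing the query or a describe call
    }

with open("ccidataplay/mapme.yml", "w") as f:
    yaml.safe_dump(mapping, f, sort_keys=False)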

1/6/23, 1:07PM: Maybe I should be editing db.db in place, since it's already got all the data types correct. Theoretically, I just need to clean up primary keys and foreign key references. 2:35PM: I could even use SF schema info to add "ON UPDATE CASCADE" constraints to foreign key fields in data tables, and then I could just update the primary keys. I think.
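
A sketch of that in-place idea, with made-up, simplified table and column names. SQLite can't ALTER TABLE ... ADD CONSTRAINT, so the child table has to be rebuilt with the foreign key declared ON UPDATE CASCADE; after that, rewriting a parent primary key propagates into the children:

import sqlite3

con = sqlite3.connect("ccidataplay/db.db")
con.executescript("""
    PRAGMA foreign_keys = OFF;  -- off while restructuring
    ALTER TABLE hed__Application__c RENAME TO hed__Application__c_old;
    CREATE TABLE hed__Application__c (
        id VARCHAR(18) PRIMARY KEY,
        hed__Term__c VARCHAR(18)
            REFERENCES hed__Term__c (id) ON UPDATE CASCADE
    );
    INSERT INTO hed__Application__c
        SELECT id, hed__Term__c FROM hed__Application__c_old;
    DROP TABLE hed__Application__c_old;
""")
con.execute("PRAGMA foreign_keys = ON")  # per-connection setting
# Renumber a parent key; the cascade fixes hed__Application__c.hed__Term__c:
con.execute(
    "UPDATE hed__Term__c SET id = ? WHERE id = ?",
    ("TERM-0001", "a0B000000000001AAA"),
)
con.commit()
con.close()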

That, or I could have both DBs open at once and insert between them; sketch below.
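
SQLite's ATTACH DATABASE makes that two-DBs-at-once idea pretty painless (the file and table names here are again assumptions):

import sqlite3

con = sqlite3.connect("ccidataplay/db.db")
con.execute("ATTACH DATABASE 'ccidataplay/extracted.db' AS src")
# Copy every row of a table from the attached DB into the main one:
con.execute("INSERT INTO main.hed__Term__c SELECT * FROM src.hed__Term__c")
con.commit()
con.execute("DETACH DATABASE src")
con.close()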