freelawproject/free.law

Migrate data to Neon as applicable

Closed this issue · 9 comments

We need to migrate at least the following:

  • Donor data:
    • Names and personal info
    • Past donations
    • Referrer field for donations ("fund" vs. "source"?)
  • All current CL users (see freelawproject/courtlistener#3145 for keeping this synced)
  • All subscribers in MooSend
  • And probably some other things?

A few notes:

  • We have a donor that seems to change their user/email with each donation (Jenifer has the details)

Worksheet from Neon:

Data Inventory Worksheet.pdf

templates for Account information Import:
NeonCRM_Account_Data_Template.xlsx
Standard Account Fields.pdf

Donation Data Import:
instructor_euu5evbiub4cifk9p0zgz9wxn_public_1683148834_Supported+Donation+Data+Fields.pdf

Membership Data Import:
Standard_Membership_Data_Fields.pdf
NeonCRM_Membership_Data_Template.csv

**Need to identify custom fields that we have data to import
**look at data to check how to map into standard fields

Jenifer and I did a bunch of work yesterday and today to get the data imported. It's a combination of generating files on the server and doing a hand mapping/import. Overall, it wasn't too bad and we imported both the accounts (which were based on CL accounts), and the donations (based on CL donations).

Remaining todos:

  • Import recurring donations
  • Import from Moo Send

Accounts were mapped as follows:

image

And used the following python script:

import csv
from datetime import datetime
from cl.users.models import UserProfile
up_skip_fields = [
    "username", "plaintext_preferred", "key_expires", "avatar",
    "docket_default_order_desc", "is_staff", "is_tester",
    "email_confirmed",
    "password", "activation_key", "last_login", "recap_email", "user_id",
    "is_active", "unlimited_docket_alerts", "is_superuser",
    "auto_subscribe",
    "notes", "_state", "stub_account",
]
with open('accounts.csv', 'w', newline='') as csvfile:
    for i, up in enumerate(UserProfile.objects.filter(
        user__is_active=True
    ).exclude(
        user__first_name=""
    ).exclude(
        user__last_name=""
    ).exclude(
        user__first_name__contains="<",
    ).distinct().iterator()):
        if up is None or up.user is None:
             continue
        if i == 0:
            fieldnames = set(list(up.__dict__.keys()) + list(up.user.__dict__.keys()))
            for field in up_skip_fields:
                fieldnames.remove(field)
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
        up_dict = up.__dict__.copy()
        up_dict.update(up.user.__dict__)
        for field in up_skip_fields:
            up_dict.pop(field)
        for k, v in up_dict.items():
            if isinstance(v, datetime):
                up_dict[k] = v.strftime("%m-%d-%Y")
        writer.writerow(up_dict)

Donations were mapped as follows:

image

And used the following code:

import csv
from datetime import datetime
from cl.donate.models import Donation
d_skip_fields = [
    "_state", "payment_id", "transaction_id", "date_created", "date_modified",
]
with open('donations.csv', 'w', newline='') as csvfile:
    for i, d in enumerate(Donation.objects.exclude(referrer__icontains="XERO").iterator()):
        if i == 0:
            fieldnames = list(d.__dict__.keys())
            for field in d_skip_fields:
                fieldnames.remove(field)
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
        d_dict = d.__dict__.copy()
        for field in d_skip_fields:
            d_dict.pop(field)
        # If date_donation is blank, set it to date created. This is needed
        # for credit cards that didn't go through and checks.
        if not d_dict['clearing_date']:
            d_dict['clearing_date'] = d.date_created
        for k, v in d_dict.items():
            if isinstance(v, datetime):
                d_dict[k] = v.strftime("%m-%d-%Y")
        d_dict["status"] = d.get_status_display()
        d_dict["payment_provider"] = d.get_payment_provider_display()
        writer.writerow(d_dict)

To Do/Questions:

  • Gerald Gorman - 8/1/21 donation pending?
  • [ ]
  • [ ]
  • [ ]
  • Import Memberships

Spam Emails? (may clear up after we import membership data)

Migration of donor/membership data from CL to Neon:

  1. All data from 10/6 (last migration done 10/5 - need to double check date so we don't miss anyone)
  2. simplify source field - monthly_donation_1234 all to monthly_donation from CL (look at other ones)
  3. identify members - should we make their last donation into a membership payment? Or Jan & Feb monthly donations to become membership payments?
  4. Custom field created to hold CL IDs on front end of each record (CL User ID) - need to import ALL CL IDs (should be an easy import)

Cool. We just did another round of imports:

  1. We made a two-column CSV to map CL user IDs to CL user IDs. This allowed us to populate a new custom field in Neon. It felt a bit dumb, but apparently you can have fields in Neon that have values used for mapping, but that can't be seen, so this makes it so we have a field for both of those purposes.

  2. We exported new accounts and added them to Neon using this:

    """
    Mappings were mostly fine from earlier screenshot, but we couldn't find
    the date_joined mapping, so we ignored it. Fine.
    
    Otherwise, pretty smooth.
    """
    
    
    import csv
    from datetime import datetime
    from cl.users.models import UserProfile
    up_skip_fields = [
        "username", "plaintext_preferred", "key_expires", "avatar",
        "docket_default_order_desc", "is_staff", "is_tester",
        "email_confirmed",
        "password", "activation_key", "last_login", "recap_email", "user_id",
        "is_active", "unlimited_docket_alerts", "is_superuser",
        "auto_subscribe",
        "notes", "_state", "stub_account",
    ]
    ups = (
        UserProfile.objects
        .filter(user__is_active=True, user__date_joined__gt='2023-10-06')
        .exclude(user__last_login__isnull=True)
        .exclude(user__first_name="")
        .exclude(user__last_name="")
        .exclude(user__first_name__contains="<")
        .distinct()
    )
    with open('accounts2.csv', 'w', newline='') as csvfile:
        for i, up in enumerate(ups.iterator()):
            if up is None or up.user is None:
                 continue
            if i == 0:
                fieldnames = set(list(up.__dict__.keys()) + list(up.user.__dict__.keys()) + ["CL User ID"])
                for field in up_skip_fields:
                    fieldnames.remove(field)
                writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
                writer.writeheader()
            up_dict = up.__dict__.copy()
            up_dict.update(up.user.__dict__)
            up_dict["CL User ID"] = up_dict["id"]
            for field in up_skip_fields:
                up_dict.pop(field)
            for k, v in up_dict.items():
                if isinstance(v, datetime):
                    up_dict[k] = v.strftime("%m-%d-%Y")
            writer.writerow(up_dict)
    

    It worked fine and imported about 3k accounts.

  3. We generated a CSV of donations using this:

    import csv
    from datetime import datetime
    from cl.donate.models import Donation
    d_skip_fields = [
        "_state", "payment_id", "transaction_id", "date_created", "date_modified",
    ]
    donations = (
        Donation.objects
        .exclude(referrer__icontains="XERO")
        .filter(date_created__gt="2023-10-06")
    )
    with open('donations2.csv', 'w', newline='') as csvfile:
        for i, d in enumerate(donations.iterator()):
            if i == 0:
                fieldnames = list(d.__dict__.keys())
                for field in d_skip_fields:
                    fieldnames.remove(field)
                writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
                writer.writeheader()
            d_dict = d.__dict__.copy()
            for field in d_skip_fields:
                d_dict.pop(field)
            # If date_donation is blank, set it to date created. This is needed
            # for credit cards that didn't go through and checks.
            if not d_dict['clearing_date']:
                d_dict['clearing_date'] = d.date_created
                
            # Clean up the referrer field
            referrer = d_dict["referrer"]
            if "monthly_donation" in referrer:
                d_dict["referrer"] = referrer.rsplit( '_', 1)[0]
            for k, v in d_dict.items():
                if isinstance(v, datetime):
                    d_dict[k] = v.strftime("%m-%d-%Y")
            d_dict["status"] = d.get_status_display()
            d_dict["payment_provider"] = d.get_payment_provider_display()
            writer.writerow(d_dict)
    

    Jenifer has that export and is going to remove some duplicates from it that she knows of and she will then import the file.

Next steps

  1. Membership info will come from Joe when he does his imports of our monthly donor info.
  2. Once that's done, we can decide if we want to do another donation export/import of the donations that happened between now and when Joe does his thing.
  3. Finally, we remove the Stripe code and monthly donation code from CL and consider the transition complete (woo!)

Unfortunately we will need to make one more pass at uploading data to create accounts. There are 71 donations that do not match to an account created in Neon. We limited the accounts that were added to Neon to those who had logged into their CL account.
This pass needs to include accounts where a donation was made, but they never logged into CL.

Error file from Neon: donations_round_two_02-09-2024_error.xlsx

Data fully migrated as of last download from CL.

That's a wrap then. Hooray! 🎉