law-journal-backfiles

Preparing Law Journal Backfiles for Upload to a Digital Commons Repository

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0)

Introduction

Preparing the .csv file for upload takes a fair amount of manual and scripted work. The workflow below lists the steps in order.

Workflow

  1. Run title_clean.py on a text document containing the title fields from Hein's .csv file
    • to change the way Hein has written titles that end in ", A", ", An", or ", The"
  2. In Hein's .csv, split the "Author" column with text to columns, delimited by semicolon (to separate multiple authors)
    • first insert at least three columns to the right of "Author", or the split data will overwrite the neighboring columns
    • run text to columns on each author column, delimited by comma (insert one column to the right first)
    • use Excel's TRIM() function to remove the leading space
      • there must be an easier way to do this in Python ??
    • run text to columns for mname and suffix, delimited by space (with two columns inserted to the right)
    • arrange the columns to match the fname | mname | lname | suffix pattern
    • setting the column width to 15 makes the sheet a little easier to handle
  3. Run fpage.py
  4. Download a copy of the .xls sheet used for uploading from the bepress manage submissions site and save it as journ_clean.xls
  5. Make a copy of this file and name it journ_prep.xls (you'll eventually copy and paste from journ_prep.csv into this file and use it as your base of operations for individual issue batch uploads)
  6. Add entries to the fulltext_url field using an Excel concatenation function - add directory as prefix, use XX
  7. Sort
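
Steps 1 and 2 above could also be done in Python rather than by hand in Excel. The following is only a minimal sketch, not the code in title_clean.py. It assumes that the goal of step 1 is to move a trailing ", A", ", An", or ", The" back to the front of the title, and that Hein writes each author as "Last, First Middle Suffix" with multiple authors separated by semicolons. The function names and the SUFFIXES set are hypothetical and would need adjusting to the real data.

```python
import re
from typing import Dict, List

# Matches titles such as "Theory of Justice, A" (assumed Hein convention).
_TRAILING_ARTICLE = re.compile(r"^(?P<body>.+), (?P<article>A|An|The)$")

# Hypothetical list of name suffixes; extend it to match the actual data.
SUFFIXES = {"Jr.", "Sr.", "II", "III", "IV"}


def fix_trailing_article(title: str) -> str:
    """Move a trailing ', A' / ', An' / ', The' to the front of the title."""
    title = title.strip()
    match = _TRAILING_ARTICLE.match(title)
    if match is None:
        return title
    return f"{match.group('article')} {match.group('body')}"


def split_author(raw: str) -> Dict[str, str]:
    """Split one 'Last, First Middle Suffix' string into name parts."""
    # partition() splits on the first comma only; strip() replaces Excel's TRIM().
    last, _, rest = raw.strip().partition(",")
    parts = rest.split()
    suffix = ""
    if parts and parts[-1] in SUFFIXES:
        suffix = parts.pop()
    first = parts[0] if parts else ""
    middle = " ".join(parts[1:])
    return {"fname": first, "mname": middle, "lname": last.strip(), "suffix": suffix}


def split_authors(cell: str) -> List[Dict[str, str]]:
    """Split a semicolon-delimited author cell into one dict per author."""
    return [split_author(author) for author in cell.split(";") if author.strip()]


if __name__ == "__main__":
    print(fix_trailing_article("Theory of Justice, A"))
    for author in split_authors("Smith, John Q. Jr.; Doe, Jane"):
        print(author)
```

Each dict uses the same fname | mname | lname | suffix order as the spreadsheet columns in step 2, so the results could be written back out with the csv module. Names that do not follow the assumed pattern (for example, corporate authors or names without a comma) would still need to be checked by hand.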