law-journal-backfiles
Preparing Law Journal Backfiles for Upload to a Digital Commons Repository
Introduction
There is a lot of work that goes into preparing the .csv file.
Workflow
- Run title_clean.py on a text document containing the title fields from Hein's .csv file, to change the way Hein has written the file names (ending in ", A", ", An", and ", The")
- In the authors column of Hein's .csv, use text to columns by semicolon (for separating multiple authors). Insert at least three columns to the right of "Author" first, or the split data will overwrite the neighboring columns
- text to columns for each author column by comma (insert one column to the right)
- use the Excel TRIM() function to remove the leading space
- there must be an easier way to do this in Python
- text to columns for mname and suffix, delimited by space (with two columns to the right)
- arrange so as to match fname | mname | lname | suffix pattern
- setting column width to 15 makes it a little easier to handle
- Run fpage.py
- Download a copy of the .xls sheet used for uploading from the bepress manage submissions site and save it as journ_clean.xls
- Make a copy of this file and name it journ_prep.xls (you'll eventually copy and paste from journ_prep.csv into this file and use it as your base of operations for individual issue batch uploads)
- Add entries to the fulltext_url field using an Excel concatenation function - add directory as prefix, use XX
- Sort
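
The author-splitting steps above (text to columns by semicolon, then by comma, trimming, then splitting out mname and suffix) could probably be done in Python instead of Excel. What follows is only a rough sketch, not part of the existing scripts. It assumes Hein writes each author as "Last, First Middle Suffix" and separates multiple authors with semicolons, which is inferred from the steps above rather than confirmed. The function names and the SUFFIXES list are made up for illustration.

```python
# Hypothetical helper, not one of the repository's scripts.
# Assumes each author looks like "Last, First Middle Suffix" and that
# multiple authors are separated by semicolons.

# Assumed suffix tokens; extend this set to fit the actual data.
SUFFIXES = {"Jr.", "Jr", "Sr.", "Sr", "II", "III", "IV"}


def split_author(name):
    """Return (fname, mname, lname, suffix) for one author string."""
    # Everything before the first comma is treated as the last name.
    lname, _, rest = name.partition(",")
    # Split the remainder on whitespace and drop stray commas,
    # e.g. the one in "Smith, John, Jr."
    tokens = [t.strip(",") for t in rest.split() if t.strip(",")]
    suffix = ""
    if tokens and tokens[-1] in SUFFIXES:
        suffix = tokens.pop()
    fname = tokens[0] if tokens else ""
    mname = " ".join(tokens[1:])
    return fname, mname, lname.strip(), suffix


def split_authors(field):
    """Split a whole authors cell into a list of name tuples."""
    return [split_author(part) for part in field.split(";") if part.strip()]


if __name__ == "__main__":
    for row in split_authors("Smith, John Q. Jr.; Doe, Jane"):
        print(" | ".join(row))
```

Each tuple follows the fname | mname | lname | suffix pattern, so the stripping replaces the TRIM() step and the tuples can be written straight into the author columns. Names that do not fit the assumed form (for example, an organization with no comma) end up entirely in lname and would still need a manual check.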