This is a Python script for scraping data on U.S. mines, inspections, accidents, violations, penalties and contractors from the Department of Labor's enforcement site. The site provides a limited search interface, allowing users to search up to five states at a time, with limited sort, drill-down and export capabilities.
This script downloads all the data the site provides as raw CSV files.
Requires Python >= 2.5. (If you are using an earlier version of Python and need to run this script, please contact me and I will try to assist you.)
The script can be run from the command line, as is, and will download data for every mine in every state:

    $ ./mines.py

This will, by default, create six CSV files in the current directory:
- mines.csv - A list of all mines, along with the following information about each one:
  - Mine ID
  - Mine Name
  - State
  - County
  - Mine Type
  - Coal or Metal Mine
  - Mine Status
  - Mine Status Date
  - Inspections
  - Accidents
  - Violations
  - Contractors
  - Controller ID
  - Controller Name
  - Operator ID
  - Operator Name
  - Begin Date
  - SIC
  - SIC Description
- inspections.csv - A list of all inspections, along with the following information about each one:
  - Event No.
  - Mine ID (relates to Mine ID in mines.csv)
  - Violations
  - Inspection Activity Code
  - Inspection Activity Code Description
  - Inspection Begin Date
  - Inspection End Date
- accidents.csv - A list of all mine accidents, along with the following information about each one:
  - Mine ID
  - Subunit
  - Subunit Description
  - Accident Date
  - Degree of Injury
  - Accident Classification Description
  - Occupation Code Description
  - Miner Activity
  - Total Experience
  - Mine Experience
  - Job Experience
  - Accident Narrative
- violations.csv - A list of all mine violations, along with the following information about each one:
  - Violation No.
  - Violation Assessed?
  - Event No. (relates to Event No. in inspections.csv)
  - Date Issued
  - S&S Designation
  - 30 CFR
  - Section of Act
  - Type of Issuance
  - Date Terminated
- assessments.csv - A list of all penalties proposed and assessed against mines, along with the following information about each one:
  - Violation No. (relates to Violation No. in violations.csv)
  - Violator ID
  - Violator Name
  - Proposed Penalty Amount
  - Current Assessment Amount
  - Paid Proposed Penalty Amount
  - Last Action Code
- contractors.csv - A list of all mine contractors, along with the following information about each one:
  - Mine ID (relates to Mine ID in mines.csv)
  - Contractor ID
  - Contractor Name
  - Contractor Start Date
  - Contractor End Date
See the Mine Safety and Health Administration's data dictionary for more detailed data definitions.
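
The "relates to" notes above mean the files can be joined after the download. What follows is a minimal sketch, not part of the script, that counts violations per mine by linking violations.csv to inspections.csv through Event No. It is written for Python 3 and assumes the CSV header names match the field names listed above ("Mine ID", "Event No."); the actual headers may be spelled differently. The function name `violations_per_mine` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumed header names, taken from the field lists above; the real CSV
# headers may be spelled differently.
MINE_ID = "Mine ID"
EVENT_NO = "Event No."


def violations_per_mine(inspections_file, violations_file):
    """Count violation rows per mine by joining on Event No."""
    # Map each inspection's Event No. to the mine it belongs to.
    event_to_mine = {}
    for row in csv.DictReader(inspections_file):
        event_to_mine[row[EVENT_NO]] = row[MINE_ID]

    counts = defaultdict(int)
    for row in csv.DictReader(violations_file):
        mine_id = event_to_mine.get(row[EVENT_NO])
        if mine_id is not None:  # skip violations with no matching inspection
            counts[mine_id] += 1
    return dict(counts)


# Tiny in-memory stand-ins for inspections.csv and violations.csv
# (invented rows, not real data).
sample_inspections = io.StringIO(
    "Event No.,Mine ID\n"
    "100,4601\n"
    "101,4601\n"
    "102,4602\n"
)
sample_violations = io.StringIO(
    "Violation No.,Event No.\n"
    "9001,100\n"
    "9002,101\n"
    "9003,102\n"
    "9004,999\n"
)
result = violations_per_mine(sample_inspections, sample_violations)
```

With real output, the two arguments would be the opened inspections.csv and violations.csv files instead of the in-memory samples.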
Please note: This script can take a long time to run. If you're interested in downloading all the data available, I would suggest running separate instances of the script on several different computers, each responsible for a state or set of states.
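
One possible way to organize such a split is sketched below in Python 3. The helpers `split_states` and `merge_csv` are hypothetical and not part of the script: the first deals state codes into one group per computer, and the second stitches the per-computer CSV files back together, assuming every file starts with the same header row.

```python
import csv
import io


def split_states(states, workers):
    """Deal state codes round-robin into `workers` groups."""
    return [states[i::workers] for i in range(workers)]


def merge_csv(parts, out):
    """Concatenate CSV streams, keeping only the first header row."""
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for part in parts:
        reader = csv.reader(part)
        header = next(reader, None)
        if header is None:
            continue  # empty file, nothing to copy
        if not header_written:
            writer.writerow(header)
            header_written = True
        writer.writerows(reader)  # remaining rows are data


# Five states shared between two computers.
groups = split_states(["WV", "KY", "PA", "VA", "OH"], 2)

# In-memory stand-ins for two mines.csv files from separate runs
# (invented rows, not real data).
merged = io.StringIO()
merge_csv(
    [io.StringIO("Mine ID,State\n1,WV\n"), io.StringIO("Mine ID,State\n2,KY\n")],
    merged,
)
```

Each computer would then take one group and pass its states to `scraper.scrape(...)` one at a time. Whether a single scraper instance can be reused for several states in a row is an assumption here; check the script before relying on it.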
Other usage scenarios:
- Download data for a single state:

      from minescraper.mines import MineScraper
      scraper = MineScraper('mines.csv', 'inspections.csv', 'accidents.csv', 'contractors.csv', 'violations.csv', 'assessments.csv')
      scraper.write_headers()
      scraper.scrape('WV') # Just download data for West Virginia

- Get only a list of mines (and the rest of the data in mines.csv) for a state:

      from minescraper.mines import MineScraper
      scraper = MineScraper('mines.csv', 'inspections.csv', 'accidents.csv', 'contractors.csv', 'violations.csv', 'assessments.csv', False, False, False, False, False)
      scraper.write_headers()
      scraper.scrape('WV')

- Get all data for active mines in a single state:

      from minescraper.mines import MineScraper
      scraper = MineScraper('mines.csv', 'inspections.csv', 'accidents.csv', 'contractors.csv', 'violations.csv', 'assessments.csv', False, False, False, False, False, True)
      scraper.write_headers()
      scraper.scrape('WV')

Two-clause BSD license. See LICENSE.
The Mine Safety and Health Administration has released other mine data here