Methods, materials and notes related to extracting dam removal attributes from published text documents for the Dam Removal Information Portal (DRIP).
app
Command line application for text mining over documents.
document-classification
Contains a web application that serves a document classification model. The model assigns each document to one of three categories: abiotic, biotic, or abiotic&biotic.
citations
Notes on extracting structured citations from PDFs using GROBID.
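As a rough illustration of what "structured citations" can look like downstream, the sketch below parses TEI XML of the kind GROBID emits and pulls out a few fields per reference. It is not taken from this repo: the function name `parse_citations`, the chosen fields, and the hand-written `SAMPLE_TEI` snippet are all assumptions made for the example, and real GROBID output contains many more elements.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written, simplified sample for illustration only (not real GROBID output).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><back><div><listBibl>
    <biblStruct>
      <analytic><title level="a" type="main">Example article title</title></analytic>
      <monogr>
        <title level="j">Example Journal</title>
        <imprint><date type="published" when="2010"/></imprint>
      </monogr>
    </biblStruct>
  </listBibl></div></back></text>
</TEI>"""


def parse_citations(tei_xml):
    """Return one dict per <biblStruct>; fields missing in the XML become None."""
    root = ET.fromstring(tei_xml)
    citations = []
    for bibl in root.iterfind(".//tei:biblStruct", TEI_NS):
        title = bibl.find("tei:analytic/tei:title", TEI_NS)
        journal = bibl.find("tei:monogr/tei:title", TEI_NS)
        date = bibl.find("tei:monogr/tei:imprint/tei:date", TEI_NS)
        citations.append({
            "title": title.text if title is not None else None,
            "journal": journal.text if journal is not None else None,
            "year": date.get("when") if date is not None else None,
        })
    return citations
```

Calling `parse_citations(SAMPLE_TEI)` yields a single record with the title, journal, and year from the sample. Returning `None` for absent fields keeps the records uniform, which matters because extracted references are often incomplete.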
preprocessing
Notes and documentation on using pdftotext to batch convert PDFs to text for development.
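A minimal sketch of how such a batch conversion could be scripted is shown below. It assumes `pdftotext` is installed and on the PATH; the helper names `build_command` and `convert_all`, the directory arguments, and the choice of the `-layout` option are illustrative assumptions, not the procedure documented in this directory.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Return the pdftotext command that writes <out_dir>/<pdf stem>.txt."""
    txt_path = Path(out_dir) / (Path(pdf_path).stem + ".txt")
    # -layout asks pdftotext to preserve the physical page layout;
    # drop it if plain reading-order text is preferred.
    return ["pdftotext", "-layout", str(pdf_path), str(txt_path)]


def convert_all(pdf_dir, out_dir):
    """Convert every *.pdf in pdf_dir and return the paths of the text files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        cmd = build_command(pdf_path, out_dir)
        # check=True raises CalledProcessError if pdftotext fails on a file.
        subprocess.run(cmd, check=True)
        written.append(Path(cmd[-1]))
    return written
```

For example, `convert_all("pdfs", "text")` would write one `.txt` file into `text/` for each PDF found in `pdfs/` (both directory names are placeholders). Keeping command construction in its own function makes it easy to inspect or log the exact command before running it.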
Under the USGS Software Release Policy, the software code here is considered preliminary, has not been officially released, and is posted to this repo for informal sharing among colleagues.
This software is preliminary or provisional and is subject to revision. It is being provided to meet the need for timely best science. The software has not received final approval by the U.S. Geological Survey (USGS). No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty. The software is provided on the condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from the authorized or unauthorized use of the software.