License: MIT

PaperParser

PaperParser is a Python package for extracting synthesis and performance metrics from academic articles on perovskite solar cells. The long-term goal of this package is to provide a means to (1) scrape, (2) summarize, and (3) compare the relationships between synthesis procedure and device performance across the perovskite literature.

Overview

How It Works

[Figure: flowchart of the PaperParser workflow]

Output

The result is a relational graph, implemented in Python as nested dictionaries and lists, like the example below.

[Figure: example relational graph]
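As a rough illustration of what "nested dictionaries and lists" can look like for such a graph, the sketch below builds a small record by hand and walks it. Every key name and value here (`synthesis`, `steps`, `performance`, `pce_percent`, and so on) is a made-up placeholder, not PaperParser's actual output schema.

```python
# Hypothetical layout: all key names and values are illustrative placeholders,
# not the schema PaperParser actually emits.
paper_graph = {
    "paper": "example-paper-id",
    "synthesis": {
        "steps": [
            {"action": "spin-coat", "speed_rpm": 4000, "time_s": 30},
            {"action": "anneal", "temperature_c": 100, "time_min": 10},
        ],
    },
    "performance": {
        "pce_percent": 18.2,
        "voc_v": 1.05,
    },
}


def iter_leaves(node, path=()):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_leaves(child, path + (index,))
    else:
        yield path, node


# Flatten the graph so each value is addressable by its full path.
leaves = dict(iter_leaves(paper_graph))
for path, value in leaves.items():
    print(" -> ".join(str(part) for part in path), "=", value)
```

Flattening to path/value pairs is one convenient way to compare records from different papers, since the same path can be looked up in each.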

Installation

The simplest way to run the example notebooks is to clone the git repo to your local machine. To install paperparser and its dependencies, we recommend the following procedure:

  1. Clone the git repository to your local machine.

  2. Create a new conda environment by running the following command in your terminal.

    conda create -n your_new_env python=3.6

    (Note: PaperParser was developed with Python 3.6 but also works with 3.5.)

  3. Activate your new, clean conda environment.

    conda activate your_new_env

  4. (Optional) For users of Git for Windows/Git Bash: run the following command.

    conda install -c conda-forge dawg

    Note that Linux, Mac, and WSL (Windows Subsystem for Linux) users can skip this step.

  5. Navigate to the top-level directory containing setup.py and install the package with pip by running

    pip install .

    This will automatically install the dependencies required to run the package and the provided example notebooks. Make sure you are in the correct environment before running pip install!

  6. Download ChemDataExtractor's data files. This step is required: PaperParser will not run without it.

    cde data download

Now you're ready to use PaperParser! If you're lost, be sure to check out the example notebook in ./examples/example_notebook.ipynb. Happy parsing!
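If you want to confirm that the steps above took effect, a quick check along the following lines may help. It is only a sketch: the helper `check_environment` is hypothetical, and the import names `paperparser`, `chemdataextractor`, and `spacy` are assumed from the package names mentioned in this README. It does not verify that the ChemDataExtractor data files were downloaded.

```python
import importlib.util
import sys


def check_environment(modules, min_version=(3, 5)):
    """Report whether the interpreter and each named module look usable."""
    report = {"python_ok": sys.version_info[:2] >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


# Assumed import names; adjust if your installation exposes different ones.
status = check_environment(["paperparser", "chemdataextractor", "spacy"])
for item, ok in status.items():
    print(item, "OK" if ok else "MISSING")
```

A `MISSING` entry usually means pip install was run in a different conda environment than the one currently active.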

Dependencies

PaperParser uses the following open-source packages in its implementation:

User Guide

An example of each tool that makes up paperparser is contained in the Jupyter notebook examples/example_notebook.ipynb. This notebook should not require installation of paperparser if it is run within the original directory structure.
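One common way for a notebook to import a package without installing it is to put the repository root on `sys.path`. The sketch below shows that pattern, assuming the notebook is run from the examples/ folder directly under the repository root. Whether the example notebook does exactly this is an assumption, and `add_repo_root` is a hypothetical helper.

```python
import sys
from pathlib import Path


def add_repo_root(notebook_dir):
    """Put the parent of notebook_dir on sys.path and return it."""
    repo_root = str(Path(notebook_dir).resolve().parent)
    if repo_root not in sys.path:
        # Insert first so the local checkout wins over any installed copy.
        sys.path.insert(0, repo_root)
    return repo_root


# Assumes the current working directory is the examples/ folder.
repo_root = add_repo_root(Path.cwd())
print("Importing from:", repo_root)
```

If the notebook is moved elsewhere, this lookup no longer finds the package, which is why the original directory structure matters.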