/Citrullination

Repository for code and datafiles associated with Nature Communications Manuscript

Primary LanguageJupyter Notebook

Citrullination ETL

Sankey

Repository for code and datafiles associated with Nature Communications Manuscript (citation to come)

Arcplots

The python notebooks provided for the arcplots directory only serve to generate .gml files. These files are then employed by the .R script to generate the final output. In order to get started and try this on your own, read below.

In order to recreate the arcplots, users must ensure they have python >= 3.8 and R >= 4.0.3 installed and operational on their local machine. Jupyter Notebooks are used often, so please install according to the following instructions: https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/ To install python requirements: locate 'Requirements.txt', press 'Shift+right_click', open command line, type: pip install -r (pasted path). Your final line should look like: pip install -r "C:/some/path/to/Requirements.txt"

  1. Open the python notebooks (.ipynb) titled "Build_xxxx_gml"
  2. Run all cells in the notebook. This will create a xxxx.gml.txt file which contains the information needed for the R script.
  3. Open the corresponding "xxxxArcplot.R" script in R studio.
  4. ****You must now direct the script to your newly create .gml.txt file. Locate the new .gml.txt file, press "Shift+righ_click" and select 'copy as path.' Paste this into the R script in line 8. Line 8 should now read: mis_file = "C:/some/path/to/file.gml.txt" Make sure the path has all forward slashes, otherwise there may be an issue running the sctipt.
  5. Run the script. If all packages are installed correctly, you will be provided with the output "xxxxArcplot" which can be exported as .svg.

Sankey Diagram

This script creates a sankey diagram from the Holoviews library. If there is interest in having this converted to a notebook, please contact me and I can push it to the repo.

To recreate this on your own, read below:

  1. Locate the 'Sankey.py' file.
  2. "Shift+righ_click" and select 'copy as path.'
  3. In your command line, type: python + (my pasted path)
  4. Your terminal should now read: python "C:/some/path/to/sankey.py"
  5. Press 'Enter' to run
  6. This will save a 'sankey.png' and 'sankey.svg' in the same directory as the 'sankey.py' file
  7. ***Note: Since the sankey diagrams lists overlapping values, we have used a vector editing software to replace the text and numbers with the numbers returned in the terminal

Stacked Bars

This is the most straight forward component of the repository. These notebooks have been built with Binder, so you should be able to run them in the browser. Otherwise you can clone/download the repo and follow these instructions.

  1. Open the python notebooks (.ipynb) titled "xxxx_stacked_bar_chart.ipynb"
  2. Run all cells in the notebook.
  3. This will save a '.svg' copy of the files to the same directory as the '.ipynb' files.