/author-attrib

Authorship attribution using Constituency Parse Tree Paths


Overview

  1. Install dependencies: INSTALLING.md
  2. Process dataset: data/DataFuncs.py
  3. Extract Features:
  4. Run example experiments: experimentation/run_all.py
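
The repository's own feature extraction lives in the scripts under feature_extraction. As a rough illustration of what a "constituency parse tree path" feature could look like, the sketch below assumes the features are counts of root-to-leaf label paths in a bracketed (Penn Treebank style) parse. That reading, and every function name here, is an assumption made for illustration and is not taken from the project code.

```python
from collections import Counter


def parse_bracketed(text):
    """Parse a bracketed parse string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read_node()


def root_to_leaf_paths(node, prefix=()):
    """Return one 'S/NP/NN'-style label path per word in the tree."""
    label, children = node
    path = prefix + (label,)
    paths = []
    for child in children:
        if isinstance(child, str):
            paths.append("/".join(path))
        else:
            paths.extend(root_to_leaf_paths(child, path))
    return paths


def path_counts(bracketed):
    """Count label paths; such counts could serve as a bag-of-paths feature vector."""
    return Counter(root_to_leaf_paths(parse_bracketed(bracketed)))


if __name__ == "__main__":
    example = "(S (NP (DT The) (NN cat)) (VP (VBD sat)))"
    for path, count in sorted(path_counts(example).items()):
        print(path, count)
```

Path counts of this kind ignore the words themselves and keep only syntactic structure, which is one reason such features are of interest for authorship attribution.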

Directory Tree

https://github.com/rvente/NLP-Final-Project/blob/release/Code/orpheus
├── analysis Use these notebooks for analysis; they read from and write to /results.
│   ├── analysis.ipynb
│   ├── chart_nb_x_alpha.ipynb
│   ├── chart_prev_seen.ipynb
│   ├── chart_svc_prev_seen.ipynb
│   ├── generate_charts.ipynb
│   ├── Presentation.ipynb
│   ├── Presentation-NB.ipynb
│   └── Presentation-SVC.ipynb
├── data Store data and feature extraction output here.
│   ├── 1000A30D__doc+pos.pkl
│   ├── 1000A30D_with_doc.pkl
│   ├── 100A50D.csv
│   ├── 100A50D__doc+pos.pkl
│   ├── 100A50D_POS.pkl
│   ├── DataFuncs.py
│   ├── Run_All.py
│   ├── ...
│   ├── small_with_doc.pkl
│   └── small.xlsx
├── experimentation Configure and run the machine learning models
│   ├── l0_100a_50d.py
│   ├── __pycache__
│   ├── run_all.py Outlines the most general combinations of hyper-parameters.
│   ├── run_prev_seen.py
│   ├── sandbox.py
│   └── svc.py
├── feature_extraction
│   ├── add_parse_tree.py
│   ├── add_path_features.py
│   ├── instance_parser.py
│   └── __pycache__
├── figures Figures generated by the analysis scripts.
│   ├── nb_x_alpha.pdf
│   ├── nb_x_alpha.svg
│   ├── nb_x_prev_seen.pdf
│   └── svm_x_prev_seen.pdf
├── INSTALLING.md How to install and configure
├── logs gitignored: The filesystem database of experiments
│   ├── 1
│   ├── 10
│   ├── 100
│   ├── 101
│   ├── ...
│   └── _sources
├── prev_seen_logs not gitignored: view sample logs here on another branch
│   ├── 1
│   ├── 10
│   ├── 11
│   ├── ...
│   └── _sources
├── requirements_2.txt
├── requirements.txt
├── results
│   ├── nb_df_acc.pkl
│   ├── nb_df_f1.pkl
│   ├── nb_x_alpha_df_acc.pkl
│   ├── svc_df_acc.pkl
│   ├── svc_df_f1.pkl
│   └── svm_x_prev_seen.pkl
├── software_citations.bib
└── virtualenv We recommend a virtual environment for installing packages.